In New York City in the United States, a standard police report is required for any vehicle collision in which someone is injured or killed, or which causes at least $1,000 worth of damage. This data is collected, published as an open data source on NYC Open Data, and updated daily.
The two datasets used are Crashes and Vehicles.
The scenario is to provide Power BI reports, refreshed daily, that give descriptive analytics and insight into the locations, contributing factors, and vehicle types involved in vehicle collisions, built using Fabric and Copilot. The solution also provides a foundation for future work by other personas such as Data Analysts and Data Scientists.
This solution does the following in a Fabric workspace:
Uses notebooks and PySpark to load the initial CSV files as Delta tables into a Lakehouse.
Uses Data Factory to pull collision and involved-vehicle data daily via an API call to New York City's Open Data website and land it in a Lakehouse as tables. The data can be explored via the SQL analytics endpoint or with PySpark/Spark SQL in a notebook.
Applies a standard Medallion Architecture: the Lakehouse serves as Bronze, and two Warehouses serve as Silver and Gold. Data Factory moves the daily data through to Gold. Copilot was used to help create some pipelines and to answer questions.
Builds a semantic model in Direct Lake mode on SQL views from Gold.
Provides two separate Power BI reports, initially created using Copilot and then reviewed and amended for more effective reporting.
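The initial CSV load described above might look roughly like the following in a notebook. This is a minimal sketch, not the project's actual code: the helper names, the file path, and the table name are hypothetical, and `spark` is assumed to be the session a Fabric notebook provides. Column names are normalized first because Delta tables, without column mapping enabled, do not accept names containing spaces, and open data CSV headers often contain them.

```python
import re


def to_snake_case(name: str) -> str:
    """Turn a CSV header such as 'CRASH DATE' into 'crash_date'."""
    # Replace each run of non-alphanumeric characters with one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def load_csv_as_delta(spark, csv_path: str, table_name: str) -> None:
    """Read one CSV file and save it as a Delta table (hypothetical helper).

    `spark` is assumed to be the SparkSession supplied by the notebook.
    """
    df = (
        spark.read
        .option("header", True)       # first row holds the column names
        .option("inferSchema", True)  # let Spark guess the column types
        .csv(csv_path)
    )
    # Rename every column to a Delta-friendly snake_case name.
    df = df.toDF(*[to_snake_case(c) for c in df.columns])
    # Overwrite so the initial load can be re-run safely.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)
```

In a notebook this could be called as `load_csv_as_delta(spark, "Files/crashes.csv", "crashes")`, where both arguments are placeholders for the real file location and table name.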
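The daily API pull is done with Data Factory in this solution, but the request it issues can be illustrated in Python. The sketch below builds a query URL that selects one day of records, assuming the open data site exposes a Socrata-style (SODA) endpoint that accepts `$where` and `$limit`. The host name, the `crash_date` column, the row limit, and the dataset ID placeholder are all assumptions for illustration and do not come from the project itself.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Assumed host for the open data API; replace with the real endpoint.
BASE_URL = "https://data.cityofnewyork.us/resource"


def build_daily_url(dataset_id: str, day: date,
                    date_column: str = "crash_date",
                    limit: int = 50000) -> str:
    """Return a query URL for rows whose date_column falls on `day`.

    dataset_id, date_column, and limit are hypothetical parameters.
    """
    next_day = day + timedelta(days=1)
    # Half-open interval [day, next_day) so no row is fetched twice.
    where = (
        f"{date_column} >= '{day.isoformat()}T00:00:00' "
        f"AND {date_column} < '{next_day.isoformat()}T00:00:00'"
    )
    query = urlencode({"$where": where, "$limit": limit})
    return f"{BASE_URL}/{dataset_id}.json?{query}"
```

For example, `build_daily_url("xxxx-xxxx", date(2024, 3, 1))` yields a URL for a placeholder dataset ID covering 1 March 2024. Filtering on a half-open date range keeps each daily run from overlapping the previous one, which avoids duplicate rows landing in Bronze.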
Project name
Daily NYC Vehicle Collision Reporting
Project Repository URL
https://github.com/cameron-thorne/mshackathon_nyc_collision/tree/main
Project video
https://github.com/cameron-thorne/mshackathon_nyc_collision/blob/main/Walkthrough/3_3_2024%2C%209_22_16%20PM%20-%20Screen%20-%20Walkthrough.webm
Team members
Callisto3 (David Niemeier)