The PowerBI dashboard fails to load after the data ingestion job runs a second time, due to a duplicate identifier in the dim_parking_bay table.
This is caused by the Parking Sensor E2E code assuming that Bay_Id is a unique identifier for a parking bay sensor. However, the bay data contains entries that share the same Bay_Id (perhaps due to the changes noted on the source website).
It may be that st_marker_id and bay_id together form a unique compound key. This PR fixes the example so that it is functional (although potentially less accurate) by removing duplicates from the bay data.
During the ingestion job, the transform stage does a left join on the bay + sensor data using the bay_id. This join duplicates rows whenever more than one bay record shares the same bay_id.
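The fan-out described above can be illustrated with a minimal pandas sketch (hypothetical data; the real pipeline uses Spark, but the left-join semantics are the same):

```python
import pandas as pd

# Hypothetical bay data: two rows share bay_id 1, mimicking the
# duplicate Bay_Id entries observed in the source feed.
bay = pd.DataFrame({
    "bay_id": [1, 1, 2],
    "st_marker_id": ["10A", "10B", "11A"],
})

# Hypothetical sensor data keyed by bay_id.
sensor = pd.DataFrame({
    "bay_id": [1, 2],
    "status": ["Present", "Unoccupied"],
})

# Left join on bay_id: the sensor row for bay_id 1 fans out into two rows,
# so the output grows with each duplicate key.
joined = sensor.merge(bay, on="bay_id", how="left")
assert len(joined) == 3  # one extra row caused by the duplicate bay_id

# The fix: drop duplicate bay_ids before joining, keeping one row per bay.
deduped = bay.drop_duplicates(subset=["bay_id"])
fixed = sensor.merge(deduped, on="bay_id", how="left")
assert len(fixed) == len(sensor)  # row count stays stable across runs
```

With the deduplicated bay data, repeated ingestion runs no longer inflate the dim_parking_bay row count.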
Author pre-publish checklist
[x] Added test to prove my fix is effective or new feature works
Validation steps
Run the ingest job multiple times.
Refresh the PowerBI data. Without the fix, the refresh fails with the "duplicate value" error above; with the fix, the refresh works.
Alternative:
Without the fix, observe that the metrics show the number of parking bays growing with each successive ingestion run:
However, it remains constant (as expected) across runs after the fix is applied.
Kusto Query:
customEvents
| where name startswith "Transform (ai): Completed"
| project timestamp, appName, name,
    pipelineName = customDimensions.pipeline,
    pipelineRunId = customDimensions.run_id,
    dimParkingbayCount = customDimensions.new_parkingbay_count
| order by timestamp desc