cityofaustin / atd-data-tech

Austin Transportation Data & Technology Services

School Zone Beacons dataset from ODP captures unique beacons in multiple rows #18332

Closed atdservicebot closed 3 months ago

atdservicebot commented 3 months ago

What application are you using?

Other / Not sure

Describe the problem.

I've been using the School Zone Beacons dataset from Open Data and I noticed that there are 2,051 rows of data. The issue is that there should only be 729 rows of data (each row representing a single beacon). I think new rows are being created every time we edit a beacon's details. That is only a guess though.

Is there anything else we should know?

Hoping to get this resolved within the next two weeks, as we are aiming to wrap up our school zone beacon upgrades before the new school year starts.

Website Address

https://data.austintexas.gov/Transportation-and-Mobility/School-Zone-Beacons/mzsm-hucz/about_data

Internet Browser: Internet Explorer

Requested By Samir S.

Request ID: DTS24-116315

ChristinaTremel commented 3 months ago

@Charlie-Henry and @chiaberry I checked the API view of this dataset and there should only be 729 rows of data like Samir said! Could we check on how data is being inserted into this Socrata dataset?

chiaberry commented 3 months ago

I'll check if the dataset has a unique identifier set up in Socrata.

chiaberry commented 3 months ago

@ChristinaTremel the Socrata dataset is missing a row identifier. We can fix the dataset later today. I am going to set the School Zone Beacon ID as the row identifier.
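
For anyone following along, here is a minimal sketch of why a missing row identifier could produce the extra rows, assuming the ETL publishes edits as upserts. It is a pure-Python simulation, not the Socrata API: the `upsert` function, the `row_identifier` parameter, and the field names are all hypothetical.

```python
def upsert(dataset, records, row_identifier=None):
    """Simulate publishing records to a dataset (a list of dicts).

    Without a row identifier every record is appended. With one, a record
    whose identifier already exists replaces the stored row instead.
    """
    if row_identifier is None:
        # Nothing to match on, so every published record becomes a new row.
        dataset.extend(dict(r) for r in records)
        return dataset

    # Map each identifier value to its position in the dataset.
    index = {row[row_identifier]: i for i, row in enumerate(dataset)}
    for record in records:
        key = record[row_identifier]
        if key in index:
            dataset[index[key]] = dict(record)  # update in place
        else:
            index[key] = len(dataset)
            dataset.append(dict(record))  # genuinely new row
    return dataset


# Hypothetical sample data: two beacons, then one edit to the first beacon.
beacons = [{"record_id": "a1", "status": "OK"}, {"record_id": "b2", "status": "OK"}]
edit = [{"record_id": "a1", "status": "UPGRADED"}]

no_identifier = upsert([dict(b) for b in beacons], edit)
with_identifier = upsert([dict(b) for b in beacons], edit, row_identifier="record_id")

print(len(no_identifier))    # 3: the edit was appended as a duplicate
print(len(with_identifier))  # 2: the edit replaced the existing row
```

Under that assumption, each edit to a beacon adds a row, which would match the growth from 729 to 2,051 rows described above.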

ChristinaTremel commented 3 months ago

Oh perfect! I think we changed the ID back in March, so that's probably why it is missing the row identifier! Using the School Zone Beacon ID is perfect and I'll go ahead and assign this to you! Thank you. ✨

chiaberry commented 3 months ago

Okay, turns out that the School Zone Beacon ID didn't work as the row identifier because there are multiple records for the same School Zone Beacon ID, one going EB and the other WB. I added the Knack record ID as a column to the private dataset and set that as the row identifier.
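
A rough sketch of the uniqueness check that rules out the beacon ID as a row identifier. The rows below are invented sample data, and the column names (`knack_record_id`, `beacon_id`, `direction`) and the `duplicate_keys` helper are hypothetical, not the real dataset schema.

```python
from collections import Counter


def duplicate_keys(rows, column):
    """Return the values of `column` that appear in more than one row."""
    counts = Counter(row[column] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Invented sample: one beacon ID shared by an EB and a WB record.
rows = [
    {"knack_record_id": "r1", "beacon_id": "100", "direction": "EB"},
    {"knack_record_id": "r2", "beacon_id": "100", "direction": "WB"},
    {"knack_record_id": "r3", "beacon_id": "101", "direction": "EB"},
]

print(duplicate_keys(rows, "beacon_id"))        # {'100': 2}
print(duplicate_keys(rows, "knack_record_id"))  # {}
```

A column is only usable as a row identifier if this check comes back empty, which is why the Knack record ID works here and the beacon ID does not. Running the same check after each ETL update is one way to confirm that records are no longer duplicating.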

chiaberry commented 3 months ago

@ChristinaTremel I'll keep my eye on this over the next few ETL updates to make sure the records aren't duplicating.

ChristinaTremel commented 3 months ago

This was resolved and rows are no longer duplicating.