chihacknight / chn-ghost-buses

"Ghost buses" analysis project through Chi Hack Night
https://github.com/chihacknight/breakout-groups/issues/217
MIT License

[Data] Investigate the Fullerton bus more #41

Open lauriemerrell opened 1 year ago

lauriemerrell commented 1 year ago

In an early EDA session, we observed that the realtime API data for the Fullerton (74) bus had some trips with missing/non-distinct trip_id values that were a series of asterisks (like ******). At the time the issue did not seem too widespread, but the Fullerton bus is in our bottom 10 routes in terms of performance. It is probably worth taking a second look to see whether this data issue is causing the 74 to seem worse than it actually is.
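One way the masked IDs could distort a route's numbers, assuming trips are counted by distinct trip_id, is that every masked trip collapses into a single ID. The sketch below illustrates that on made-up records; the (route, trip_id) tuple layout and the helper names is_masked and count_trips are hypothetical and are not taken from the project's code.

```python
def is_masked(trip_id):
    """True when trip_id is missing, blank, or made up only of asterisks."""
    return trip_id is None or trip_id.strip("*").strip() == ""


def count_trips(pings):
    """Summarize (route, trip_id) records per route.

    Returns two dicts keyed by route: the number of distinct trip_id
    values seen, and the number of records whose trip_id is masked.
    """
    distinct_ids = {}
    masked_pings = {}
    for route, trip_id in pings:
        distinct_ids.setdefault(route, set()).add(trip_id)
        masked_pings[route] = masked_pings.get(route, 0) + (1 if is_masked(trip_id) else 0)
    return {r: len(ids) for r, ids in distinct_ids.items()}, masked_pings


# Made-up sample: two masked records on the 74 share one "trip_id".
sample = [
    ("74", "1001"),
    ("74", "******"),
    ("74", "******"),
    ("74", "1002"),
    ("66", "2001"),
]
distinct, masked = count_trips(sample)
print(distinct)  # {'74': 3, '66': 1}
print(masked)    # {'74': 2, '66': 0}
```

In this toy data the two masked records could belong to two different real trips, yet they count as one distinct ID, which is the kind of undercount worth ruling out for the 74.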

Goals for this ticket:

KyleDolezal commented 1 year ago

From October through December 2022, two bus routes include trips with the missing trip_id value (a series of asterisks, like ******): the 66 and the 74.

The 66 bus had 857 missing trips and 327,759 non-missing trips. Missing trips constituted about 0.3% of all scheduled trips.

The 74 bus had 19,451 missing trips and 176,375 non-missing trips. Missing trips made up 9.9% of all scheduled trips.
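For anyone rechecking the arithmetic, the shares can be recomputed from the counts quoted in this comment. This is a minimal sketch that assumes the denominator is missing plus non-missing trips; the function name missing_share is hypothetical.

```python
def missing_share(missing, non_missing):
    """Percentage of trips whose trip_id is missing, out of all counted trips."""
    total = missing + non_missing
    return 100.0 * missing / total


# Counts quoted in the comment above.
first = round(missing_share(857, 327_759), 2)
second = round(missing_share(19_451, 176_375), 2)
print(first)   # 0.26
print(second)  # 9.93
```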

csklare101 commented 1 year ago

From Chi Hack Night, 7/25/23: comments indicated that the issue still exists on the CTA side. We considered looking at CTA data directly to get more accurate realtime data. The next step would be to compare what that data provides, to see whether the trip_id values there are real IDs rather than a series of asterisks.