transitmatters / mbta-performance

For processing performance data for the data dashboard
MIT License

Remove events with null stop ids #17

Closed hamima-halim closed 2 months ago

hamima-halim commented 2 months ago

Quick fix for https://github.com/transitmatters/mbta-performance/issues/16, which is one of at least two data quality issues we're running into. This PR drops all events with null stop IDs because they mess with our schedule merging algorithms. Two reasons this is OK:

  1. These phantom events aren't even read from S3 at the moment, because the events data is partitioned, in part, by stop ID.
  2. Many of these events are phantom departures that occur before the first stop of a route, so we actively don't care about tracking them.

That being said, some of these null-stop events can also be AVL blips, etc., which we might want to look into more at a later date.
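
For readers skimming the thread, the filter described above can be sketched roughly as follows. This is a minimal illustration with pandas, not the PR's actual diff: the function name `drop_null_stop_events`, the column names (`stop_id`, `trip_id`, `event_type`), and the sample rows are all assumptions. It also returns the dropped rows so that possible AVL blips can be inspected later.

```python
import pandas as pd


def drop_null_stop_events(events: pd.DataFrame, stop_col: str = "stop_id"):
    """Split events into rows that have a stop id and rows that do not.

    Hypothetical helper: the name and the default column name are
    assumptions, not taken from the mbta-performance codebase.
    """
    has_stop = events[stop_col].notna()
    kept = events[has_stop].reset_index(drop=True)
    # Keep the null-stop rows around so they can be examined later
    # instead of silently discarding them.
    dropped = events[~has_stop].reset_index(drop=True)
    return kept, dropped


# Made-up sample data: one phantom departure before the first stop,
# two normal events, and one stray event with no stop id.
events = pd.DataFrame(
    {
        "trip_id": ["t1", "t1", "t1", "t2"],
        "stop_id": [None, "70057", "70039", None],
        "event_type": ["DEP", "DEP", "ARR", "ARR"],
    }
)

kept, dropped = drop_null_stop_events(events)
print(f"kept {len(kept)} events, dropped {len(dropped)} with null stop id")
```

Returning the dropped rows rather than discarding them is a design choice for this sketch only; it keeps the door open for the later AVL investigation without affecting the schedule merge.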

Uploaded some data from my laptop to s3, beta dashboard is looking a bit healthier now! (:

https://dashboard-beta.labs.transitmatters.org/green/trips/single/?date=2024-05-08&to=70257&from=70510

https://dashboard-beta.labs.transitmatters.org/blue/trips/single/?date=2024-05-08&to=70039&from=70057

hamima-halim commented 2 months ago

> Looks good for the Blue and Green lines. The Red and Orange lines still look wrong (Red is missing a lot of data altogether), but this is an improvement.

That makes sense. Those would be the lines affected by the alternate stop_id issue (: