transitmatters / mbta-performance

For processing performance data for the data dashboard
MIT License

LAMP stop IDs vary slightly from those used by GTFS #15

Open hamima-halim opened 4 months ago

hamima-halim commented 4 months ago

The following stop_ids are reported by LAMP:

[   'Braintree-01',    'Braintree-02', 'Forest Hills-01', 'Forest Hills-02',
    'Oak Grove-02',    'Oak Grove-01', 'Union Square-02',  'Union Square-01']

GTFS, on the other hand, uses numeric IDs for these stops in the stops.txt schedule file (e.g., 70036 instead of Oak Grove-01). This breaks some of the GTFS-based merges we have to do on the stop_id column in this function: https://github.com/transitmatters/mbta-performance/blob/a03170700e41d55b7e43d771b54787249f7c46b5/mbta-performance/chalicelib/lamp/ingest.py#L159

We need a map that translates LAMP's versions of these IDs to their numeric counterparts. It's pretty important because these are the first/last stops of their respective lines, and a mismatch can really throw off the route_starts calculations in this line: https://github.com/transitmatters/mbta-performance/blob/a03170700e41d55b7e43d771b54787249f7c46b5/mbta-performance/chalicelib/lamp/ingest.py#L170
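A minimal sketch of what such a map could look like, assuming events are plain dicts with a string stop_id. The names `LAMP_TO_GTFS_STOP_ID`, `normalize_stop_id`, and `normalize_events` are hypothetical and are not taken from ingest.py. Only the Oak Grove-01 entry comes from this issue; the other LAMP IDs would need to be looked up in stops.txt.

```python
# Hypothetical mapping from LAMP's string stop_ids to GTFS numeric stop_ids.
# Only "Oak Grove-01" -> "70036" is stated in this issue; the remaining
# entries must be filled in from GTFS stops.txt before this is usable.
LAMP_TO_GTFS_STOP_ID = {
    "Oak Grove-01": "70036",
}


def normalize_stop_id(stop_id, mapping=LAMP_TO_GTFS_STOP_ID):
    """Return the mapped numeric stop_id, or the input unchanged if unmapped."""
    return mapping.get(stop_id, stop_id)


def normalize_events(events, mapping=LAMP_TO_GTFS_STOP_ID):
    """Return copies of the event dicts with stop_id normalized."""
    return [
        {**event, "stop_id": normalize_stop_id(event["stop_id"], mapping)}
        for event in events
    ]


# Illustrative sample data, not real LAMP output.
events = [
    {"trip_id": "t1", "stop_id": "Oak Grove-01"},  # string alias from LAMP
    {"trip_id": "t1", "stop_id": "70034"},         # already numeric, passes through
]
normalized = normalize_events(events)
```

Because unmapped IDs pass through unchanged, the same step is safe to run on rows where the stop_id is already numeric. If the events live in a pandas DataFrame, `df["stop_id"].replace(mapping)` should behave the same way, run before any merge on stop_id.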

hamima-halim commented 4 months ago

So, the stop_id that LAMP reports isn't consistent for these stops: sometimes the downstream processes resolve the vehicle location's stop_id to the numeric ID (e.g., 70001) and sometimes they use the alternate string one (e.g., Forest Hills-02).

hamima-halim commented 4 months ago

(Thanks to @JNuss71 for walking through this with me!)

https://github.com/transitmatters/mbta-performance/pull/18 adds a map that takes the alpha stop_ids to their numeric counterparts (notice that in Union Square, tracks 1 and 2 actually differ). This fixes issues with the Orange Line, but trips starting from Alewife and Union Square are still showing odd behavior. For better or worse, this is also consistent with the behavior coming out of the GTFS-realtime API, and it indicates that whatever is coming out of the AVLs (and their light downstream processing) isn't guaranteed to be consistent with what's expected from GTFS's stop/trip information.

hamima-halim commented 4 months ago

Alewife (Alewife-01, Alewife-02) trips are still behaving oddly, even setting aside the fact that all southbound trips on the Red Line during this planned outage aren't assigned a trip_id. I suspect there's some extra stop_id edge case we aren't picking up correctly that affects Alewife but not Braintree.
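One way to hunt for leftover edge cases like this is to count every observed stop_id that has no match in GTFS. The sketch below assumes both sides are available as lists of strings; `unknown_stop_ids` is a hypothetical helper and the sample values are made up for illustration.

```python
from collections import Counter


def unknown_stop_ids(observed_stop_ids, gtfs_stop_ids):
    """Count observed stop_ids that do not appear in the GTFS stop list."""
    known = set(gtfs_stop_ids)
    return Counter(s for s in observed_stop_ids if s not in known)


# Illustrative sample data, not real LAMP or GTFS output.
observed = ["70061", "Alewife-01", "Alewife-01", "Alewife-02"]
gtfs = ["70061"]

leftovers = unknown_stop_ids(observed, gtfs)
for stop_id, count in leftovers.most_common():
    print(f"{stop_id}: {count} events with no GTFS match")
```

Running this after the alias map has been applied would surface any stop_id variants the map still misses.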
