Allow route inference from static data in trip updates

cedarbaum commented 1 year ago

Route IDs are optional in GTFS Realtime trip updates. One example of a feed that omits them is the NYC Ferry (https://www.ferry.nyc/developer-tools/). In such cases, the route can be inferred from the trip's static data. This PR enables that with the following changes:

- Automatically detect when route IDs are missing from the realtime feed and fall back to static data instead.
  - This was previously a config option, but it was removed in the course of PR review.
- When this fallback applies, map each trip ID to its route when updating realtime trips.
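A rough sketch of the fallback described in the list above, not the code in this PR. The `TripUpdate` type, the `fillMissingRoutes` function, and the `staticRoutes` map (trip ID to route ID, assumed to be already loaded from the static feed) are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// TripUpdate is a hypothetical, simplified stand-in for a realtime trip update.
type TripUpdate struct {
	TripID  string
	RouteID string // empty when the realtime feed omits the route
}

// fillMissingRoutes sets RouteID from the static tripID -> routeID map for
// every update that arrived without one. Updates that already carry a route
// ID are left untouched. Trip IDs that the static data does not know about
// are returned so the caller can decide how to handle them.
func fillMissingRoutes(updates []TripUpdate, staticRoutes map[string]string) []string {
	var unresolved []string
	for i := range updates {
		if updates[i].RouteID != "" {
			continue // the feed's own route ID takes precedence
		}
		if routeID, ok := staticRoutes[updates[i].TripID]; ok {
			updates[i].RouteID = routeID
		} else {
			unresolved = append(unresolved, updates[i].TripID)
		}
	}
	return unresolved
}

func main() {
	// Assumed static data: which route each scheduled trip belongs to.
	staticRoutes := map[string]string{"trip-1": "ER", "trip-2": "SB"}
	updates := []TripUpdate{
		{TripID: "trip-1"},                // route inferred from static data
		{TripID: "trip-2", RouteID: "SB"}, // route provided by the feed
		{TripID: "trip-3"},                // unknown to static data
	}
	unresolved := fillMissingRoutes(updates, staticRoutes)
	fmt.Println(updates, unresolved)
}
```

Because the check is made per update, a feed in which only some trips lack a route ID is handled by the same loop.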

jamespfennell commented 1 year ago

It's great to start working on this general idea of merging the static data with the realtime data. This is required for a bunch of other transit systems including the SF BART, LIRR and Metro-North, I believe. For those we need more than the route ID: I think we also need to import data from the stop times, because the realtime data only populates the delay field. That will obviously be a different PR; I just wanted to note that there is other work in this area.

The PR looks great. My only comment is that I think we should not have an additional config parameter, and that the update logic should automatically detect whether it needs to fetch static data, as you mention in the description. I think this won't be that hard - we just need to loop over the data once and collect the trip IDs whose route ID is missing? Does that sound reasonable? The idea is that users of Transiter shouldn't have to peek into the data, figure out what's going on, and work out which configs they need to set - it should work out of the box.
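The single pass suggested above could look roughly like the following. This is a sketch under assumptions, not Transiter's actual update code: `TripUpdate` and `tripIDsMissingRoute` are hypothetical names, and the real update types will differ.

```go
package main

import "fmt"

// TripUpdate is a hypothetical, simplified stand-in for a realtime trip update.
type TripUpdate struct {
	TripID  string
	RouteID string // empty when the realtime feed omits the route
}

// tripIDsMissingRoute loops over the updates once and returns the distinct
// trip IDs whose route ID is missing, in first-seen order. An empty result
// means no static data lookup is needed for this update.
func tripIDsMissingRoute(updates []TripUpdate) []string {
	seen := map[string]bool{}
	var ids []string
	for _, u := range updates {
		if u.RouteID != "" || seen[u.TripID] {
			continue
		}
		seen[u.TripID] = true
		ids = append(ids, u.TripID)
	}
	return ids
}

func main() {
	updates := []TripUpdate{
		{TripID: "trip-1"},
		{TripID: "trip-2", RouteID: "SB"},
		{TripID: "trip-1"}, // duplicate trip ID, collected only once
	}
	missing := tripIDsMissingRoute(updates)
	if len(missing) > 0 {
		// Only in this case would the static data need to be queried.
		fmt.Println("need static routes for:", missing)
	}
}
```

Collecting the IDs first means the static data can be fetched in one lookup, and skipped entirely when every trip already has a route ID, so no config flag is needed.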

cedarbaum commented 1 year ago

This makes sense! I've updated the PR to automatically detect these cases and fall back to static data.

I also handled the case where there is mixed data (e.g., some trips have route IDs and others don't). This is probably rare, but I don't think it's much more complicated to handle. It did involve updating the test infrastructure to verify multiple trips per update (see "some trips with route id, some without" as an example), which unfortunately made the test code diff a bit messier. I think it also relates to your other comment in #122 about potentially unifying the trip and vehicle test code, which are now even closer in structure.

jamespfennell commented 1 year ago

Super, looks great!

jamespfennell / transiter

Allow route inference from static data in trip updates #121