jamespfennell / transiter

Web service for transit data
https://demo.transiter.dev
MIT License

[GTFS Full Support] Expose the schedule in the HTTP API #11

Open jamespfennell opened 5 years ago

jamespfennell commented 5 years ago

Probably through two endpoints:

This will need a lot of optional GET query parameters to be useful. For example, callers should be able to specify which times the schedule is wanted for.

jamespfennell commented 5 years ago

The New York/New Jersey PATH train has only schedule data, so it would make an excellent example system for this work.

jamespfennell commented 4 years ago

...or should the data just be in the usual location? Realtime data and static are supposed to be shown together. How will consumers know which static entries should be ignored as having been superseded by a realtime entry?

jamespfennell commented 4 years ago

All static entries should have a realtime_trip field. This is how we can tell which entries have been "stolen" by the realtime feed.

For v1 just create a separate endpoint and rely on consumers merging the data themselves.

jamespfennell commented 1 year ago

https://github.com/jamespfennell/transiter/pull/119 adds the static data into the database, so we could start working on this now if we wanted.

I think we should design the API in advance before doing the work, because I don't think it's obvious what the API here should be. For Transiter's existing entities (stops, realtime trips, vehicles etc.), the API essentially just returns the data from the relevant GTFS feeds. For scheduled trips this is not quite right though. The question we want the API to answer is: "what scheduled trips are running right now?" Just returning a list of all the scheduled trips doesn't answer the question (unless we require callers to calculate it themselves).
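To make the "running right now" question concrete, here is a rough sketch of the calculation callers would otherwise have to do themselves. It assumes each scheduled trip has been reduced to its first departure and last arrival in GTFS `HH:MM:SS` form, which can exceed 24:00:00 for trips that run past midnight. `ScheduledTrip`, `parse_gtfs_time` and `running_now` are hypothetical names, not part of Transiter, and service calendars are ignored entirely.

```python
from dataclasses import dataclass


def parse_gtfs_time(value: str) -> int:
    """Convert a GTFS HH:MM:SS string to seconds since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass
class ScheduledTrip:
    # Hypothetical, simplified view of a trip built from stop_times.txt.
    trip_id: str
    first_departure: str  # e.g. "23:50:00"
    last_arrival: str     # e.g. "24:30:00" (past midnight, same service day)


def running_now(trips, now_seconds):
    """Return the trips whose scheduled span contains now_seconds."""
    return [
        trip
        for trip in trips
        if parse_gtfs_time(trip.first_departure)
        <= now_seconds
        <= parse_gtfs_time(trip.last_arrival)
    ]
```

A real implementation would also need to check which services are active on the current date before applying this filter.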

One idea would be to generalize the Trip message in the API so that it can represent both a realtime trip and a scheduled trip that is currently running. We would add a source field to the trip whose value is either REALTIME or SCHEDULE. In the handler for, say, ListStop, in addition to returning the realtime trips we would also return the scheduled trips that will call at the stop in the next N minutes (N can be passed in the API).
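As an illustration of this idea, the sketch below models a generalized trip with a `source` field and a stop handler that returns every realtime trip plus the scheduled trips arriving within the next N minutes. The `Trip` shape, the `arrival` field and `trips_for_stop` are assumptions made for the example; they are not the actual Transiter message or handler.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Source(Enum):
    REALTIME = "REALTIME"
    SCHEDULE = "SCHEDULE"


@dataclass
class Trip:
    # Hypothetical generalized trip; the real API message has more fields.
    trip_id: str
    source: Source
    arrival: datetime  # arrival time at the stop being listed


def trips_for_stop(realtime, scheduled, now, minutes):
    """Return all realtime trips plus scheduled trips arriving within `minutes` of `now`."""
    cutoff = now + timedelta(minutes=minutes)
    upcoming = [trip for trip in scheduled if now <= trip.arrival <= cutoff]
    # Sort the combined list so consumers see trips in arrival order.
    return sorted(realtime + upcoming, key=lambda trip: trip.arrival)
```

Consumers that only care about one kind of trip can filter on `source` without needing a second endpoint.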

I think this is sort of how GTFS is supposed to work. The schedule is provided in GTFS static feeds, and in the absence of realtime data the schedule is taken as the source of truth. When realtime data is provided, each realtime trip is supposed to be associated with a scheduled trip - the realtime trip "supersedes" the scheduled trip in a sense, and becomes the source of truth. In the Transiter API we would return (a) all realtime trips and (b) all scheduled trips that haven't been replaced by a realtime trip.
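The (a) plus (b) rule can be sketched as a small merge. This assumes a realtime trip supersedes a scheduled trip exactly when the two share a trip ID, which is the simplest possible association and may not hold for every feed. `merge_trips` is a hypothetical helper, not existing Transiter code.

```python
def merge_trips(realtime_trip_ids, scheduled_trip_ids):
    """Return (trip_id, source) pairs: all realtime trips, then scheduled trips not superseded."""
    # Assumption: matching trip IDs mean the realtime trip replaces the scheduled one.
    superseded = set(realtime_trip_ids)
    remaining = [
        trip_id for trip_id in scheduled_trip_ids if trip_id not in superseded
    ]
    return [(trip_id, "REALTIME") for trip_id in realtime_trip_ids] + [
        (trip_id, "SCHEDULE") for trip_id in remaining
    ]
```

Realtime trips with no scheduled counterpart (for example, added service) pass through unchanged, and a system with only schedule data, such as PATH, would return every trip with source SCHEDULE.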