derhuerst / gtfs-rt-differential-to-full-dataset

Transform a stream of DIFFERENTIAL-mode GTFS Realtime (GTFS-RT) FeedEntities into a FULL_DATASET-mode feed.
ISC License

strictly follow DIFFERENTIAL draft spec #1

Open derhuerst opened 4 years ago

derhuerst commented 4 years ago

We should carefully read the draft DIFFERENTIAL spec and make sure this package follows it. There are already too many homegrown, almost-compatible GTFS-RT tools out there; this package shouldn't be another one!

derhuerst commented 4 years ago

Some comments from the draft Google doc:

Finally, the consumer should ensure that updates received together in a single FeedMessage become visible to an end user simultaneously. The user should never observe a state where some updates in a FeedMessage have been applied but others have not.
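One way to meet this requirement is to build the next state on a copy and only then swap it in. The following is a minimal sketch, not this package's actual implementation: `applyFeedMessage`, the `keyOf` parameter, and the plain-object message shape (snake_case fields as in the GTFS-RT `.proto` file, enums as strings) are all assumptions made for illustration.

```javascript
// Hypothetical helper: apply all entities of one FeedMessage atomically.
// `state` is a Map from an entity key to the latest FeedEntity. Readers
// only ever get a Map returned by a completed call.
const applyFeedMessage = (state, feedMessage, keyOf = (entity) => entity.id) => {
	// Work on a copy, so anyone holding `state` never sees a partial update.
	const next = new Map(state)
	for (const entity of feedMessage.entity || []) {
		next.set(keyOf(entity), entity)
	}
	return next
}

// Usage: the visible state is replaced in a single assignment.
let visible = new Map()
visible = applyFeedMessage(visible, {
	header: {gtfs_realtime_version: '2.0', incrementality: 'DIFFERENTIAL'},
	entity: [
		{id: 'a', vehicle: {vehicle: {id: 'v1'}}},
		{id: 'b', vehicle: {vehicle: {id: 'v2'}}},
	],
})
```

If processing throws halfway through a message, the previous Map is still the visible one, so no half-applied message leaks out. Copying the whole Map on every message is the simplest approach, but it may be too slow for large feeds.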

The GTFS-RT reference document states that the FeedEntity.id and FeedEntity.isDeleted fields are only relevant to incremental feeds. However, given the differential semantics outlined above there is not a clear need to explicitly delete RT messages (other than Alerts, see discussion below) as one can always simply provide an update that will supersede them.

TripUpdates: Both scheduled and added trips can be CANCELED by sending another TripUpdate message referencing the same trip ID.

VehiclePositions: When a message is received saying the vehicle is STOPPED_AT the last stop in a trip, or saying that a vehicle has moved on to a different trip, its state with respect to that first trip is effectively removed. However, it might be helpful to provide a means to unambiguously signal that a vehicle has gone out of service.

Alerts: Unlike the other two message types, multiple Alerts may accumulate to the same GTFS entity. Without alert IDs and deletion flags, it would not be possible to remove them in differential mode. We can completely sidestep the problem by simply not using differential Alerts. Alert datasets are generally small and do not benefit greatly from low update latency. They seem to be a good use case for the FULL_DATASET provider push combination.
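The per-type rules quoted above could be sketched roughly as follows. This is an illustration under assumptions, not the draft's or this package's definitive behaviour: `applyDifferentialEntity` and the `state` shape are hypothetical, entities are plain objects with snake_case fields and string enums, TripUpdates are keyed by trip ID only, and VehiclePositions are keyed by vehicle ID. Detecting "STOPPED_AT the last stop" needs schedule data and is left out.

```javascript
// Hypothetical sketch: apply one DIFFERENTIAL FeedEntity to in-memory state.
const applyDifferentialEntity = (state, entity) => {
	if (entity.trip_update) {
		// A later TripUpdate for the same trip ID supersedes the earlier one,
		// including one whose schedule_relationship is CANCELED.
		state.tripUpdates.set(entity.trip_update.trip.trip_id, entity)
	} else if (entity.vehicle) {
		// Keyed by vehicle ID (an assumption), so a position on a new trip
		// replaces the vehicle's state with respect to the previous trip.
		state.vehiclePositions.set(entity.vehicle.vehicle.id, entity)
	} else if (entity.alert) {
		// Differential Alerts are skipped, following the draft's suggestion.
		state.ignoredAlerts += 1
	}
	return state
}

// Usage: a trip is updated, then cancelled; a vehicle moves on to another trip.
const state = {tripUpdates: new Map(), vehiclePositions: new Map(), ignoredAlerts: 0}
const entities = [
	{id: '1', trip_update: {trip: {trip_id: 't1'}, stop_time_update: []}},
	{id: '2', trip_update: {trip: {trip_id: 't1', schedule_relationship: 'CANCELED'}}},
	{id: '3', vehicle: {vehicle: {id: 'v1'}, trip: {trip_id: 't1'}}},
	{id: '4', vehicle: {vehicle: {id: 'v1'}, trip: {trip_id: 't2'}}},
	{id: '5', alert: {header_text: {translation: [{text: 'Elevator out of service'}]}}},
]
for (const entity of entities) applyDifferentialEntity(state, entity)
```

Note that nothing in this sketch ever removes a vehicle, which is exactly the "gone out of service" gap the draft points out.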

I think these steps must be implemented for full compatibility:

good to know:

  1. Differential vs. Incremental. The reference and specification use both terms: differential and incremental. We should choose only one and use it consistently. These same terms are used in backup systems, and the behavior here seems closer to incremental backups where only individual files that have changed since the last backup (be it incremental, differential, or full) are included. In some sense, GTFS-RT as a whole is “differential” with respect to scheduled GTFS, and the new mode we are discussing is “incremental” with respect to a GTFS-RT FULL_DATASET.

derhuerst commented 3 years ago

https://github.com/opentripplanner/OpenTripPlanner/blob/4e17541d3213b9d443dd95f4e0bce079e1ac10d0/pom.xml#L598

derhuerst commented 2 years ago

The OTP docs also refer to this library, which implements the OTP-specific "GTFS-RT FeedEntities over WebSockets" style: https://github.com/OneBusAway/onebusaway-gtfs-realtime-exporter

derhuerst commented 1 year ago

https://transport.data.gouv.fr/explore displays VehiclePositions from a feed on a map. It receives them via a custom WebSocket-based protocol:

https://github.com/etalab/transport-site/blob/f83c05936883e4e851ca65db760b8e5b36886f27/apps/transport/client/javascripts/explore.js#L83-L90

derhuerst commented 3 weeks ago

https://github.com/opentripplanner/OpenTripPlanner/blob/8164d41b4f9ad7e5c946dfeec4e9dd567b38e6f5/src/main/java/org/opentripplanner/updater/trip/MqttGtfsRealtimeUpdater.java#L153-L163