google / transit

TripDescriptor.start_date matching between GTFS-RT + GTFS-static #381

Open felixguendling opened 1 year ago

felixguendling commented 1 year ago

Hi everyone!

I have a question regarding the TripDescriptor.start_date. Currently, the documentation tells me this:

https://gtfs.org/realtime/reference/#message-tripdescriptor

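Two questions:

  • Some fields in the specification are explicitly specified to be in UTC. For the rest I can assume that they refer to the GTFS-static local times?
  • What happens in case the first departure time is greater than 24:00:00? The start date is the one specified in the service_id or the "real" start date (which would be for example on the next day)?
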
Example:

calendar_dates.txt

service_id,date,exception_type
S,20190331,1

trips.txt

route_id,service_id,trip_id,trip_headsign,block_id
R_RE1,S,T_RE1,RE 1,1
R_RE2,S,T_RE2,RE 2,1

stop_times.txt

trip_id,arrival_time,departure_time,stop_id,stop_sequence,pickup_type,drop_off_type
T_RE1,00:00:00,00:00:00,A,1,0,0
T_RE1,48:30:00,48:30:00,B,2,0,0
T_RE2,48:30:00,72:30:00,B,1,0,0
T_RE2,72:30:00,96:30:00,C,2,0,0

Now the first departure time of trip T_RE2 would be on

from service_id | UTC First Departure translated with offset at noon -02:00 | Local First Departure
2019-03-31 | 2019-04-01 | 2019-04-02

I would guess that GTFS-RT is supposed to give me the date from service_id (2019-03-31) to refer to the trip. However, both other options would also be reasonable in a way.
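
As a rough sketch of where the three candidate dates come from, the Python snippet below takes 48:30:00 (the first time listed for T_RE2 in stop_times.txt) and counts it from "noon minus 12h" on the service day. It assumes a fixed local offset of UTC+02:00, matching the -02:00 translation mentioned above. The helper names parse_gtfs_time and first_stop_datetime are made up for illustration and are not part of any GTFS library.

```python
from datetime import date, datetime, time, timedelta, timezone


def parse_gtfs_time(value: str) -> timedelta:
    """Parse a GTFS HH:MM:SS value; hours may exceed 24."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def first_stop_datetime(service_date: date, gtfs_time: str, local_tz: timezone) -> datetime:
    """Resolve a GTFS time relative to 'noon minus 12h' on the service day."""
    noon = datetime.combine(service_date, time(12), tzinfo=local_tz)
    return noon - timedelta(hours=12) + parse_gtfs_time(gtfs_time)


# Assumption: the local offset at noon on the service day is UTC+02:00.
LOCAL_TZ = timezone(timedelta(hours=2))

service_date = date(2019, 3, 31)  # from calendar_dates.txt
local_dt = first_stop_datetime(service_date, "48:30:00", LOCAL_TZ)
utc_dt = local_dt.astimezone(timezone.utc)

print(service_date.isoformat())   # 2019-03-31 (from service_id)
print(utc_dt.date().isoformat())  # 2019-04-01 (UTC)
print(local_dt.date().isoformat())  # 2019-04-02 (local)
```

With a real timezone database instead of a fixed offset, the "noon minus 12h" anchor is what keeps the result stable on days with a daylight saving switch.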


I have not been able to get this information from the specification. Can someone please clarify how to handle these cases?

derhuerst commented 1 year ago

(This is not Google's or MobilityData's position on this, merely my personal interpretation of the spec and the surrounding materials & discussions.)

  • Some fields in the specification are explicitly specified to be in UTC. For the rest I can assume that they refer to the GTFS-static local times?

Side note: As discussed very recently in https://github.com/google/transit/issues/322, there is a difference between agency_timezone and stop_timezone.

I would assume that all Time fields in GTFS that can be tied to an agency (e.g. stop_times.*, frequencies.*) are "relative to" the timezone specified in agency_timezone.


  • What happens in case the first departure time is greater than 24:00:00? The start date is the one specified in the service_id or the "real" start date (which would be for example on the next day)?

I would expect a TripDescriptor.start_date to always refer to the service day of the scheduled trip "run" in the corresponding GTFS Schedule dataset.

Let's consider a specific example: If there is a scheduled trip t1 with a first departure of 26:00:01 that runs on service days 20230606 ("run" A, effectively starting at 2023-06-07T02:00:01) and 20230607 ("run" B, effectively starting at 2023-06-08T02:00:01), I would expect a GTFS-RT TripDescriptor

If my assumptions are right, then technically the TripDescriptor.start_{date,time} merely describe a local (as in "wall clock time") date+time, as the timezone is only indirectly implied via the GTFS Schedule dataset's agency_timezone. Therefore, start_{date,time} are to be treated rather like a foreign key uniquely referencing a single "run" in the GTFS Schedule dataset.
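
To illustrate the "foreign key" reading, here is a minimal Python sketch that builds the set of scheduled "runs" as (trip_id, service day) pairs from the example feed earlier in this thread and checks a TripDescriptor-like pair against it. The functions build_run_index and matches_scheduled_run are hypothetical names. The sketch only handles added dates (exception_type 1) in calendar_dates.txt and ignores calendar.txt and removed dates, so it is not a complete matcher.

```python
import csv
import io

# Sample data copied from the example feed in this thread.
CALENDAR_DATES = """service_id,date,exception_type
S,20190331,1
"""

TRIPS = """route_id,service_id,trip_id,trip_headsign,block_id
R_RE1,S,T_RE1,RE 1,1
R_RE2,S,T_RE2,RE 2,1
"""


def build_run_index(calendar_dates_csv: str, trips_csv: str) -> set:
    """Return the set of (trip_id, service_day) pairs, one per scheduled run."""
    service_days = {}
    for row in csv.DictReader(io.StringIO(calendar_dates_csv)):
        # Simplification: only "service added" exceptions are considered.
        if row["exception_type"] == "1":
            service_days.setdefault(row["service_id"], set()).add(row["date"])

    runs = set()
    for row in csv.DictReader(io.StringIO(trips_csv)):
        for day in service_days.get(row["service_id"], ()):
            runs.add((row["trip_id"], day))
    return runs


def matches_scheduled_run(runs: set, trip_id: str, start_date: str) -> bool:
    """Treat (trip_id, start_date) as a key into the scheduled runs."""
    return (trip_id, start_date) in runs


RUNS = build_run_index(CALENDAR_DATES, TRIPS)

# The service day identifies the run, even though the first stop time is >= 24:00:00.
print(matches_scheduled_run(RUNS, "T_RE2", "20190331"))  # True
# The calendar date of the actual first departure is not a valid key here.
print(matches_scheduled_run(RUNS, "T_RE2", "20190402"))  # False
```

Under this reading, the lookup never touches stop_times.txt, which is why times beyond 24:00:00 do not affect which start_date identifies a run.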

felixguendling commented 1 year ago

Thank you very much @derhuerst! I think it absolutely makes sense if you interpret start_date more like a database key (derived from service_id via calendar_dates.txt and calendar.txt entries, no matter if the first departure from stop_times.txt is greater or smaller than 24:00:00) and not like a real date identifying the day on which the trip has its first departure (in whatever timezone). It might make sense to add a clarification to the standard and leave the issue open until then?