Closed derhuerst closed 1 year ago
On Saturday, June 6, 2020 8:05:27 PM CEST, Jannis Redmann wrote:
Is my understanding of the semantics correct? If it is, I'd argue that this is quite unintuitive and therefore easy to implement in a wrong way. If I misunderstood how `frequencies.txt` works, let's improve the documentation.
Could you give an example how it could be interpreted in a wrong way?
Let's consider an excerpt from the example feed linked in the spec:
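trip_id | arrival_time | departure_time | stop_id | stop_sequence | pickup_type | drop_off_type |
---|---|---|---|---|---|---|
AWE1 | 0:06:10 | 0:06:10 | S1 | 1 | 0 | 0 |
AWE1 | | | S2 | 2 | 1 | 3 |
AWE1 | 0:06:20 | 0:06:30 | S3 | 3 | 0 | 0 |
AWE1 | | | S5 | 4 | 0 | 0 |
AWE1 | 0:06:45 | 0:06:45 | S6 | 5 | 0 | 0 |

trip_id | start_time | end_time | headway_secs |
---|---|---|---|
AWE1 | 05:30:00 | 06:30:00 | 300 |
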
If I assume that all `stop_times.txt`/`frequencies.txt` entries for `AWE1` describe one "run" of one vehicle that I can use continuously, then I could conclude that I can stay in the vehicle from `05:30:00` (earliest start in time frame) until `7:00:00` (latest start in time frame + 35min). This is not the case I assume?
It is not one run (or block). It is a normalisation form of transit data, including a confidence interval for the arrival time of the next trip. Remaining in the vehicle for no particular reason is probably allowed if you have a day ticket, but that is not what this structure (or GTFS) explicitly defines.
It is not one run (or block).
(Not sure what exactly you mean by "run" here, but I will assume you mean what I tried to explain.)
In the GTFS ecosystem, I have often observed the assumption that one GTFS `trip` corresponds to exactly one "run". Or in plain English: that one GTFS `trip` means that one vehicle will continuously visit all stops in the `trip`, without any other trips in between and without additional stops before or after; that after the vehicle has visited all stops in the `trip`, the "run" is "over".
Making that assumption would probably lead to routing errors (e.g. routes that I actually can't take or that are physically impossible) & unintuitive UIs (e.g. showing the first stop of the trip between other later stops, because another "run" in a time frame of compressed data has started).
If this assumption is not to be made, meaning the `stop_times`/`frequencies` feature of GTFS is purely a "normalisation form" to describe when & where any appropriate vehicle of a line will stop, IMO we should clarify this better in the documentation.
(All of this does of course not apply anyways to different schemes of sending vehicles around, like circle-based lines or lines split up by direction.)
GTFS Best Practices offer the below.

Field Name | Recommendation |
---|---|
block_id | Can be provided for frequency-based trips. |

So, that means the following example is valid, and indicates a continuous loop where passengers can stay onboard at `stop_A`.

trip_id | arrival_time | departure_time | stop_id | stop_sequence |
---|---|---|---|---|
trip_1 | 06:10:00 | 06:10:00 | stop_A | 1 |
trip_1 | 06:15:00 | 06:15:00 | stop_B | 2 |
trip_1 | 06:20:00 | 06:20:00 | stop_C | 3 |
trip_1 | 06:25:00 | 06:25:00 | stop_D | 4 |
trip_1 | 06:30:00 | 06:30:00 | stop_E | 5 |
trip_1 | 06:35:00 | 06:35:00 | stop_F | 6 |
trip_1 | 06:40:00 | 06:40:00 | stop_A | 7 |

route_id | trip_id | service_id | block_id |
---|---|---|---|
red_loop | trip_1 | weekday | red_loop_block |

trip_id | start_time | end_time | exact_times | headway_secs |
---|---|---|---|---|
trip_1 | 6:10 | 18:40 | 1 | 1800 |

If `headway_secs` was less than the duration of `trip_1` (30 min) then this would be an error. The spec doesn't support that right now.

`exact_times=1` trips defined in `frequencies.txt` should be treated the same way as trips defined in a GTFS that doesn't include the `frequencies.txt` file - you just "unroll" the pattern defined in `stop_times.txt` into individual trips from the start to end time defined in `frequencies.txt`, with the start time for each individual trip being `headway_secs` apart. Note that `arrival_time` and `departure_time` in this case don't refer to absolute times, but rather exist to define the travel time between each stop in the trip. I agree that the documentation could be improved, including examples, to make this clearer.
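
To illustrate the "unrolling" described above, here is a minimal Python sketch using the `trip_1` example. It is only an illustration under a few assumptions of mine, not something the spec prescribes: the helper names `parse_time`, `format_time` and `unroll` are made up, the first row's `departure_time` is used as the reference point for the offsets, and `end_time` is treated as exclusive (no run starts exactly at `end_time`).

```python
def parse_time(text):
    """Convert a GTFS-style H:MM or H:MM:SS string to seconds since midnight."""
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:
        parts.append(0)  # tolerate "6:10" as written in the example table
    hours, minutes, seconds = parts
    return hours * 3600 + minutes * 60 + seconds


def format_time(seconds):
    """Convert seconds since midnight back to HH:MM:SS."""
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def unroll(stop_times, start_time, end_time, headway_secs):
    """Expand one stop_times.txt pattern into individual runs.

    stop_times: list of (arrival_time, departure_time, stop_id) in stop order.
    Assumption: end_time is exclusive and offsets are taken relative to the
    first row's departure_time.
    """
    base = parse_time(stop_times[0][1])
    offsets = [
        (parse_time(arr) - base, parse_time(dep) - base, stop_id)
        for arr, dep, stop_id in stop_times
    ]
    runs = []
    start = parse_time(start_time)
    end = parse_time(end_time)
    while start < end:
        runs.append(
            [(format_time(start + a), format_time(start + d), s) for a, d, s in offsets]
        )
        start += headway_secs
    return runs


TRIP_1 = [
    ("06:10:00", "06:10:00", "stop_A"),
    ("06:15:00", "06:15:00", "stop_B"),
    ("06:20:00", "06:20:00", "stop_C"),
    ("06:25:00", "06:25:00", "stop_D"),
    ("06:30:00", "06:30:00", "stop_E"),
    ("06:35:00", "06:35:00", "stop_F"),
    ("06:40:00", "06:40:00", "stop_A"),
]

runs = unroll(TRIP_1, "6:10", "18:40", 1800)
```

Each element of `runs` is one individual run of the pattern, all sharing the same `trip_id`; under the exclusive `end_time` assumption, the second run starts at `stop_A` at the moment the first one ends there.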
Note there is another open proposal to better define in-seat transfers and transfer rules at https://github.com/google/transit/pull/32.
`exact_times=1` trips defined in `frequencies.txt` should be treated the same way as trips defined in a GTFS that doesn't include the `frequencies.txt` file - you just "unroll" the pattern defined in `stop_times.txt` into individual trips [...].
Okay, thanks for clarification.
In this case, I advocate to state clearly in the documentation that one `trip_id` does not correspond to one "run" (which I tried to define above). From my subjective experience, this seems to be a quite natural assumption.
@derhuerst : I see what you mean. Perhaps a future modification to the spec or training materials could clarify this.
GTFS was created originally with passenger-facing applications in mind, so a "trip" refers to when a vehicle operates on a route. In passenger-facing information, that usually looks like a row on a timetable.
Operational schedules have runs, which would usually consist of multiple "trips" in the passenger-centric sense of GTFS.
Some (non-standard) GTFS datasets do include information on "runs" as you're thinking of them. Discussion in issue #195. Here is an example runcut.txt file: https://openmobilitydata.org/p/ventura-county-transportation-commission/792/latest/file/runcut.txt
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Keep open.
This issue has been closed due to inactivity. Issues can always be reopened after they have been closed.
I'm currently looking at creating a frequencies.txt file for a rail operation. Having the main reference point as trip_id really threw me for a loop as that is not what I had expected to see there. After looking at this for a while, I determined that it uses a trip_id to pull the required data which I believe is route, stop sequence, and the running time. I think this should be clarified as it would save a lot of trial and error for other users.
@huntrob consider this trip_id some kind of hash result using the same stop sequence, times between them and calendar. Then this template can be instantiated at different times.
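
One way to picture the "hash" analogy is the rough Python sketch below. It is purely illustrative: `pattern_key` is a hypothetical helper, not part of GTFS or any library, and the choice of SHA-1 and of the first departure as reference point are my own assumptions. The point is only that two runs with the same stops, the same times between stops and the same calendar map to the same key, no matter when they start.

```python
import hashlib


def pattern_key(service_id, stop_times):
    """Derive a key from calendar, stop sequence and times between stops.

    stop_times: list of (arrival_secs, departure_secs, stop_id) in stop order,
    with times given as seconds since midnight.
    Hypothetical helper for illustration only.
    """
    base = stop_times[0][1]  # first departure is the reference point
    parts = [service_id]
    for arrival, departure, stop_id in stop_times:
        # Only offsets relative to the first departure enter the key,
        # so the absolute start time of the run does not matter.
        parts.append(f"{stop_id}:{arrival - base}:{departure - base}")
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()[:12]


# The same pattern, started 30 minutes apart.
run_at_0610 = [(22200, 22200, "stop_A"), (22500, 22500, "stop_B")]
run_at_0640 = [(24000, 24000, "stop_A"), (24300, 24300, "stop_B")]

same = pattern_key("weekday", run_at_0610) == pattern_key("weekday", run_at_0640)
```

Here `same` is true, while changing the calendar (for example `"weekend"`) or any travel time would give a different key.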
I have a question about the semantics of `frequencies.txt` with `exact_times=1`.

From my experience with GTFS and my observations of GTFS-based UIs, a lot of tools seem to make the following assumption:

A single trip (as defined by a unique `trip_id` in the GTFS dataset) is one vehicle (or group of vehicles) that I can use without significant interruptions (such as waiting for another vehicle or changing to a different line). Often, a single trip is considered to be a vehicle which I can travel with for the whole duration of the `trip`.

The `frequencies.txt` documentation seems to undermine that assumption however:

I think this is especially important for routing engines: Now, they can't assume anymore that every GTFS data point referring to the same `trip_id` is tied to one vehicle allowing continuous travel. There are >=1 "runs" of a vehicle, all under the same `trip_id`, but each of them ends at the last stop specified in `stop_times.txt`.

Is my understanding of the semantics correct? If it is, I'd argue that this is quite unintuitive and therefore easy to implement in a wrong way. If I misunderstood how `frequencies.txt` works, let's improve the documentation.