MobilityData / gtfs-realtime-validator

Java-based tool that validates General Transit Feed Specification (GTFS)-realtime feeds

Should StopTimeUpdates be propagated across trips in the same block? #27

Open barbeau opened 2 years ago

barbeau commented 2 years ago

Issue by barbeau Monday Mar 27, 2017 at 17:14 GMT Originally opened as https://github.com/CUTR-at-USF/gtfs-realtime-validator/issues/90


Summary:

To my knowledge, currently it's not clear whether or not consumers should propagate StopTimeUpdates across trips in the same block. It IS clear that updates should be propagated within the same trip.

See my comment on item 4 here: https://groups.google.com/d/msg/gtfs-realtime/Ua8f2AFQ9U4/EDTPDuEcAgAJ

Excerpt:

In my opinion, the best practice is to propagate delays down the block while honoring any layovers (i.e., a layover might be able to absorb small delays, which would result in the vehicle departing the layover stop on time). IMHO early arrivals also shouldn't be propagated past stops with timepoint=1. However, neither of these behaviors is currently specified in GTFS-rt, and as a result consumers will behave differently.

For example, OneBusAway (https://onebusaway.org/) propagates delays across trips in the same block, but OpenTripPlanner (http://www.opentripplanner.org/) does not. We're using OTP for trip planning within OneBusAway, and we'd like to change OTP to match the behavior of OBA and propagate delays across trips in the same block. We often know that a bus is running really late (e.g., 20 minutes), and a user tries to plan a journey for a trip further down the block, but OTP prevents that real-time delay from showing up in the user's trip plan until the vehicle is actually running that trip. So we show no real-time info until only a few minutes before the bus actually arrives, at which point we suddenly show a huge delay.

I don't think this affects our validator at all, but I wanted to capture this here with the "GTFS-rt spec clarification" so I can revisit this in the GTFS-rt community.

barbeau commented 2 years ago

Comment by barbeau Wednesday Oct 10, 2018 at 14:39 GMT


I've drafted a diagram to help explain this issue - https://docs.google.com/presentation/d/1bLV8QaQkGARX1g1ThdQr9mf7oOCGs3db2VH9TsZ-tgw/edit

[Image: propagating delays across trips in same block]

barbeau commented 2 years ago

Comment by barbeau Thursday Oct 11, 2018 at 16:40 GMT


I was looking at how OneBusAway handles early arrivals for Trip A (i.e., a negative delay value).

ArrivalAndDepartureServiceImpl line 899 says:

  private int propagateScheduleDeviationForwardWithSlack(int scheduleDeviation,
      int slack) {
    /**
     * If the vehicle is running early and there is slack built into the
     * schedule, we guess that the vehicle will take that opportunity to pause
     * and let the schedule catch back up. If there is no slack, assume we'll
     * continue to run early.
     */
    if (scheduleDeviation < 0) {
      if (slack > 0)
        return 0;
      return scheduleDeviation;
    }

    /**
     * If we're running behind schedule, we allow any slack to eat up part of
     * our delay.
     */
    return Math.max(0, scheduleDeviation - slack);
  }
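
To illustrate how this rule could carry a deviation from one trip to the next trips in a block, here is a minimal sketch. It assumes the only slack considered is the layover between consecutive trips, with all times in seconds. The class and the `propagateDownBlock` helper are hypothetical and are not part of OneBusAway; only the slack rule itself mirrors the method quoted above.

```java
import java.util.List;

// Illustrative sketch only: class and helper names are hypothetical,
// not OneBusAway's actual API. All times are in seconds.
class BlockDelayPropagation {

    // Same rule as the OneBusAway method quoted above.
    static int propagateWithSlack(int scheduleDeviation, int slack) {
        if (scheduleDeviation < 0) {
            // Early: any slack lets the schedule catch up; otherwise stay early.
            return slack > 0 ? 0 : scheduleDeviation;
        }
        // Late: slack absorbs part (or all) of the delay.
        return Math.max(0, scheduleDeviation - slack);
    }

    // Carries the deviation at the end of one trip through each layover in
    // turn and returns the predicted deviation after the last layover.
    static int propagateDownBlock(int deviationAtEndOfTrip, List<Integer> layoverSlacks) {
        int deviation = deviationAtEndOfTrip;
        for (int slack : layoverSlacks) {
            deviation = propagateWithSlack(deviation, slack);
        }
        return deviation;
    }

    public static void main(String[] args) {
        // 20 minutes late, followed by a 5-minute and then a 3-minute layover.
        System.out.println(propagateDownBlock(1200, List.of(300, 180))); // 720
        // 2 minutes early with a 5-minute layover: assumed to depart on time.
        System.out.println(propagateDownBlock(-120, List.of(300)));      // 0
    }
}
```

A real consumer would also need to account for slack at stops within each trip, which this sketch ignores.
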
barbeau commented 2 years ago

Comment by barbeau Thursday Oct 18, 2018 at 19:22 GMT


GTFS-rt proposal now open at https://github.com/google/transit/pull/110.

barbeau commented 2 years ago

Comment by minhhpham Monday Oct 29, 2018 at 15:03 GMT


I added a rule to this branch to check whether a trip_update contains only one stop_time_update and found 28 agencies do this:

| Feed | Location |
| --- | --- |
| ART Trip Updates | Arlington, VA, USA |
| Metro Transit Trip Updates | Madison, WI, USA |
| OCTA Trip Updates | Orange County, CA, USA |
| Thunder Bay Transit Trip Updates | Thunder Bay, ON, Canada |
| GRT Trip Updates | Waterloo, ON, Canada |
| HART Trip Updates | Tampa, FL, USA |
| OVapi Trip Updates | The Netherlands |
| Big Blue Bus Trip Updates | Santa Monica, CA, USA |
| Saskatoon Transit Trip Updates | Saskatoon, SK, Canada |
| BURT Trip Updates | Burlington, ON, Canada |
| LTD Trip Updates | Eugene, OR, USA |
| CT Transit Hartford Trip Updates | Connecticut, USA |
| RIPTA Trip Updates | Providence, RI, USA |
| People Mover Trip Updates | Anchorage, AK, USA |
| Metro Transit Trip Updates | Saint Louis, MO, USA |
| Capital Metro Trip Updates | Austin, TX, USA |
| Barrie Transit Trip Updates | Barrie, ON, Canada |
| VIA Trip Updates | San Antonio, TX, USA |
| ETS Trip Updates | Edmonton, AB, Canada |
| Luxembourg Trip Updates | Luxembourg |
| YRTViva Trip Updates | York, Toronto, ON, Canada |
| MetroTransit Trip Updates | Halifax, NS, Canada |
| Kingston Transit Trip Updates | Kingston, ON, Canada |
| Votran Trip Updates | Daytona Beach, FL, USA |
| MST Trip Updates | Monterey, CA, USA |
| BART Trip Updates | San Francisco, CA, USA |
| NYC Ferry Trip Updates | New York, NY, USA |
| MBTA Trip Updates | Boston, MA, USA |

barbeau commented 2 years ago

Comment by barbeau Wednesday Oct 31, 2018 at 16:03 GMT


@minhhpham could you add a link to the snippets Gist here too? Also, please review the data and mark any agencies that have more than one stop_time_update. I know that MBTA, and I think OVapi, have multiple stop_time_updates for some (maybe most) trips. The tool may have flagged some of these as having only one stop_time_update because a single trip had only one stop_time_update, but other trips could have had more.

barbeau commented 2 years ago

Comment by minhhpham Friday Nov 02, 2018 at 18:06 GMT


The previous comment lists agencies with at least one trip_update that contains only one stop_time_update. Here is the list of agencies for which every trip_update contains only one stop_time_update.

| Feed | Location |
| --- | --- |
| OCTA Trip Updates | Orange County, CA, USA |
| HART Trip Updates | Tampa, FL, USA |
| Big Blue Bus Trip Updates | Santa Monica, CA, USA |
| People Mover Trip Updates | Anchorage, AK, USA |
| Metro Transit Trip Updates | Saint Louis, MO, USA |

I also updated the validator branch.
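
The difference between the two lists ("at least one" versus "every" trip_update) can be sketched as follows. This is only an illustration: the feed is modeled as a hypothetical map from trip_id to the number of stop_time_updates in that trip's trip_update, standing in for the real GTFS-realtime protobuf classes, and it is not the rule in the validator branch.

```java
import java.util.Map;

// Illustrative sketch only: a feed is modeled as a map from trip_id to the
// number of stop_time_updates in that trip's trip_update. This is a
// hypothetical stand-in for the real GTFS-realtime protobuf classes.
class SingleStopTimeUpdateCheck {

    // True if at least one trip_update has exactly one stop_time_update.
    static boolean anySingle(Map<String, Integer> stopTimeUpdateCounts) {
        return stopTimeUpdateCounts.values().stream().anyMatch(n -> n == 1);
    }

    // True if the feed is non-empty and every trip_update has exactly one
    // stop_time_update.
    static boolean allSingle(Map<String, Integer> stopTimeUpdateCounts) {
        return !stopTimeUpdateCounts.isEmpty()
                && stopTimeUpdateCounts.values().stream().allMatch(n -> n == 1);
    }

    public static void main(String[] args) {
        // One trip with a single update, one trip with many.
        Map<String, Integer> mixedFeed = Map.of("trip_a", 1, "trip_b", 12);
        System.out.println(anySingle(mixedFeed)); // true
        System.out.println(allSingle(mixedFeed)); // false

        // Every trip has a single update.
        Map<String, Integer> singleOnlyFeed = Map.of("trip_c", 1, "trip_d", 1);
        System.out.println(allSingle(singleOnlyFeed)); // true
    }
}
```

A feed like `mixedFeed` would appear in the first list only, while `singleOnlyFeed` would appear in both.
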

barbeau commented 2 years ago

Comment by minhhpham Friday Nov 02, 2018 at 18:16 GMT


Here's the Gist: https://gist.github.com/minhhpham/1f3baa8715edf859087ba1dd061b2cb1

barbeau commented 2 years ago

Comment by barbeau Wednesday Nov 07, 2018 at 15:29 GMT


@minhhpham Thanks for pulling this together!

I think we need one more piece of data here: how many agencies provide feeds that give more than one trip_update simultaneously for the same vehicle_id in the TripUpdate feed (just looking at the vehicle_id within the trip_update should be sufficient). The propagation issue described here doesn't apply to those agencies, or at least it isn't as bad.

Ideally we'd check what % of vehicles are providing trip_updates for more than one trip simultaneously, but I think that's harder to track given our current tool, unless maybe we write the rule to log errors for feeds with less than X% of vehicles having more than one trip_update at a time. Also, this issue is the biggest problem when a vehicle gets to the end of the trip. It may be OK to have only one trip_update for a vehicle when the vehicle is at the very beginning of the trip, depending on the trip length. Again, this gets more complicated though.

barbeau commented 2 years ago

Comment by minhhpham Wednesday Nov 07, 2018 at 18:27 GMT


For the % of vehicles, I think we can just count the total number of trip_updates and the number of unique vehicle_ids, then write them to a text file. This can be done within TripDescriptorValidator.java. The only problem is that I cannot get the agency name when running that script. So the output will be two columns: total number of trip_updates and number of unique vehicle_ids.
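
A rough sketch of that counting step, assuming the feed has already been reduced to a list holding the vehicle_id of each trip_update (null or empty when the optional field is absent). The class and method names are hypothetical and this is not the code in TripDescriptorValidator.java.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names are hypothetical, and the input is a plain
// list with one vehicle_id per trip_update rather than a parsed feed.
class VehicleTripUpdateStats {

    // Counts how many trip_updates reference each vehicle_id, skipping
    // trip_updates where the optional vehicle_id is absent (null or empty).
    static Map<String, Integer> countByVehicle(List<String> vehicleIds) {
        Map<String, Integer> counts = new HashMap<>();
        for (String id : vehicleIds) {
            if (id == null || id.isEmpty()) {
                continue;
            }
            counts.merge(id, 1, Integer::sum);
        }
        return counts;
    }

    // Number of distinct vehicles that have more than one trip_update.
    static long vehiclesWithMultipleTrips(Map<String, Integer> counts) {
        return counts.values().stream().filter(n -> n > 1).count();
    }

    // Rounded percentage of distinct vehicles with more than one trip_update;
    // returns 0 when no trip_update carries a vehicle_id.
    static long percentWithMultipleTrips(Map<String, Integer> counts) {
        if (counts.isEmpty()) {
            return 0;
        }
        return Math.round(100.0 * vehiclesWithMultipleTrips(counts) / counts.size());
    }

    public static void main(String[] args) {
        List<String> vehicleIds = Arrays.asList("bus1", "bus1", "bus2", null, "bus3");
        Map<String, Integer> counts = countByVehicle(vehicleIds);
        System.out.println(vehicleIds.size());                 // 5 trip_updates
        System.out.println(counts.size());                     // 3 unique vehicle_ids
        System.out.println(vehiclesWithMultipleTrips(counts)); // 1
        System.out.println(percentWithMultipleTrips(counts));  // 33
    }
}
```

An empty result from `countByVehicle` means no trip_update carried a vehicle_id, so nothing can be concluded about simultaneous trips for that feed.
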

barbeau commented 2 years ago

Comment by barbeau Wednesday Nov 07, 2018 at 20:05 GMT


Within the validate() method you should be able to get agency names for the GTFS data with:

        for (Agency agency : gtfsData.getAllAgencies()) {
            // Agency name as listed in the GTFS data
            String agencyName = agency.getName();
        }

It's probably enough to just get the agency name for the first agency and write it as the file name for that data.

barbeau commented 2 years ago

Comment by minhhpham Thursday Nov 08, 2018 at 01:16 GMT


Here are 18 agencies with duplicated vehicle_id (i.e., more than one trip_update per vehicle simultaneously):

| Agency | Total vehicle_id | Duplicated vehicle_id | Percentage |
| --- | --- | --- | --- |
| Arlington Transit | 75 | 34 | 45% |
| Metro Transit-City of Madison | 150 | 54 | 36% |
| Orange County Transportation Authority | 391 | 2 | 1% |
| Thunder Bay Transit | 84 | 66 | 79% |
| Regional Transportation Authority of Middle Tennessee | 151 | 53 | 35% |
| Big Blue Bus | 303 | 148 | 49% |
| Saskatoon Transit | 128 | 57 | 45% |
| Shore Line East | 266 | 119 | 45% |
| Rhode Island Public Transit Authority | 145 | 59 | 41% |
| Metro St. Louis | 239 | 4 | 2% |
| Capital Metro | 514 | 231 | 45% |
| Edmonton Transit Service | 903 | 346 | 38% |
| York Region Transit | 449 | 179 | 40% |
| Halifax Transit | 171 | 72 | 42% |
| Kingston Transit | 132 | 98 | 74% |
| Votran | 94 | 66 | 70% |
| Monterey-Salinas Transit | 130 | 63 | 48% |
| MTA New York City Transit | 30 | 12 | 40% |

barbeau commented 2 years ago

Comment by barbeau Thursday Nov 08, 2018 at 01:37 GMT


Thanks! And what's the total number of agencies analyzed? The remainder are the ones this issue applies to.

barbeau commented 2 years ago

Comment by minhhpham Thursday Nov 08, 2018 at 07:12 GMT


Total is 46 agencies, so 28 don't have more than one simultaneous trip_update per vehicle.

barbeau commented 2 years ago

Comment by barbeau Thursday Nov 08, 2018 at 16:55 GMT


@minhhpham Ok, thanks, this is great. However, I think there is one last corner case that we need to address. For the remaining 28 that don't have more than one simultaneous trip_update per vehicle (based on the above analysis), the result could be due to the vehicle_id field being omitted from the TripUpdate feed, because vehicle_id is optional in TripUpdates (I assume this wasn't accounted for - let me know if you took it into account).

Could you please do one more pass to see how many agencies are missing vehicle_id in trip_updates?

EDIT - I changed the "aren't" to "are" in the above question.

barbeau commented 2 years ago

Comment by minhhpham Thursday Nov 08, 2018 at 17:54 GMT


The validator that I wrote actually provides this information because it outputs the total number of vehicle_ids for each agency, so I only needed to count how many agencies have 0 vehicle_ids. There are 8 of them.

barbeau commented 2 years ago

Comment by barbeau Thursday Nov 08, 2018 at 20:35 GMT


Ok, thanks! So that would mean that 20 out of the 46 agencies (43%) are definitely not providing more than one simultaneous trip_update per vehicle (based on the time of day analyzed). The 8 without vehicle_ids may or may not be providing this - it's impossible to tell without cross-referencing the GTFS and doing a more in-depth analysis.

barbeau commented 2 years ago

Comment by minhhpham Thursday Nov 08, 2018 at 22:03 GMT


That's correct.