CUTR-at-USF / gtfs-realtime-validator

Java-based tool that validates General Transit Feed Specification (GTFS)-realtime feeds. See https://github.com/MobilityData/gtfs-realtime-validator for the latest!

The batch file documentation should recommend validating only one RT file type at a time. #411

Open evansiroky opened 2 years ago

evansiroky commented 2 years ago

Summary:

While reading the code, I noticed that some validation rules look back at previous messages. This implies that all files validated together should come from the same stream of a single RT file type.

Steps to reproduce:

Validate multiple RT file types at the same time. Each file type may carry its own header timestamps, so rules that compare timestamps against previous messages could be triggered when they should not be, or not triggered when they should be.
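To illustrate the concern, here is a minimal sketch of a look-back rule that flags a header timestamp lower than the one seen immediately before it. It is a simplified stand-in, not the validator's actual implementation; the function name, file names, and timestamp values are all made up for illustration.

```python
from typing import Iterable, List, Tuple

# (file name, header timestamp) pairs; names and values are hypothetical.
Snapshot = Tuple[str, int]


def decreasing_timestamps(snapshots: Iterable[Snapshot]) -> List[str]:
    """Return the names of snapshots whose header timestamp is lower than
    the timestamp of the snapshot processed immediately before them."""
    flagged = []
    previous = None
    for name, timestamp in snapshots:
        if previous is not None and timestamp < previous:
            flagged.append(name)
        previous = timestamp
    return flagged


# Two independent streams, each with strictly increasing timestamps.
trip_updates = [("tu_1.pb", 1000), ("tu_2.pb", 1030), ("tu_3.pb", 1060)]
vehicle_positions = [("vp_1.pb", 990), ("vp_2.pb", 1020), ("vp_3.pb", 1050)]

# The same files interleaved, as could happen if both streams were
# validated together in one run.
mixed = [snap for pair in zip(trip_updates, vehicle_positions) for snap in pair]

print(decreasing_timestamps(trip_updates))       # nothing flagged
print(decreasing_timestamps(vehicle_positions))  # nothing flagged
print(decreasing_timestamps(mixed))              # every vp_* file is flagged
```

Each stream passes on its own, but the interleaved sequence produces a warning for every vehicle position file even though neither feed went backwards in time.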

Expected behavior:

The batch file documentation should recommend validating only one RT file type at a time.

Observed behavior:

The batch file documentation does not recommend validating only one RT file type at a time.

Platform:

https://github.com/cal-itp/gtfs-rt-validator-api/

barbeau commented 2 years ago

@evansiroky Thanks for pointing this out. I agree that the time dependency of some rules could be called out more explicitly.

IIRC the basic assumption we made was that all files in a directory being validated would come from the same feed stream. So if you're mixing feeds (e.g., .pb files from different sources) in the same directory, it could cause issues.
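One way to respect that assumption is to split archived files into one group per feed stream before validating. The sketch below assumes a hypothetical naming convention in which the text before the first underscore identifies the stream (for example `tu_...pb` and `vp_...pb`); the function name and the convention are illustrative and not part of the validator.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def group_by_feed(file_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group file names by the text before the first underscore.

    The prefix convention is an assumption made for this example only.
    """
    groups = defaultdict(list)
    for file_name in file_names:
        feed = Path(file_name).name.split("_", 1)[0]
        groups[feed].append(file_name)
    # Sort each group by name so the processing order is predictable.
    return {feed: sorted(names) for feed, names in groups.items()}


files = ["vp_0002.pb", "tu_0001.pb", "vp_0001.pb", "tu_0002.pb"]
for feed, names in sorted(group_by_feed(files).items()):
    print(feed, names)
```

Each group could then be copied into its own directory and validated in a separate run, so that look-back rules only ever compare messages from the same stream.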

Also, note that there is a -sort parameter that controls whether the file name or the file date is used as the "current" time for these rules: https://github.com/CUTR-at-USF/gtfs-realtime-validator/tree/master/gtfs-realtime-validator-lib#command-line-config-parameters.

Also FYI, the validator should support mixed feeds, where you have multiple entity types in the same PB file (e.g., VehiclePosition and TripUpdate).