google / transitfeed

A Python library for reading, validating, and writing transit schedule information in the GTFS format.
https://github.com/google/transitfeed/wiki
Apache License 2.0

False positive error when trip_id has a leading space #381

Closed dnhlms closed 10 years ago

dnhlms commented 10 years ago

Forget the validity of a leading space in an identifier for a moment. If a trip in trips.txt is named ' 0706' and that same trip exists in stop_times.txt as ' 0706' (leading space in both), the feedvalidator will report that there is a stop_times trip that doesn't exist in trips. That is incorrect. Turning this into a warning might be a good idea.

Example output when this happens:

Invalid value 0706 in field trip_id
This value wasn't defined in trips.txt
in line 132 of stop_times.txt

trip_id  arrival_time  departure_time  stop_id  stop_sequence  stop_headsign  pickup_type  drop_off_type  shape_dist_traveled
0706     7:10:30       7:10:30         56       4              None           None         None           None

bdferris commented 10 years ago

Agreed, this seems like a bug. The spec implies that you should remove such whitespace from field values ("Remove any extra spaces between fields or field names. Many parsers consider the spaces to be part of the value, which may cause errors."), but I know that in practice the Google GTFS parser and the OneBusAway GTFS parser both strip whitespace from values, even when the value is enclosed within double quotes. I think it's reasonable to have the transitfeed parser match that behavior.
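
To illustrate the behavior described above, here is a minimal sketch that strips whitespace from every value in both files before cross-checking trip_id references. It is not transitfeed's code: `read_rows`, `undefined_trip_ids`, and the sample data are hypothetical names made up for this example, and it only assumes that the feed files are ordinary CSV text.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into dicts, stripping whitespace from names and values.

    Hypothetical helper for illustration only; not part of transitfeed.
    """
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader, [])]
    # The csv module keeps leading spaces by default, quoted or not,
    # so each value is stripped explicitly here.
    return [dict(zip(header, (value.strip() for value in row)))
            for row in reader]


def undefined_trip_ids(trips_text, stop_times_text):
    """Return trip_ids used in stop_times.txt but not defined in trips.txt."""
    defined = {row["trip_id"] for row in read_rows(trips_text)}
    return [row["trip_id"] for row in read_rows(stop_times_text)
            if row["trip_id"] not in defined]


# Sample data (made up): the same trip with a leading space in both files,
# unquoted in trips.txt and double-quoted in stop_times.txt.
TRIPS = "route_id,service_id,trip_id\nR1,WK, 0706\n"
STOP_TIMES = (
    "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
    '" 0706",7:10:30,7:10:30,56,4\n'
)

missing = undefined_trip_ids(TRIPS, STOP_TIMES)
```

Because the same normalization is applied to both files, the two ' 0706' values compare equal and no undefined trip is reported, while a trip_id that really is missing from trips.txt would still show up in the result.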