tulsawebdevs / django-multi-gtfs

Django app to import and export General Transit Feed Specification (GTFS)
http://tulsawebdevs.org/
Apache License 2.0
50 stars 32 forks

Amend importgtfs.py to ignore extra columns in feed files #10

Closed araichev closed 10 years ago

araichev commented 10 years ago

Some feeds, e.g. Adelaide, Australia (http://www.gtfs-data-exchange.com/agency/adelaide-metro/), have extra columns in their feed files, e.g. 'wheelchair_accessible' in trips.txt. Currently importgtfs.py fails on such feeds. It would be more useful, I reckon, for importgtfs.py to simply ignore these extra columns and import the rest of the columns as usual.
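A minimal sketch of the "ignore extra columns" behaviour requested above, using only the standard library. This is not the actual importgtfs.py code: the `KNOWN_TRIP_COLUMNS` tuple and the `read_known_columns` helper are hypothetical names chosen for illustration.

```python
import csv
import io

# Hypothetical subset of the trips.txt columns an importer knows about.
KNOWN_TRIP_COLUMNS = ("route_id", "service_id", "trip_id", "trip_headsign")


def read_known_columns(stream, known_columns):
    """Yield one dict per CSV row, keeping only the known columns.

    Columns present in the file but absent from known_columns are
    dropped instead of raising an error.
    """
    reader = csv.DictReader(stream)
    for row in reader:
        yield {name: row[name] for name in known_columns if name in row}


# A feed file with one column the importer does not know about.
sample = io.StringIO(
    "route_id,service_id,trip_id,wheelchair_accessible\n"
    "R1,WEEKDAY,T1,1\n"
)
rows = list(read_known_columns(sample, KNOWN_TRIP_COLUMNS))
```

Here `rows` contains a single dict with `route_id`, `service_id` and `trip_id`; the unknown `wheelchair_accessible` column is silently skipped.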

jwhitlock commented 10 years ago

I've been following OpenTripPlanner development, and extra columns are a common mechanism for extending GTFS (I'd prefer different files, but no one asked me). So, I'd like multigtfs to be able to round trip these extra columns as well. I'm thinking we could store extra data in a JSON-encoded text column, or in an extra key-value table.

But, I agree, a quick first step would be to ignore them.
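One way the JSON-encoded text column idea could look, as a rough sketch rather than multigtfs code: each row is split into the columns the model knows and a JSON string holding everything else. The `split_row` helper is a hypothetical name.

```python
import json


def split_row(row, known_columns):
    """Split a parsed feed row into known fields and JSON-encoded extras.

    Returns a (known, extra_json) pair, where extra_json is a JSON object
    string suitable for storing in a single text column.
    """
    known = {k: v for k, v in row.items() if k in known_columns}
    extra = {k: v for k, v in row.items() if k not in known_columns}
    return known, json.dumps(extra, sort_keys=True)


known, extra_json = split_row(
    {"route_id": "R1", "RouteGroup": "A"},
    known_columns={"route_id", "route_short_name"},
)
```

The key-value table alternative would store one row per (object, column name, value) instead; the JSON column keeps the extras next to the record they belong to.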

slai commented 10 years ago

For what it's worth, 'wheelchair_accessible' was added to trips.txt in the GTFS spec (https://developers.google.com/transit/gtfs/reference#trips_fields) in 2012 (https://groups.google.com/forum/#!msg/gtfs-changes/xyPh5stQ8o4/ATa1nQZLcb8J).

jwhitlock commented 10 years ago

Sadly, they don't bump the date at the top of the file when they add columns. I've been looking at that and assuming I'm up to date. I'm not sure what the Feb 17th additions are. Anyway, opened issue #23.

jwhitlock commented 10 years ago

They document these new optional fields on a different page: https://developers.google.com/transit/gtfs/changes#RevisionHistory .

The 'wheelchair_accessible' field was added in v0.3.3. However, Adelaide's latest feed still fails to import: the next crash is a RouteGroup column in the route. So, leaving the ticket open until the 'ignore' feature ships.

Permafacture commented 10 years ago

A JSON-encoded field makes sense to me. Keep it in the database and let any users use the extra fields they import. If a field ever becomes officially recognized, users could migrate the data rather than dumping/reimporting.

jwhitlock commented 10 years ago

Extra data is now stored as strings in a key-value dictionary in the new extra_data JSON field. The feed has a new meta JSON field, which records which models contain additional columns. Extra columns and data appear in the exported GTFS feed. Scheduled for the 0.4.0 release.
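To illustrate the round trip described above, here is a hedged sketch of the export side using plain Python rather than the real multigtfs models. The `export_rows` helper, and the idea of passing the extra column names as a list (standing in for whatever the feed's meta field actually holds), are assumptions for illustration only.

```python
import csv
import io
import json


def export_rows(records, known_columns, extra_columns):
    """Write records as CSV text, appending extra columns after known ones.

    records is an iterable of (fields, extra_json) pairs: fields is a dict
    of known column values, extra_json is the stored JSON string (or an
    empty string when a record has no extra data).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(list(known_columns) + list(extra_columns))
    for fields, extra_json in records:
        extra = json.loads(extra_json) if extra_json else {}
        writer.writerow(
            [fields.get(c, "") for c in known_columns]
            + [extra.get(c, "") for c in extra_columns]
        )
    return out.getvalue()


text = export_rows(
    [({"route_id": "R1"}, '{"RouteGroup": "A"}'), ({"route_id": "R2"}, "")],
    known_columns=["route_id"],
    extra_columns=["RouteGroup"],
)
```

Records without a value for an extra column get an empty cell, so every exported row has the same number of columns as the header.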