Closed by drewda 7 years ago
I've forked Stephen-Gates' GTFS JSON Table Schema and started to generalize it for all GTFS consumers: https://github.com/CUTR-at-USF/GTFS/blob/full-spec/datapackage.json
His schema was specific to South East Queensland and marked a number of fields as required to ensure their data was populated, but these should be optional to match the current spec. It was also missing a number of fields, and a few tables still need to be filled in.
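As a rough illustration of the "required should become optional" step, here is a minimal sketch that strips the `required` constraint from every field except an explicit allow-list. The function name `relax_required`, the `keep_required` set, and the tiny example descriptor are all hypothetical; only the general `resources` / `schema` / `fields` / `constraints` layout of a Data Package descriptor is assumed.

```python
import copy


def relax_required(datapackage, keep_required):
    """Return a copy of a Data Package descriptor in which only the
    (resource name, field name) pairs listed in keep_required stay required."""
    relaxed = copy.deepcopy(datapackage)  # leave the caller's descriptor untouched
    for resource in relaxed.get("resources", []):
        for field in resource.get("schema", {}).get("fields", []):
            constraints = field.get("constraints", {})
            if (resource.get("name"), field.get("name")) not in keep_required:
                constraints.pop("required", None)
            if not constraints:
                # Drop an empty constraints object so the descriptor stays tidy.
                field.pop("constraints", None)
    return relaxed


# Hypothetical two-field descriptor, for illustration only.
example = {
    "resources": [{
        "name": "stops",
        "schema": {"fields": [
            {"name": "stop_id", "type": "string",
             "constraints": {"required": True}},
            {"name": "stop_desc", "type": "string",
             "constraints": {"required": True}},
        ]},
    }]
}

relaxed = relax_required(example, keep_required={("stops", "stop_id")})
```

Which fields belong in `keep_required` would have to come from the GTFS spec itself; the sketch deliberately leaves that decision to the caller.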
TODO for datapackage.json:
Open issue at https://github.com/CUTR-at-USF/GTFS/issues/1 - I'm happy to accept PRs there as well if others want to pitch in. I'll keep chipping away at it as I have time.
One open issue is how strictly we enforce constraints. For fields like stop.location_type, values 0, 1, and 2 are defined in the spec. We can use the schema to enforce only these values and fail on any other value, but that breaks extensibility (e.g., a consumer and producer agreeing on a value 4 outside the spec). Extensibility has always been allowed/encouraged, so we need to decide whether we constrain the schema to the officially defined spec or allow outside values.
I've added comments where I've noticed this so far.
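To make the trade-off concrete, here is a minimal sketch of the two policies for location_type. The function name, the `strict` flag, and the "ok" / "warning" / "error" labels are hypothetical, and the set of allowed values is simply the 0, 1, 2 mentioned above; treating an empty value as acceptable is an assumption of the sketch.

```python
# Values the discussion above says are defined in the spec.
SPEC_LOCATION_TYPES = {"0", "1", "2"}


def check_location_type(value, strict):
    """Classify a raw location_type cell read from stops.txt.

    strict=True  -> anything outside the spec-defined values is an error.
    strict=False -> other non-negative integers are only flagged as warnings,
                    so producer/consumer extensions still pass.
    """
    if value == "" or value in SPEC_LOCATION_TYPES:
        return "ok"
    if not strict and value.isdigit():
        return "warning"
    return "error"


strict_result = check_location_type("4", strict=True)
lenient_result = check_location_type("4", strict=False)
```

In Table Schema terms, the strict policy roughly corresponds to putting an `enum` constraint on the field, while the lenient one would need the extension check to live outside the schema (for example as a warning in the validator).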
re: Conveyal gtfs-validator - we've gotten feedback from @sheldonabrown that he'd be willing to help get some of the changes by @laidig into the upstream Conveyal repo. If possible I'd like to do that to avoid forking the project. If not, I'd like to find a home for @laidig's changes in an organizational account where more than one person can be assigned to review/merge changes (we'd be willing to offer ours).
Learned from @mattwigway and @landonreed that instead of conveyal/gtfs-validator they now use conveyal/gtfs-lib to catch errors in GTFS feeds. When they're able to add a wrapper that can be called from the command line to produce validation output as JSON, then we'll set that up to run as part of Transitland's feed-fetch process.
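For what it's worth, the feed-fetch side of that could look something like the sketch below: call a command-line validator on the fetched archive and parse whatever JSON it prints. Everything here is an assumption for illustration. `run_validator` is a hypothetical helper, and the stand-in command is a one-line Python script, not the gtfs-lib CLI, whose actual arguments and output format are not shown in this thread.

```python
import json
import subprocess
import sys


def run_validator(command, feed_path, timeout=600):
    """Run a validator command on a feed archive and parse its stdout as JSON.

    command is the argv list of the validator; feed_path is appended as the
    last argument. Returns a dict with "ok" plus either "report" or "error".
    """
    completed = subprocess.run(
        list(command) + [feed_path],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0:
        return {"ok": False, "error": completed.stderr.strip()}
    try:
        return {"ok": True, "report": json.loads(completed.stdout)}
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON output: {exc}"}


# Stand-in validator for illustration only: echoes the feed path as JSON.
fake_validator = [
    sys.executable, "-c",
    "import json, sys; print(json.dumps({'feed': sys.argv[1], 'errors': []}))",
]

result = run_validator(fake_validator, "feed.zip")
```

Keeping the validator behind a plain argv list means the same helper could wrap more than one validation tool, as long as each one can emit JSON on stdout.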
@drewda, the docs for conveyal/gtfs-lib have been updated with CLI usage instructions.
Both the Google Python FeedValidator and Conveyal gtfs-lib are now run automatically on production servers.
For example: https://transit.land/dispatcher/feed-versions/12ff6497fb2f6c5a9f568ec80c4be6ef928b0957
One bug to resolve on production: #1069
When a new version of a feed is fetched from an agency server, we have the opportunity to run any number of validation libraries on the archive:
[ ] gtfs-validator (no longer maintained): Java-based, originally by Conveyal, outputs a JSON payload that can be viewed in a front-end web client; latest version at https://github.com/conveyal/gtfs-validator/blob/master/gtfs-validator-json/README.md
[x] conveyal/gtfs-lib: Java-based, by Conveyal, is the library they currently use and maintain for validating GTFS: https://github.com/conveyal/gtfs-lib
[ ] JSON Table Schema + Good Tables: https://github.com/Stephen-Gates/GTFS (holding for now, since this isn't ready for general use)
//cc @barbeau @antrim