dfsnow / travel-time-matrices

Resources, scripts, and how-tos for calculating national travel time matrices

Collect metadata and validate GTFS feeds #1

Closed dfsnow closed 1 year ago

dfsnow commented 3 years ago

We should start collecting metadata on the GTFS feeds used in this project (and on any corrective actions applied to them) to ensure reproducibility. Metadata can be stored in a simple flat file in inputs/shared/feeds and could contain fields such as the date the feed was last updated (on TransitFeeds), feed location, feed name, a list of potential errors/issues, and a list of corrective actions taken to fix those issues.
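As a rough sketch of what that flat file could look like, here is one CSV row per feed, written with Python's standard library. The column names and the example row are placeholders made up for illustration, not an agreed schema or a real feed.

```python
import csv
import io

# Hypothetical column names, based on the fields suggested above.
FIELDS = [
    "feed_name",
    "feed_location",
    "date_last_updated",
    "issues",
    "corrective_actions",
]


def write_feed_metadata(rows, handle):
    """Write one CSV row per feed; list-valued fields are joined with '; '."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        for key in ("issues", "corrective_actions"):
            flat[key] = "; ".join(row.get(key, []))
        writer.writerow(flat)


# Placeholder values only; none of this describes a real feed.
example = [{
    "feed_name": "example-agency",
    "feed_location": "https://example.com/gtfs.zip",
    "date_last_updated": "2021-01-01",
    "issues": ["missing shapes.txt"],
    "corrective_actions": ["none yet"],
}]

buffer = io.StringIO()
write_feed_metadata(example, buffer)
csv_text = buffer.getvalue()
```

In practice the handle would be a file opened under inputs/shared/feeds rather than an in-memory buffer. A plain CSV keeps the metadata easy to diff and review alongside the code.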

Potential feed issues include:

We may even want to compile a collection of validated feeds + metadata and tarball it in this repo. This would ensure reproducibility but would be a lot of work.

For feed validation, I've used this tool in the past. Google also has a tool here. Rafa also mentions they're working on a gtfstools R package specifically for this purpose.

eric-mc2 commented 3 years ago

I'm not sure if these are the same errors that bubble up to r5r, but a lot of the errors seem to be physical and file-structure related rather than logical and attribute-value related. For instance, most files fail on tidytransit::read_gtfs(path) or tidytransit::get_route_geometry(feed_sf) before we're even able to try sf::st_is_valid(route_geom).
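
To illustrate the kind of file-structure check that could run before any attribute-level validation, here is a minimal Python sketch. It is not what tidytransit or r5r do internally. The required-file set is an assumption for a conventional fixed-route feed, and `structural_problems` is a hypothetical helper name.

```python
import io
import zipfile

# Assumed core files for a fixed-route GTFS feed; adjust to the spec version in use.
REQUIRED = {"agency.txt", "stops.txt", "routes.txt", "trips.txt", "stop_times.txt"}


def structural_problems(source):
    """Return file-level problems for a GTFS zip given as a path or file-like object."""
    try:
        with zipfile.ZipFile(source) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return ["not a readable zip archive"]

    problems = [f"missing {name}" for name in sorted(REQUIRED - names)]
    # A feed needs at least one of the two calendar files.
    if not names & {"calendar.txt", "calendar_dates.txt"}:
        problems.append("missing calendar.txt and calendar_dates.txt")
    return problems
```

Note that a feed whose files sit inside a subfolder of the zip would show up here as missing everything, since only top-level names match. The returned strings could go straight into the issues column of the feed metadata file.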