dfsnow closed 1 year ago
I'm not sure if these are the same errors that bubble up to r5r, but a lot of the errors seem to be physical (file-structure related) rather than logical (attribute-value related). For instance, most files fail on tidytransit::read_gtfs(path) or tidytransit::get_route_geometry(feed_sf) before we're even able to try sf_is_valid(route_geom).
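To record which step a feed dies on, something like the following might help. It is only a sketch: `first_failing_stage()` and `gtfs_stages` are hypothetical names, the `gtfs_as_sf()` step is my guess at how `feed_sf` gets built, and I'm assuming the validity check is `sf::st_is_valid()`.

```r
# Run named stage functions in order, feeding each result into the next.
# Returns the name of the first stage that errors (NA if every stage
# succeeds) together with the error message.
first_failing_stage <- function(input, stages) {
  value <- input
  for (stage_name in names(stages)) {
    result <- tryCatch(
      list(ok = TRUE, value = stages[[stage_name]](value)),
      error = function(e) list(ok = FALSE, message = conditionMessage(e))
    )
    if (!result$ok) {
      return(list(failed_at = stage_name, message = result$message))
    }
    value <- result$value
  }
  list(failed_at = NA_character_, message = NA_character_)
}

# Hypothetical stage list mirroring the calls mentioned above.
# Requires tidytransit and sf only when the stages are actually run.
gtfs_stages <- list(
  read_gtfs      = function(path) tidytransit::read_gtfs(path),
  as_sf          = function(feed) tidytransit::gtfs_as_sf(feed),
  route_geometry = function(feed_sf) tidytransit::get_route_geometry(feed_sf),
  geometry_valid = function(route_geom) {
    stopifnot(all(sf::st_is_valid(route_geom)))
  }
)

# Usage (path is a placeholder):
# first_failing_stage("path/to/feed.zip", gtfs_stages)
```

The `failed_at` value could go straight into the list of issues recorded for each feed.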
We should start to collect metadata on GTFS feeds (and corrective actions) used in this project in order to ensure reproducibility. Metadata can be stored in a simple flat file in inputs/shared/feeds and could contain fields such as the date the feed was last updated (on TransitFeeds), feed location, feed name, a list of potential errors/issues, and a list of corrective actions taken to fix said issues.

Potential feed issues include:
We may even want to compile a collection of validated feeds + metadata and tarball it in this repo. This would ensure reproducibility but would be a lot of work.
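As a rough idea of what the flat metadata file could look like, here is a base R sketch that appends one row per feed to a CSV. The function names, the column names, and the feed_metadata.csv file name are all hypothetical, and the values in the usage comment are placeholders rather than a real feed.

```r
# Build a one-row record for a feed. Multiple issues or corrective
# actions are collapsed into a single "; "-separated string so the
# file stays flat.
new_feed_record <- function(feed_name, feed_location, last_updated,
                            issues = character(),
                            corrective_actions = character()) {
  data.frame(
    feed_name = feed_name,
    feed_location = feed_location,
    last_updated = as.character(as.Date(last_updated)),
    issues = paste(issues, collapse = "; "),
    corrective_actions = paste(corrective_actions, collapse = "; "),
    stringsAsFactors = FALSE
  )
}

# Append a record to the metadata CSV, creating the file (and its
# directory) if it does not exist yet. Returns the full table invisibly.
append_feed_record <- function(record, metadata_path) {
  if (file.exists(metadata_path)) {
    existing <- read.csv(metadata_path, stringsAsFactors = FALSE,
                         colClasses = "character")
    record <- rbind(existing, record)
  }
  dir.create(dirname(metadata_path), recursive = TRUE, showWarnings = FALSE)
  write.csv(record, metadata_path, row.names = FALSE)
  invisible(record)
}

# Usage (placeholder values):
# rec <- new_feed_record("example-feed", "https://example.com/gtfs.zip",
#                        "2020-01-01", issues = "fails on read_gtfs",
#                        corrective_actions = "re-zipped without subfolder")
# append_feed_record(rec, "inputs/shared/feeds/feed_metadata.csv")
```

A plain CSV keeps the file diffable in the repo; if the issue lists get long, a separate table keyed on feed name might be easier to maintain.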
For feed validation, I've used this tool in the past. Google also has a tool here. Rafa also mentions they're working on a gtfstools R package specifically for this purpose.