UrbanAnalyst / gtfsrouter

Routing and analysis engine for GTFS (General Transit Feed Specification) data
https://urbananalyst.github.io/gtfsrouter/
84 stars 17 forks source link

fix for integer stop_ids #33

Closed AlexandraKapp closed 4 years ago

AlexandraKapp commented 4 years ago

when calling gtfs_isochrone on feeds where stop IDs are integers (e.g. VRS Köln region) an error is thrown:

grafik

Not sure, if this is the right place to fix it, but it works.

mpadge commented 4 years ago

Thanks again for a great catch. I'll merge as soon as I get a chance

mpadge commented 4 years ago

I suspect actually that a better place to catch that would be in the initial extraction. The stop_id values are supposed to be a standard GTFS ID format, which in an arbitrary string of UTF8 characters. The basic read commands are, however, all done via data.table::fread: https://github.com/ATFutures/gtfs-router/blob/44f8e43df3dd85c7488e66e33e789967e021dcf6/R/extract-gtfs.R#L57-L59 These default to character for int64 values, but leave int values untouched (which is necessary). A few lines later the stop_id values are processed: https://github.com/ATFutures/gtfs-router/blob/44f8e43df3dd85c7488e66e33e789967e021dcf6/R/extract-gtfs.R#L87 My best guess would be to check for storage.mode of stop_ids there, and if !character, then coerce with as.character. That will ensure that stop_ids have the appropriate type in all tables. I can't check that out at the moment, but will leave it to you to confirm and modify accordingly. Much indebted!

mpadge commented 4 years ago

Can you now please run devtools::document() in order to generate the help file man/koeln_gtfs_to_zip.Rd, and then add that file to the PR? The tests should then pass. Then there is one last big task to address: The data/koeln_gtfs.Rda file is > 3MB, whereas the Berlin one is < 100KB. These files have to be small because they will be bundled with every installation of this package, and also because R packages are not allowed to exceed 5MB size in total. The explanation and code for how i reduced the Berlin data are contained in data-raw/data-script.Rmd. It should be easy to adapt that for the Köln data, and hopefully also reduce to <100KB in size. Thanks!

AlexandraKapp commented 4 years ago

Can you now please run devtools::document() in order to generate the help file man/koeln_gtfs_to_zip.Rd, and then add that file to the PR? The tests should then pass. Then there is one last big task to address: The data/koeln_gtfs.Rda file is > 3MB, whereas the Berlin one is < 100KB. These files have to be small because they will be bundled with every installation of this package, and also because R packages are not allowed to exceed 5MB size in total. The explanation and code for how i reduced the Berlin data are contained in data-raw/data-script.Rmd. It should be easy to adapt that for the Köln data, and hopefully also reduce to <100KB in size. Thanks!

Is the adding of the cologne feed + test a bit overkill for that small bugfix? Otherwise I can also delete it again

(sorry, I keep getting weird errors when running the checks locally, so the local testing doesnt work properly. I should get that fixed sometime soon ;) )

mpadge commented 4 years ago

@AlexandraKapp can you please update the branch of this PR to current master? There'll be some conflicts, but should be pretty easy to fix and just retain your changes. And yes, please delete the Köln feed, if that's okay. Thanks!