Open polettif opened 2 months ago
Hi @polettif, sorry for the long time to answer.
I don't know much about the geojson format, but I'll have a look at how it handles spatial data.
I also don't have strong opinions on how to read it. Just like the official spec deviated from using WKT to represent spatial data (relevant discussion), we could also use {sf} to read locations.geojson, while still reading shapes and stops as plain tables to prevent backward incompatibility. Thoughts?
We should probably also start handling the other recently added files correctly. That would require the files to be added to get_gtfs_standards()
, which shouldn't be too much of a problem either.
I'll probably have some time to look at it next week.
@dhersz My thoughts are mainly that this will be difficult without introducing very heavy dependencies into this otherwise nicely lightweight package. My general suspicion would be that most users won't need or want locations.geojson
, and of those that do, most will likely be able to handle the "raw" geojson data themselves anyway. So introducing a heavy dep just for a sub-case of a sub-case seems unnecessary.
There are three main ways that I currently use to handle geojson, in increasing complexity:
jsonlite::read_json()
(<250kB of libs), to just store as a simple list which can easily be converted later to some "spatial" format if desired, or spatial data easily extracted via lapply
calls.geojsonio::geojson_read()
, but geojsonio
(~800Kb of libs) requires V8
, which is a huge dependency (>1.5MB of libs) so no-go in my opinion.sf::st_read()
, again no-go in my opinion because then the majority of the dependency footprint of this package will be eaten up by sf
(~6MB not including system deps!) for a very minor use case.For context, this libs of this package are around 200kB, so currently excellently small (with data.table
obviously the largest dep, but unavoidable here.) My personal recommendation would be jsonlite
, with sufficient explanation and docs to explain how to convert to other formats if desired.
Great comment, thanks @mpadge!
So I suggest using {jsonlite}
to handle the geojson and then adding the operation of converting the list to sf to the functions in {gtfstools}
and {tidytransit}
that convert plain GTFS objects to dt_gtfs
and tidygtfs
objects, respectively. Since both packages already import {sf}
, this should not be a problem, and we preserve the light dependencies of {gtfsio}
.
There have been some updates to the GTFS reference in March, one of which is the new locations.geojson file.
This clashes with the previous assumption that all files in a feed, even supplemental or custom files, should be comma delimited
.txt
files. Thus, gtfsio ignores the files and issues a warning.Now, I don't know what the best approach to handling these files is, as gtfsio does not handle spatial data. I'm open to just reading it as a text and let the other packages handle the conversion to a spatial format. We just shouldn't ignore them as a base line.