Open thisisaaronland opened 12 months ago
Hi @thisisaaronland - thanks for reaching out about this. Yes, I think it makes sense to expose packages with functions for generating GeoParquet data.
If you have ideas about the ideal API that you'd like to use, maybe you can drop them here and we can discuss. I'm curious in particular about whether you would want to provide a Parquet (or Arrow) schema up front or if you would like this to be derived from the data.
Hi @tschaub
For starters I am not super knowledgeable about Parquet or Arrow but I have been watching the conversations around geoparquet and so this was an exercise to start getting more familiar and to prove that WOF data could be bundled in a new format. (One of the unofficial mottoes of the WOF project is: We don't need to have an opinion about your database :-)
The first thing I'd like to be able to do is write a go-writer-geoparquet package that implements the whosonfirst/go-writer.Writer interface:
https://pkg.go.dev/github.com/whosonfirst/go-writer#Writer
A concrete example of that would be the go-writer-geojson package:
https://github.com/whosonfirst/go-writer-featurecollection/blob/main/featurecollection.go
That would allow me to continue to use a common set of interfaces for writing WOF documents to a variety of targets and encapsulate all the Parquet/Arrow-specific details in the constructor and the URI used to create it.
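To make that a bit more concrete, here is a rough sketch of the shape I have in mind. Everything in it is an assumption for the sake of discussion: the Writer interface is a simplified local stand-in rather than the real go-writer one, the geoparquet:// URI scheme and its compression parameter are made up, and RecordSink is a hypothetical placeholder for whatever gpq might eventually expose for encoding rows.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/url"
	"strings"
)

// Writer is a simplified local stand-in for the go-writer interface.
// ASSUMPTION: this method set is illustrative, not copied from that package.
type Writer interface {
	Write(ctx context.Context, key string, r io.Reader) (int64, error)
	Close(ctx context.Context) error
}

// RecordSink is a HYPOTHETICAL hook for encoding one feature as a
// GeoParquet row. gpq does not expose anything like this publicly today.
type RecordSink func(feature []byte) error

// GeoParquetWriter keeps all format-specific settings behind the Writer interface.
type GeoParquetWriter struct {
	path        string
	compression string
	sink        RecordSink
}

// Compile-time check that GeoParquetWriter satisfies the local Writer interface.
var _ Writer = (*GeoParquetWriter)(nil)

// NewGeoParquetWriter reads its configuration from a URI such as
// "geoparquet:///tmp/wof.parquet?compression=zstd" (a made-up scheme).
func NewGeoParquetWriter(ctx context.Context, uri string, sink RecordSink) (*GeoParquetWriter, error) {
	u, err := url.Parse(uri)
	if err != nil {
		return nil, fmt.Errorf("failed to parse URI: %w", err)
	}
	if u.Scheme != "geoparquet" {
		return nil, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	compression := u.Query().Get("compression")
	if compression == "" {
		compression = "zstd" // arbitrary default for this sketch
	}
	return &GeoParquetWriter{path: u.Path, compression: compression, sink: sink}, nil
}

// Write reads one document in full and hands it to the sink.
func (w *GeoParquetWriter) Write(ctx context.Context, key string, r io.Reader) (int64, error) {
	body, err := io.ReadAll(r)
	if err != nil {
		return 0, fmt.Errorf("failed to read %s: %w", key, err)
	}
	if err := w.sink(body); err != nil {
		return 0, fmt.Errorf("failed to encode %s: %w", key, err)
	}
	return int64(len(body)), nil
}

// Close is where a real implementation would flush and finalize the file.
func (w *GeoParquetWriter) Close(ctx context.Context) error { return nil }

// runExample wires the writer to an in-memory sink and writes two records.
// The keys are placeholders, not real WOF IDs.
func runExample() (int, string, error) {
	ctx := context.Background()
	var rows [][]byte
	sink := func(feature []byte) error {
		rows = append(rows, feature)
		return nil
	}
	w, err := NewGeoParquetWriter(ctx, "geoparquet:///tmp/wof.parquet?compression=snappy", sink)
	if err != nil {
		return 0, "", err
	}
	for _, key := range []string{"1.geojson", "2.geojson"} {
		if _, err := w.Write(ctx, key, strings.NewReader(`{"type":"Feature"}`)); err != nil {
			return 0, "", err
		}
	}
	if err := w.Close(ctx); err != nil {
		return 0, "", err
	}
	return len(rows), w.compression, nil
}
```

The point of the sketch is only that calling code sees Write and Close, while the schema, compression and output path are all decided once in the constructor.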
Based on the short amount of time I've spent spelunking through the gpq code it seems like just making the internal/geo* packages public might be enough.
Hi,
Just checking about this. Has there been any (more) thought about exposing the code in internal as public library code?
I haven't made any time for this yet unfortunately.
Hi,
I am interested in using gpq to generate GeoParquet files for Who's On First (WOF) data. Ideally I would like to do that by reading and writing data on a per-record basis rather than starting with a single GeoJSON file.
Poking through the code it appears I can stream data to gpq via STDIN, which would allow me to use a similar approach to how we derive PMTiles from WOF data.
That would solve my immediate problem but the functionality, specifically the convert functionality, wrapped by the gpq command would be generally useful to have as library code (outside of internal).
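For the STDIN approach, this is roughly what I mean by per-record streaming: a minimal sketch using only the Go standard library that wraps individual GeoJSON features in a single FeatureCollection as they arrive, so the whole dataset never has to be held in memory. The type and function names are mine, purely for illustration, and the gpq invocation on the receiving end is left out on purpose since I haven't confirmed the exact flags.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// FeatureCollectionStream writes features one at a time to w, wrapped in a
// single GeoJSON FeatureCollection. (Illustrative name, not part of gpq.)
type FeatureCollectionStream struct {
	w     io.Writer
	count int
}

// NewFeatureCollectionStream writes the opening of the collection.
func NewFeatureCollectionStream(w io.Writer) (*FeatureCollectionStream, error) {
	if _, err := io.WriteString(w, `{"type":"FeatureCollection","features":[`); err != nil {
		return nil, err
	}
	return &FeatureCollectionStream{w: w}, nil
}

// WriteFeature appends one feature, compacting it and rejecting invalid JSON.
func (s *FeatureCollectionStream) WriteFeature(feature []byte) error {
	var buf bytes.Buffer
	if err := json.Compact(&buf, feature); err != nil {
		return fmt.Errorf("record %d is not valid JSON: %w", s.count, err)
	}
	if s.count > 0 {
		if _, err := io.WriteString(s.w, ","); err != nil {
			return err
		}
	}
	if _, err := s.w.Write(buf.Bytes()); err != nil {
		return err
	}
	s.count++
	return nil
}

// Close writes the closing brackets of the collection.
func (s *FeatureCollectionStream) Close() error {
	_, err := io.WriteString(s.w, "]}")
	return err
}

// streamExample writes two placeholder features to an in-memory buffer.
// In practice w would be os.Stdout or the stdin pipe of a child process.
func streamExample() (string, error) {
	var out bytes.Buffer
	s, err := NewFeatureCollectionStream(&out)
	if err != nil {
		return "", err
	}
	records := [][]byte{
		[]byte(`{"type":"Feature","id":1}`),
		[]byte(`{ "type": "Feature", "id": 2 }`),
	}
	for _, r := range records {
		if err := s.WriteFeature(r); err != nil {
			return "", err
		}
	}
	if err := s.Close(); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

This works as a stopgap, but it means serializing everything to GeoJSON text just to have it parsed again on the other side of the pipe, which is part of why library access would be nicer.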