Closed jiayuasu closed 1 year ago
If the geometry column name is different:
var df = sparkSession.read.format("geoparquet").option("fieldGeometry", "new_geometry").load(geoparquetdatalocation1)
I don't know anything about Sedona, but just wondering: are you able to read that from the Parquet file metadata, instead of having the user specify it manually?
@kylebarron We read the metadata, but a binary-type column in a Parquet file could be something other than WKB. So we leave this to the user: unless the column is explicitly named geometry, the user needs to tell us which column is the WKB column.
Thanks @jiayuasu! Will merge it in.
Following up on Kyle's question: the GeoParquet spec provides the names of all the geometry columns in the file metadata, as JSON. So it seems like you could just look at that instead of having the user specify it? You can see an example of the metadata at https://github.com/opengeospatial/geoparquet/blob/main/examples/example_metadata.json
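To make the suggestion concrete, here is a minimal sketch of how a reader could discover the geometry columns from the file footer instead of asking the user. It is not how Sedona does it; it only assumes that the GeoParquet JSON is stored under the `geo` key of the Parquet key-value metadata and has `primary_column` and `columns` fields, as in the linked example. The `GeoMetadata` object and its method names are hypothetical, and the sketch relies on parquet-hadoop and json4s being on the classpath (both ship with Spark).

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

// Hypothetical helper, for illustration only.
object GeoMetadata {

  // Read the raw JSON stored under the "geo" key of the Parquet footer, if any.
  def readGeoJson(file: String): Option[String] = {
    val input = HadoopInputFile.fromPath(new Path(file), new Configuration())
    val reader = ParquetFileReader.open(input)
    try Option(reader.getFooter.getFileMetaData.getKeyValueMetaData.get("geo"))
    finally reader.close()
  }

  // Names of all geometry columns listed under "columns".
  def geometryColumns(geoJson: String): Seq[String] = {
    val columns = parse(geoJson) \ "columns"
    columns match {
      case JObject(fields) => fields.map { case (name, _) => name }
      case _               => Seq.empty
    }
  }

  // Name of the primary geometry column, if declared.
  def primaryColumn(geoJson: String): Option[String] = {
    val primary = parse(geoJson) \ "primary_column"
    primary match {
      case JString(name) => Some(name)
      case _             => None
    }
  }
}
```

With something like this, a reader could fall back to a user-supplied option only when `readGeoJson` returns `None`, for example for plain Parquet files written without GeoParquet metadata.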
@cholmes Got it! You made a good point. I believe Sedona should improve the reader/writer to leverage the metadata. We will make it happen. Thanks!
Dear maintainers,
Thank you all for the great work on GeoParquet!
This PR adds Apache Sedona to the "known libraries that can read and write GeoParquet files" list.
Apache Sedona 1.3.0 implements the basic GeoParquet read/write function:
Please feel free to let me know if there is anything I need to fix :-)