Open dfsnow opened 3 weeks ago
I tested this out and the geometry that gets written to the Parquet file is no longer WKB, which breaks our SQL spatial joins and distance calculations. We can hack around it using

```r
mutate(across(starts_with("geometry"), ~ hex2raw(st_as_binary(.x, hex = TRUE))))
```

but for our purposes it seems like we'd lose more functionality than we'd gain by switching to the new version.
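In case `hex2raw()` isn't in scope, a base-R equivalent of that helper might look like this (a hypothetical sketch, not from any package; it just decodes a hex string into the raw vector the workaround needs):

```r
# Hypothetical helper: convert a WKB hex string (as returned by
# sf::st_as_binary(..., hex = TRUE)) into a raw vector.
hex2raw <- function(hex) {
  # Split the string into two-character byte pairs...
  bytes <- substring(hex, seq(1, nchar(hex), by = 2),
                          seq(2, nchar(hex), by = 2))
  # ...then parse each pair as a hex byte.
  as.raw(strtoi(bytes, base = 16L))
}

hex2raw("0101")  # raw vector: 01 01
```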
Edit - From the maintainer:
If you need a workaround, you can create the WKB-encoded table yourself from an sf object:

```r
library(sf)
library(geoarrow)

nc <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
df <- tibble::as_tibble(nc)
df$geom <- as_geoarrow_vctr(df$geom, geoarrow_wkb())
tbl <- arrow::as_arrow_table(df)
```

...and add metadata using `tbl$metadata$geo = "{...}"`
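To complete the round trip, a sketch of writing that table to Parquet and parsing the WKB back into sf geometries (assumptions: `tbl` is the Arrow Table built above with its geo metadata attached, and the binary column reads back as a list of raw vectors):

```r
library(arrow)
library(sf)

# Write the WKB-encoded Arrow Table to Parquet.
write_parquet(tbl, "nc.parquet")

# Read it back; the geometry column comes back as WKB blobs.
df2 <- read_parquet("nc.parquet")

# sf can parse a list of raw WKB vectors once it carries the "WKB" class.
geom <- st_as_sfc(structure(as.list(df2$geom), class = "WKB"))
```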
The `read_geoparquet()` and `write_geoparquet()` functions used in our ETL ingest scripts are now deprecated, as is the CRAN `geoarrow` package from which they are sourced. We should switch to the new geoarrow backend located here: https://github.com/geoarrow/geoarrow-r. This will involve updating our various scripts and renv lockfile to point to the new geoarrow/nanoarrow packages, and replacing the dedicated geoparquet functions with their equivalent generics.
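A hedged sketch of what that replacement could look like in the ingest scripts, assuming (per the geoarrow-r README, worth verifying against the current API) that loading the new `geoarrow` package lets arrow's generic `write_parquet()`/`read_parquet()` handle sf geometry directly:

```r
library(sf)
library(arrow)
library(geoarrow)  # new package from https://github.com/geoarrow/geoarrow-r

nc <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))

# Old (deprecated): write_geoparquet(nc, "nc.parquet")
# New: with geoarrow loaded, the generic arrow writer handles sf objects.
write_parquet(nc, "nc.parquet")

# Old (deprecated): read_geoparquet("nc.parquet")
# New: read with the generic reader, then convert back to sf.
nc2 <- st_as_sf(read_parquet("nc.parquet"))
```

If this holds up, the migration is mostly mechanical: swap the function calls and update the renv lockfile to the new package sources.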