Open deanm0000 opened 5 months ago
Thanks for trying it out!
I agree having better geometry constructors will be necessary for usability. GeoArrow defines extension metadata that needs to be on an array to declare it a geometry. Your issue is that when you call pa.Table.from_arrays, the field for each array is inferred from the data type of the arrays. But the inferred field won't have any metadata applied to it.
One way to fix this is to use the schema parameter of from_arrays to ensure there's geoarrow metadata on the geometry column.
The other way is to register the pyarrow extension types provided in geoarrow-pyarrow. In that case, I believe the extension metadata will be automatically inferred.
For now, I've put more effort into the IO readers and writers and into the GeoPandas and Shapely interoperability. So a simple way to get a GeoTable is to first create a geopandas.GeoDataFrame and then use geoarrow.rust.core.from_geopandas.
first create a geopandas.GeoDataFrame
I'm trying to quit doing that ;)
The other way is to register the pyarrow extension
That's what I really needed.
import geoarrow.pyarrow as ga
ga.register_extension_types()
Now, with my df coming from polars, I can just do
df_geo = GeoTable.from_arrow(
    df.to_arrow().add_column(
        0, "geometry", [PointArray.from_xy(df["x"].to_arrow(), df["y"].to_arrow())]
    )
)
I see that it says my geometry is a Struct but I thought it'd be a FixedSizeList. Is that always the case or is that related to how I constructed it?
I'm trying to quit doing that ;)
Yes of course, but baby steps!
I see that it says my geometry is a Struct but I thought it'd be a FixedSizeList. Is that always the case or is that related to how I constructed it?
GeoArrow allows either FixedSizeList or Struct for coordinate buffers. PointArray.from_xy always creates a StructArray because that's how the memory is passed in.
https://github.com/geoarrow/geoarrow-rs/pull/578 will allow you more control over interleaved vs separated layout when constructing arrays from raw buffers. We should also add a helper to go back and forth between them when you already have your arrays.
I started from something like
then tried
but got
I also tried a few things around ChunkedPointArray.from_arrow_arrays([my_point_array]) but none of it worked.