JosiahParry / serde_esri

Esri JSON struct definitions and serde integration.
https://josiahparry.github.io/serde_esri/

Use separated Coords #11

Open JosiahParry opened 9 months ago

JosiahParry commented 9 months ago

Per recommendation, separated x and y coordinates should be used when creating a geoarrow array, as this will improve conversion to R types.

Comment: https://github.com/geoarrow/geoarrow-c/issues/78

Example affected lines: https://github.com/JosiahParry/serde_esri/blob/1f7f70e086463079bc2d99b925580e052b7382be/src/arrow_compat.rs#L209

@kylebarron

paleolimbot commented 9 months ago

FWIW, interleaved coords will still work and the optimizations that I'm talking about aren't implemented quite yet. The optimizations I'm talking about are:

If ESRI JSON is similar to GeoJSON I imagine constructing the interleaved version is easier to do from your end. All this to say that I think you can safely defer this one for quite a while if you'd like 🙂

JosiahParry commented 9 months ago

Thanks for the heads up! Esri JSON is quite close to GeoJSON: coords are structured like [[x, y, z, m], [x, y, z, m]]. Fortunately, the conversion to geoarrow falls (almost) entirely on geoarrow-rs.
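To make the two layouts concrete, here is a minimal sketch of turning positions like the ones above into an interleaved buffer versus separated x and y buffers. `Position`, `to_interleaved`, and `to_separated` are hypothetical names for illustration only; they are not part of serde_esri or geoarrow-rs.

```rust
/// One Esri-style position: x, y, and optional z and m values.
/// Hypothetical type for illustration; not serde_esri's actual struct.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Position {
    x: f64,
    y: f64,
    z: Option<f64>,
    m: Option<f64>,
}

/// Interleaved layout: a single buffer `[x0, y0, x1, y1, ...]`.
fn to_interleaved(positions: &[Position]) -> Vec<f64> {
    let mut buf = Vec::with_capacity(positions.len() * 2);
    for p in positions {
        buf.push(p.x);
        buf.push(p.y);
    }
    buf
}

/// Separated layout: one buffer per dimension,
/// `xs = [x0, x1, ...]` and `ys = [y0, y1, ...]`.
fn to_separated(positions: &[Position]) -> (Vec<f64>, Vec<f64>) {
    positions.iter().map(|p| (p.x, p.y)).unzip()
}
```

Both functions walk the positions once, so on the Rust side the cost is about the same either way; the difference only matters to whoever consumes the buffers.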

The way it works is that I have my struct EsriPolygon, for example, for which I have to implement the trait PolygonTrait, which defines how to get x & y coords out of the struct. Once that trait is implemented, I get the ability to create a PolygonArray using geoarrow-rs for "free". At present, however, that assumes an interleaved coordinate buffer. It seems like, moving forward, I'll have the option to choose the coordinate representation type: https://github.com/geoarrow/geoarrow-rs/pull/279
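A rough sketch of that pattern, under loud assumptions: `PolygonLike` is a simplified stand-in for a trait such as PolygonTrait, and `EsriPolygonSketch`, `CoordLayout`, `CoordBuffer`, and `build_polygon_buffers` are hypothetical names. None of this is the actual geoarrow-rs API; it only illustrates how implementing one accessor trait lets a generic builder emit either coordinate layout.

```rust
/// Hypothetical stand-in for a geometry-access trait such as `PolygonTrait`.
/// The real trait definitions differ; this only shows the pattern.
trait PolygonLike {
    /// Number of rings (exterior first, then interiors).
    fn num_rings(&self) -> usize;
    /// The (x, y) pairs of ring `i`.
    fn ring_xy(&self, i: usize) -> Vec<(f64, f64)>;
}

/// Hypothetical Esri-style polygon: rings of `[x, y, ...]` positions.
struct EsriPolygonSketch {
    rings: Vec<Vec<Vec<f64>>>,
}

impl PolygonLike for EsriPolygonSketch {
    fn num_rings(&self) -> usize {
        self.rings.len()
    }

    fn ring_xy(&self, i: usize) -> Vec<(f64, f64)> {
        // Keep only the first two values of each position (x and y).
        self.rings[i].iter().map(|p| (p[0], p[1])).collect()
    }
}

/// Which coordinate layout the builder should produce.
enum CoordLayout {
    Interleaved,
    Separated,
}

/// Coordinate buffers in either layout.
#[derive(Debug, PartialEq)]
enum CoordBuffer {
    Interleaved(Vec<f64>),
    Separated { x: Vec<f64>, y: Vec<f64> },
}

/// Generic builder: works for any `PolygonLike` and returns the ring
/// offsets plus the coordinates in the requested layout.
fn build_polygon_buffers<P: PolygonLike>(
    polygon: &P,
    layout: CoordLayout,
) -> (Vec<i32>, CoordBuffer) {
    let mut ring_offsets = vec![0i32];
    let mut xs = Vec::new();
    let mut ys = Vec::new();
    for i in 0..polygon.num_rings() {
        for (x, y) in polygon.ring_xy(i) {
            xs.push(x);
            ys.push(y);
        }
        // Each offset marks where the next ring starts in the coord buffers.
        ring_offsets.push(xs.len() as i32);
    }
    let coords = match layout {
        CoordLayout::Interleaved => CoordBuffer::Interleaved(
            xs.iter().zip(&ys).flat_map(|(x, y)| [*x, *y]).collect(),
        ),
        CoordLayout::Separated => CoordBuffer::Separated { x: xs, y: ys },
    };
    (ring_offsets, coords)
}
```

The struct only has to say how to read its coordinates; the layout choice lives entirely in the builder, which is why a coordinate-type option upstream would not require changes to the trait implementation.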


Aside: I suspect any use of geoarrow or more low-level geometry representations will lead to a massive speed up regardless of a copy or not :)

paleolimbot commented 9 months ago

> Aside: I suspect any use of geoarrow or more low-level geometry representations will lead to a massive speed up regardless of a copy or not :)

It's true. I've even found so far that Arrow WKB is a huge speed boost (compared to anything involving lists of R objects).