Closed jonmmease closed 4 years ago
Apache Arrow actually has a "Fixed Size List" type: https://arrow.apache.org/docs/format/Columnar.html#fixed-size-list-layout, which I think could be useful for this. But this type is not yet exposed in the latest pyarrow release.
And indeed, I think a fixed width binary type is the way to do this yourself with the current pyarrow.
Ah, ok yeah. fixed size list looks like the right fit for these use cases if/when it becomes available through Python. Thanks!
And at some point in the future (normally once you can require pyarrow 1.0), FixedSizeListArray will be available: https://issues.apache.org/jira/browse/ARROW-7261
So far, spatialpandas supports "ragged" geometry types where the representation of the geometry objects in each row may differ in length (e.g. polygons with variable number of vertices). These types are backed by a pyarrow ListArray.
It would also be nice to provide a more efficient representation of fixed size geometry objects, in particular a single point per row. Another use case would be representing axis-aligned boxes using two points.
One way to represent these would be to use pyarrow extension types backed by a fixed width binary storage type.
@jorisvandenbossche does this sound like a reasonable way to handle fixed length geometry types with pyarrow? Or would there be anything more straightforward?