It's making it harder to move fast, and gives a ton of development overhead. If there's one lesson I should learn from DuckDB Spatial, it's that I'm not moving fast enough.
Any algorithms that return float values, such as area, should only take in a &dyn NativeArray + NativeArrayAccessor<2>.
So we want something like an NativeArrayAccessor<const D: usize>, so that we can implement algorithms on
This also means we can greatly simplify our IndexedArray implementation too. We should switch to IndexedNativeArray (or maybe IndexedNativeArray<const D: usize>) which holds only an `Arc<>
Also, anything like area that is implemented on geo::Geometry (or geos::Geometry) should be implemented solely on &dyn NativeArray + NativeArrayAccessor<2> (or maybe allow 3d as well there) and not on every array type.
The other issue here is that the Geometry scalar is generic over both dimension and the offset size of the underlying NativeArray. This is especially painful as the dimension matters for the Geometry but the offset size does not. It's only an artifact of the underlying storage.
I'm getting closer to removing support internally for i64 offset arrays. We can support ingesting that data but only represent i32 data internally. To max out i32 offsets, we'd need to have 2^31 + 1 (2,147,483,649) coordinates, which would be 34,359,738,384 bytes, or 32GB.
Especially the binary operations are a total disaster right now. This file is 750 lines of code just to implement intersections, when we aren't even doing any of the core implementation ourselves! This should be 10 lines of code, to take in a &dyn NativeArray + NativeArrayAccessor<2>, iterate over its geoarrow::scalar::Geometry<2> objects, and call intersects on each one.
I'm currently dying in generics.
It's making it harder to move fast, and gives a ton of development overhead. If there's one lesson I should learn from DuckDB Spatial, it's that I'm not moving fast enough.
Any algorithms that return float values, such as
area
, should only take in a&dyn NativeArray + NativeArrayAccessor<2>
.So we want something like an
NativeArrayAccessor<const D: usize>
, so that we can implement algorithms onThis also means we can greatly simplify our
IndexedArray
implementation too. We should switch toIndexedNativeArray
(or maybeIndexedNativeArray<const D: usize>
) which holds only an `Arc<>Also, anything like
area
that is implemented ongeo::Geometry
(orgeos::Geometry
) should be implemented solely on&dyn NativeArray + NativeArrayAccessor<2>
(or maybe allow 3d as well there) and not on every array type.The other issue here is that the
Geometry
scalar is generic over both dimension and the offset size of the underlying NativeArray. This is especially painful as the dimension matters for the Geometry but the offset size does not. It's only an artifact of the underlying storage.I'm getting closer to removing support internally for i64 offset arrays. We can support ingesting that data but only represent i32 data internally. To max out
i32
offsets, we'd need to have2^31 + 1
(2,147,483,649) coordinates, which would be 34,359,738,384 bytes, or 32GB.Especially the binary operations are a total disaster right now. This file is 750 lines of code just to implement intersections, when we aren't even doing any of the core implementation ourselves! This should be 10 lines of code, to take in a
&dyn NativeArray + NativeArrayAccessor<2>
, iterate over itsgeoarrow::scalar::Geometry<2>
objects, and callintersects
on each one.