[UPDATE] 04/10/2023
Recent development added `multipoint_range`, `multilinestring_range`, and `multipolygon_range`, which are flexible views over geometry arrays. cuspatial's API should be refactored to use these data structures. For refactor demos, see https://github.com/rapidsai/cuspatial/pull/979 (the `polygons` argument of `quadtree_point_in_polygon`) and https://github.com/rapidsai/cuspatial/blob/branch-23.06/cpp/include/cuspatial/experimental/linestring_distance.cuh
Original Post
As demonstrated in https://github.com/rapidsai/cuspatial/pull/677, the geometry input of the header-only API can quickly get out of control for complex geometries. We should simplify the APIs to improve the developer experience.
Device view over the physical memory of nested types
The data structure should be a view over the physical memory layout, so it should be cheap to construct on the host, pass to the device, and invoke methods on from device code.
Unlike `cudf::column_view` and `cudf::column_device_view`, the data structure is completely templated (not type-erased) and always assumes the views are device views.
Assumes GeoArrow memory layout (#649)
The structure holds views of the offset arrays and point arrays. Offset arrays are assumed to always conform to Arrow's offset array layout, which is specified at https://arrow.apache.org/docs/format/Columnar.html#variable-size-binary-layout.
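To make the layout concrete, here is a minimal host-side sketch of a non-owning view over a linestring array. The names `point2d` and `linestring_array_view` are hypothetical and are not cuspatial types. The sketch assumes the Arrow convention that an array of n elements has n + 1 non-decreasing offsets, with element i covering the half-open range [offsets[i], offsets[i + 1]). In CUDA code the member functions would additionally need to be callable from device code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical point type for illustration; not cuspatial's own vector type.
struct point2d {
  double x;
  double y;
};

// Hypothetical non-owning view over an array of linestrings.
// It stores only iterators and a count, so copying it is cheap.
template <typename OffsetIt, typename PointIt>
struct linestring_array_view {
  OffsetIt offsets_begin;   // num_linestrings + 1 offsets, Arrow style
  std::size_t num_offsets;  // length of the offset array
  PointIt points_begin;     // flat, contiguous point array

  // Number of linestrings: one fewer than the number of offsets.
  std::size_t size() const { return num_offsets == 0 ? 0 : num_offsets - 1; }

  // Index of the first point of linestring i.
  std::size_t first_point(std::size_t i) const
  {
    return static_cast<std::size_t>(offsets_begin[i]);
  }

  // Number of points in linestring i: offsets[i + 1] - offsets[i].
  std::size_t num_points(std::size_t i) const
  {
    return static_cast<std::size_t>(offsets_begin[i + 1] - offsets_begin[i]);
  }
};
```

With offsets `{0, 3, 5, 9}` the view describes three linestrings of 3, 2, and 4 points. Because the view is templated on its iterator types, the same code works for raw pointers or for fancier iterators, at the cost of not being type-erased.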
Accessors
Due to the complex nature of geometries, developers may want to support different patterns of access from the data structure.
Element-wise accessors (top-down access)
The simplest is the element-wise accessor. A kernel is launched at the per-geometry level, and accessors like `.element(idx)` should return a geometry object from the array.
Component-wise accessors (bottom-up access)
Other parallel patterns may require a thread to work on one component of the array, such as parallelizing over the points of a `multilinestring_array`. In this case bottom-up traversal utilities should be supported; they can be implemented with a binary search in the offset array.
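The two access patterns can be sketched on the host as follows. This is an illustration under assumptions, not cuspatial's implementation: `linestring_ref`, `element`, `geometry_idx_from_point_idx`, and `locate_point` are hypothetical names, and the offsets are assumed to follow the Arrow layout (n + 1 non-decreasing entries). The bottom-up lookup uses `std::upper_bound`; device code would use an equivalent device-side binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical point type for illustration.
struct point2d {
  double x;
  double y;
};

// Hypothetical geometry object returned by the element-wise accessor:
// a half-open range of points belonging to one linestring.
struct linestring_ref {
  const point2d* first;
  const point2d* last;
  std::size_t size() const { return static_cast<std::size_t>(last - first); }
};

// Top-down access: return the idx-th linestring of the array.
inline linestring_ref element(const int* offsets, const point2d* points, std::size_t idx)
{
  return linestring_ref{points + offsets[idx], points + offsets[idx + 1]};
}

// Bottom-up access: return the index of the element that owns `child_idx`.
// Precondition: offsets[0] <= child_idx < offsets[num_offsets - 1].
inline std::size_t geometry_idx_from_point_idx(const int* offsets,
                                               std::size_t num_offsets,
                                               int child_idx)
{
  // upper_bound finds the first offset strictly greater than child_idx;
  // the owning element starts at the offset just before it.
  const int* it = std::upper_bound(offsets, offsets + num_offsets, child_idx);
  return static_cast<std::size_t>(it - offsets) - 1;
}

// Location of a point inside a multilinestring array.
struct point_location {
  std::size_t geometry_idx;  // which multilinestring
  std::size_t part_idx;      // which linestring (global index)
};

// Two-level bottom-up lookup: point -> linestring -> multilinestring.
inline point_location locate_point(const int* geometry_offsets,
                                   std::size_t num_geometry_offsets,
                                   const int* part_offsets,
                                   std::size_t num_part_offsets,
                                   int point_idx)
{
  std::size_t part = geometry_idx_from_point_idx(part_offsets, num_part_offsets, point_idx);
  std::size_t geom = geometry_idx_from_point_idx(
    geometry_offsets, num_geometry_offsets, static_cast<int>(part));
  return point_location{geom, part};
}
```

Using `upper_bound` rather than `lower_bound` means that empty elements (repeated offsets) are skipped, so a point is always attributed to the element that actually contains it.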