rapidsai / cuspatial

CUDA-accelerated GIS and spatiotemporal algorithms
https://docs.rapids.ai/api/cuspatial/stable/
Apache License 2.0

[FEA] Assume GeoArrow Format in C++ #649

Open isVoid opened 2 years ago

isVoid commented 2 years ago

Currently, the offset arguments of most C++ APIs follow the ESRI shapefile polygon format (see the shapefile whitepaper, page 10, table 7): the offsets array has one entry per underlying component, so its length equals the number of components. GeoArrow offsets contain one more element at the end, which is the total length of the underlying array, so the offsets array has length (number of components + 1). Adopting the GeoArrow format in C++ will enable https://github.com/rapidsai/cuspatial/issues/637, and it is cleaner when iterating over components in kernels, because component `i` always spans `[offsets[i], offsets[i + 1])` with no special case for the last component.
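To make the difference between the two offset layouts concrete, here is a minimal host-side sketch. The helper names `to_geoarrow_offsets` and `component_size` are hypothetical and are not part of cuspatial; the sketch only assumes that shapefile-style offsets hold the start index of each component.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a cuspatial API): turn shapefile-style offsets
// (one start index per component) into GeoArrow-style offsets by appending
// the total number of underlying elements as the final entry.
std::vector<int> to_geoarrow_offsets(std::vector<int> const& starts, int num_elements)
{
  std::vector<int> offsets(starts);
  offsets.push_back(num_elements);
  return offsets;
}

// With GeoArrow-style offsets, the size of component i is a plain difference.
// No branch is needed for the last component.
int component_size(std::vector<int> const& offsets, std::size_t i)
{
  return offsets[i + 1] - offsets[i];
}
```

With shapefile-style offsets, the size of the last component can only be computed if the total element count is passed separately, which is the special case the extra trailing element removes.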

In addition, the GeoArrow format assumes that the point array is interleaved (x and y of each point stored next to each other). The cudf column API should be refactored to accept an interleaved array.
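As an illustration of what "interleaved" means here, the sketch below packs separate x and y arrays into a single `[x0, y0, x1, y1, ...]` buffer and reads a point back out. The `point_2d` struct and both functions are hypothetical names chosen for this example, not existing cuspatial or cudf APIs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only in this sketch.
struct point_2d {
  double x;
  double y;
};

// Pack separate coordinate arrays into one interleaved buffer.
// Assumes xs and ys have the same length.
std::vector<double> interleave(std::vector<double> const& xs, std::vector<double> const& ys)
{
  std::vector<double> xy;
  xy.reserve(2 * xs.size());
  for (std::size_t i = 0; i < xs.size(); ++i) {
    xy.push_back(xs[i]);
    xy.push_back(ys[i]);
  }
  return xy;
}

// Read point i from an interleaved buffer: x at 2*i, y at 2*i + 1.
point_2d point_at(std::vector<double> const& xy, std::size_t i)
{
  return point_2d{xy[2 * i], xy[2 * i + 1]};
}
```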

Finally, GeoArrow supports multi-geometry, which adds another layer of indirection for addressing the different parts. A single geometry can be represented in the multi-geometry framework by giving each geometry exactly one part. In the header-only API, we can make all APIs aware of multi-geometry only. Users can "disguise" their single geometries as multi-geometries by passing a counting iterator as the part offset array. The benefit is that this halves the API surface area.
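The counting-iterator trick can be sketched on the host as follows. This is only an illustration under stated assumptions: `counting_offsets` is a minimal stand-in for something like `thrust::counting_iterator`, and `num_points_in_geometry` is a hypothetical routine, not a cuspatial function. The outer offsets (the array the proposal replaces with a counting iterator) map each geometry to a range of parts, and the inner offsets map each part to a range of points.

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for a counting iterator: element i is simply i,
// so geometry g owns exactly one part, namely part g.
struct counting_offsets {
  int operator[](std::size_t i) const { return static_cast<int>(i); }
};

// Hypothetical multi-geometry routine: total number of points in geometry g.
// Works for any offset type that supports operator[].
template <typename GeometryOffsets, typename PartOffsets>
int num_points_in_geometry(GeometryOffsets const& geometry_offsets,
                           PartOffsets const& part_offsets,
                           std::size_t g)
{
  int const first_part = geometry_offsets[g];
  int const last_part  = geometry_offsets[g + 1];
  return part_offsets[last_part] - part_offsets[first_part];
}
```

The same function then serves both cases: passing a real offsets array treats the input as multi-geometries, while passing `counting_offsets{}` treats each part as its own single geometry, so no separate single-geometry overload is needed.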

#573 aims to provide an example of a fully GeoArrow-compliant API that demonstrates all three points above.

harrism commented 2 years ago

We should probably add a checklist of all functions that need to be updated for this... Also checklist items for documentation that should be updated (the developer guide, for example).

isVoid commented 2 years ago

Checklist or separate issues?

harrism commented 2 years ago

Seems like that would be a LOT of issues...

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

harrism commented 2 years ago

This should be a milestone.