Open alxmrs opened 9 months ago
This should be possible to demo once #8 is complete. If we figure this out, we should document it in the README.
I’ve been reading more about how this is done today. The best example I can find for joining rasters with point data (and vectors) uses a hierarchical spatial index like h3 or s2: every dataset is mapped onto the same set of cell IDs, and the join happens on the cell ID.
https://github.com/uber/h3-py-notebooks/blob/master/notebooks/unified_data_layers.ipynb
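To make the idea concrete, here is a minimal sketch of a cell-keyed join in plain Python. It is not taken from the linked notebook: a fixed-size lat/lon grid stands in for h3/s2 cells, and `cell_id`, `join_points_to_raster`, and all the sample values are made up for illustration.

```python
import math
from collections import defaultdict


def cell_id(lat, lon, res_deg=1.0):
    """Hypothetical stand-in for an h3/s2 cell ID: a fixed-size lat/lon grid cell."""
    return (math.floor(lat / res_deg), math.floor(lon / res_deg))


# Made-up raster pixels (pixel-center coordinates plus a value).
raster_pixels = [
    {"lat": -1.25, "lon": 35.25, "value": 0.4},
    {"lat": -1.75, "lon": 35.75, "value": 0.6},
    {"lat": -2.5, "lon": 34.5, "value": 0.1},
]

# Made-up point observations.
points = [
    {"id": "a", "lat": -1.4, "lon": 35.1},
    {"id": "b", "lat": -2.9, "lon": 34.2},
    {"id": "c", "lat": 10.0, "lon": 10.0},  # no raster coverage here
]


def join_points_to_raster(points, raster_pixels, res_deg=1.0):
    """Left join: each point gets the mean raster value of its cell, or None."""
    # Aggregate the raster per cell (sum and count, then mean).
    sums = defaultdict(lambda: [0.0, 0])
    for px in raster_pixels:
        acc = sums[cell_id(px["lat"], px["lon"], res_deg)]
        acc[0] += px["value"]
        acc[1] += 1
    cell_mean = {key: total / n for key, (total, n) in sums.items()}

    # Look up each point's cell; points without a matching cell keep None.
    return [
        {**p, "value": cell_mean.get(cell_id(p["lat"], p["lon"], res_deg))}
        for p in points
    ]


joined = join_points_to_raster(points, raster_pixels)
```

With a real h3 or s2 binding, only `cell_id` would change; the aggregate-then-join-on-key shape stays the same.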
I wonder if this is the technique that underpins Fused.io.
For non-geospatial data, could we use a kdtree to create a hierarchical index? 🤔
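One way this might work, as a rough sketch rather than a worked-out design: build a k-d tree with median splits over one table, then use the left/right path through the tree as a cell ID. A shorter prefix of the path is a coarser cell, loosely analogous to a parent cell in h3. The names `build` and `cell_path` and the sample points are hypothetical.

```python
def build(points, depth=0, max_depth=3):
    """Build a k-d tree of median splits as nested dicts; leaves are None."""
    if depth == max_depth or len(points) <= 1:
        return None
    axis = depth % len(points[0])  # cycle through the dimensions
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "axis": axis,
        "split": pts[mid][axis],
        "lo": build(pts[:mid], depth + 1, max_depth),
        "hi": build(pts[mid:], depth + 1, max_depth),
    }


def cell_path(tree, point):
    """Return the '0'/'1' path from the root to the leaf cell containing point."""
    path = ""
    node = tree
    while node is not None:
        if point[node["axis"]] < node["split"]:
            path += "0"
            node = node["lo"]
        else:
            path += "1"
            node = node["hi"]
    return path


# Made-up 2-D feature vectors from one table.
table_a = [(1, 1), (2, 5), (4, 2), (6, 6), (7, 1), (8, 4)]
tree = build(table_a)

# Rows from another table can be keyed with the same tree and joined on the path.
query_key = cell_path(tree, (3, 3))
```

Unlike h3 or s2, these cells are data-dependent, so both sides of a join have to be keyed with the same pre-computed tree. Sharing a cell also does not guarantee that two points are nearest neighbors; it only narrows the candidates.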
This podcast episode is incredibly validating of the use case that this library (and issue) solves.
https://github.com/DahnJ/H3-Pandas
This gives me more confidence that an index system (geospatial via s2 and h3, or pre-computed via kdtrees) is a good integration. To me, this is proof of demand for such features.
Here's an example workflow that I'd like to support once this feature exists. This is from Jake Wall of the Mara Elephant Project. Here, he would make use of raster and table data from Earth Engine.
I'm imagining this would look like a left join from a Dask DataFrame holding the elephant coordinates to an EE ImageCollection opened with Xee via Qarray. Some details are still fuzzy, like how we'd inject a nearest-neighbor (NN) lookup (maybe this could be done via a SQL aggregation?).
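For reference, this is roughly the semantics I have in mind for the NN lookup, sketched in plain Python on a tiny regular grid. It does not use Dask, Xee, or Qarray; `nearest_index`, `left_join_nearest`, the `raster_value` column, and all the data are made up for illustration.

```python
from bisect import bisect_left


def nearest_index(sorted_coords, x):
    """Index of the value in an ascending list that is closest to x."""
    i = bisect_left(sorted_coords, x)
    if i == 0:
        return 0
    if i == len(sorted_coords):
        return len(sorted_coords) - 1
    # Pick whichever neighbor is closer; ties go to the lower index.
    return i if sorted_coords[i] - x < x - sorted_coords[i - 1] else i - 1


# Made-up raster: grid[i][j] is the value at (lats[i], lons[j]).
lats = [-2.0, -1.5, -1.0]
lons = [35.0, 35.5, 36.0]
grid = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
]

# Made-up tracking table.
tracks = [
    {"elephant": "E1", "lat": -1.45, "lon": 35.1},
    {"elephant": "E2", "lat": -1.9, "lon": 35.9},
]


def left_join_nearest(rows, lats, lons, grid, column="raster_value"):
    """Left join: every row is kept and gains the value of its nearest pixel."""
    out = []
    for row in rows:
        i = nearest_index(lats, row["lat"])
        j = nearest_index(lons, row["lon"])
        out.append({**row, column: grid[i][j]})
    return out


joined = left_join_nearest(tracks, lats, lons, grid)
```

On a regular grid the lookup is just a per-axis search, so no spatial index is needed. The open question is how to express the same thing inside a SQL join.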
In general, I think there is broad demand for joining raster and tabular data with each other. Later down the line, I bet we could implement geo-aware joins that make use of geometry.