Open maximedion2 opened 5 months ago
Phase 0: This phase is about implementing the foundation for a query engine that shamelessly leverages two facts: 1) Zarr is a heavily chunked storage format, and 2) raster data typically includes arrays that represent some sort of coordinates, with most queries filtering on those coordinates. As I'm making this list, I already have the basics implemented; what's left is
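To illustrate why chunking plus coordinate filters is worth leveraging, here is a minimal sketch of chunk pruning. It is not the project's actual code: `ChunkRange`, `chunk_ranges` and `chunks_to_read` are hypothetical names, and the sketch assumes a single 1-D coordinate array whose per-chunk min/max can be computed up front.

```rust
/// Inclusive min/max of a coordinate array within one chunk.
/// Hypothetical type, used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ChunkRange {
    min: f64,
    max: f64,
}

/// Split a 1-D coordinate array into chunks of `chunk_size` values
/// and record the range covered by each chunk.
/// `chunk_size` must be greater than 0 (`slice::chunks` panics on 0).
fn chunk_ranges(coords: &[f64], chunk_size: usize) -> Vec<ChunkRange> {
    coords
        .chunks(chunk_size)
        .map(|c| ChunkRange {
            min: c.iter().copied().fold(f64::INFINITY, f64::min),
            max: c.iter().copied().fold(f64::NEG_INFINITY, f64::max),
        })
        .collect()
}

/// Return the indices of the chunks that can contain a coordinate
/// in the inclusive interval [lo, hi]. Every other chunk can be
/// skipped without reading its data.
fn chunks_to_read(ranges: &[ChunkRange], lo: f64, hi: f64) -> Vec<usize> {
    ranges
        .iter()
        .enumerate()
        .filter(|(_, r)| r.max >= lo && r.min <= hi)
        .map(|(i, _)| i)
        .collect()
}
```

With ten coordinates `0.0..=9.0` stored in chunks of 4, a filter like `5 <= x AND x <= 6` only overlaps the second chunk, so the other two never need to be read or decompressed.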
Phase 1: This phase will be about implementing a more generic version of the query engine, one that can be adapted to various raster formats. The broad steps will be
Phase 2: This phase will be about implementing efficient geospatial queries that will work off of WKT strings. Realistically, I'm not going to implement a completely new data type in DataFusion, so I will have to rely on passing strings to geospatial functions, or on transforming data (like 2 floats for a point) into a string that can then be passed to geospatial functions. The steps would be
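As a rough sketch of the "2 floats into a string" idea, the helpers below format coordinates as WKT text. The function names are hypothetical and nothing here is tied to DataFusion; it only shows the kind of string a geospatial function could receive, assuming x/y (for example longitude/latitude) ordering.

```rust
/// Format a pair of coordinates as a WKT point, e.g. `POINT (1.5 -2.25)`.
/// Hypothetical helper, assuming x/y ordering.
fn point_to_wkt(x: f64, y: f64) -> String {
    format!("POINT ({} {})", x, y)
}

/// Format an axis-aligned bounding box as a WKT polygon.
/// The ring is closed by repeating the first vertex, as WKT requires.
fn bbox_to_wkt(min_x: f64, min_y: f64, max_x: f64, max_y: f64) -> String {
    format!(
        "POLYGON (({} {}, {} {}, {} {}, {} {}, {} {}))",
        min_x, min_y, max_x, min_y, max_x, max_y, min_x, max_y, min_x, min_y
    )
}
```

Building a string per row is obviously not free, which is part of why the queries need to stay efficient in other ways, such as skipping chunks before any string is produced.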
@tshauck feel free to add anything here of course.
This will be a list of TODOs for the overall project of writing a query engine for Zarr files (and eventually other raster formats... maybe). I'm going to split the overall project into 3 phases, numbered 0, 1 and 2. Each TODO on the list will eventually be assigned an issue with more details and a PR for the implementation.