Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

worldcereal: sample points from rastercube into geoparquet #651

Open jdries opened 5 months ago

jdries commented 5 months ago

worldcereal samples per patch of 64x64 pixels, and its sampling method is spatial, so it needs the original patch. The result is a set of points for which the other data then needs to be sampled.

the sampling algorithm is parametrized, so we want to avoid having to run it as a separate job

writing a sparse rastercube to geoparquet could also be a solution

strategy0: work with backend as-is

This requires implementing support for loading netCDF patches: https://github.com/Open-EO/openeo-geotrellis-extensions/issues/259

strategy1: pre-aligned samples

define UTM grids with cells of e.g. 128 pixels, and ensure that samples fall within those grid cells. We could even consider two shifted grids, which would ensure that a 64 pixel patch always falls within a cell of one of those grids?
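
Whether two shifted grids are enough can be checked with a small brute-force sketch. It assumes pixel-aligned patches, square cells of 128 pixels and a second grid shifted by half a cell (64 pixels); the function names are hypothetical and not part of the driver. Under these assumptions two grids cover every patch position along a single axis, but in 2D a patch can fit one grid only in x and the other only in y, so all four shift combinations are needed.

```python
from itertools import product

CELL = 128        # grid cell size in pixels (example value from the text)
PATCH = 64        # patch size in pixels
HALF = CELL // 2  # assumed shift of the extra grids


def fits_1d(start, shift, size=PATCH, cell=CELL):
    """True if [start, start + size) lies inside a single cell of a grid
    whose origin is offset by `shift` pixels."""
    offset = (start - shift) % cell
    return offset + size <= cell


def fits_2d(x, y, shift):
    """True if the patch with upper-left pixel (x, y) fits one cell of the
    grid shifted by (shift_x, shift_y)."""
    shift_x, shift_y = shift
    return fits_1d(x, shift_x) and fits_1d(y, shift_y)


def uncovered(shifts):
    """Patch origins (within one cell period) that fit in none of the grids."""
    return [
        (x, y)
        for x, y in product(range(CELL), repeat=2)
        if not any(fits_2d(x, y, shift) for shift in shifts)
    ]


# Along one axis, grids at shift 0 and 64 cover every patch start.
one_axis_ok = all(fits_1d(s, 0) or fits_1d(s, HALF) for s in range(CELL))

two_grids = [(0, 0), (HALF, HALF)]
four_grids = [(0, 0), (HALF, 0), (0, HALF), (HALF, HALF)]

print(one_axis_ok)                    # True
print((10, 100) in uncovered(two_grids))  # True: fits neither grid
print(uncovered(four_grids))          # []
```

If samples can be constrained up front, as suggested above, a single grid is enough; the shifted grids only matter when patch positions are arbitrary.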

strategy2: non-gridded datacube

Allow datacubes that contain raster timeseries with a temporal key and extent rather than a grid index.

In this scenario, we would need to have a special load_collection for this type of cube, and support the (few) processes that worldcereal would like to apply to this type of cube.

The hardest part here is perhaps supporting merge_cubes?

This would allow for a lot of specific performance optimization as well.
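
A minimal sketch of what such a cube could look like, purely as an illustration: `PatchKey` and `merge_patches` are hypothetical names, not existing driver or Geotrellis types. Each patch is keyed by its timestamp and its own extent instead of a grid index, and merging is shown only for the easy case where both inputs use identical keys. A real merge would also have to deal with differing extents, resolutions and overlap resolution, which is likely where the difficulty lies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PatchKey:
    """Hypothetical key: a timestamp plus the patch's own extent, no grid index."""
    time: date
    extent: tuple  # (xmin, ymin, xmax, ymax) in the patch CRS


def merge_patches(left, right):
    """Join two {PatchKey: {band_name: pixels}} mappings on identical keys.

    Patches present in only one input are kept as they are; on a band name
    clash the value from `right` wins (an arbitrary choice for this sketch).
    """
    merged = {key: dict(bands) for key, bands in left.items()}
    for key, bands in right.items():
        merged.setdefault(key, {}).update(bands)
    return merged


# One 64x64 patch at 10 m resolution, reduced to 2x2 values for brevity.
key = PatchKey(date(2021, 6, 1), (500000.0, 5000000.0, 500640.0, 5000640.0))
optical = {key: {"B04": [[1, 2], [3, 4]]}}
radar = {key: {"VV": [[5, 6], [7, 8]]}}

cube = merge_patches(optical, radar)
print(sorted(cube[key]))  # ['B04', 'VV']
```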

strategy3: work in vector cubes

if we convert the raster cube into a vector cube very early on, by sampling the pixels of interest, we can just work with vector processes. The only problem: we haven't implemented most vector processes.
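
The early raster-to-vector conversion could look roughly like the sketch below. It is an assumption-laden illustration, not driver code: `sample_points` is a hypothetical helper, the patch is assumed to be north-up with square pixels, and the pixel indices are assumed to come from the sampling algorithm. Each selected pixel becomes one point record at the pixel centre, carrying all band values.

```python
def sample_points(patch, origin, resolution, pixels):
    """Turn selected pixels of one raster patch into point records.

    patch:      {band_name: 2D list of values}, row 0 is the northern edge
    origin:     (x, y) of the patch's upper-left corner
    resolution: pixel size in CRS units
    pixels:     iterable of (row, col) indices chosen by the sampling step
    """
    origin_x, origin_y = origin
    records = []
    for row, col in pixels:
        record = {
            # Coordinates of the pixel centre
            "x": origin_x + (col + 0.5) * resolution,
            "y": origin_y - (row + 0.5) * resolution,
        }
        for band, values in patch.items():
            record[band] = values[row][col]
        records.append(record)
    return records


patch = {"B04": [[10, 20], [30, 40]], "B08": [[50, 60], [70, 80]]}
points = sample_points(
    patch, origin=(500000.0, 5000000.0), resolution=10.0, pixels=[(0, 0), (1, 1)]
)
print(points[0])  # {'x': 500005.0, 'y': 4999995.0, 'B04': 10, 'B08': 50}
```

Records like these are flat rows, so they could afterwards be given point geometries and written to GeoParquet with a library such as geopandas; that step is left out here.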