automated feature computation

Open-EO / openeo-gfmap

Generic framework for EO mapping applications building on openEO

Apache License 2.0


automated feature computation #4

Open jdries opened 6 months ago

jdries commented 6 months ago

For every reference dataset, we want to create a mirrored set of extracted features. This would typically be a GeoParquet file.

OpenEO can compute this Parquet file, and we can store it online (object storage, Artifactory, ...). Could we then update the RDM with a link to the extractions for a given reference file?

For user-trained models, we propose adding a constraint that the reference dataset should cover a limited area, to keep the extraction simple. Could this then be extended with other, public extractions?

Use of DuckDB: DuckDB could be interesting for us because it is an in-process database (it can run fully in memory), which means we don't need to set up a server. If we grow to the point where we do need one, we can still set one up then.

Detailed design: https://confluence.vito.be/display/EP/WorldCereal

kvantricht commented 4 months ago

Depends on https://github.com/Open-EO/openeo-gfmap/issues/18

kvantricht commented 4 months ago

@VincentVerelst please define subtasks in this issue to split up the work.

kvantricht commented 3 months ago

Start from the (overarching) STAC catalogue with raw extractions + rasterized ground truth.
Load all datacubes into OpenEO (extraction NetCDF cubes) + merge with the DEM collection.
Compute features: start from the example UDF (applying cloud mask, compositing (monthly), apply_dimension to "compute" features from timestamps and channels, sampling ground truth pixels (based on this old code), writing the result to GeoParquet).
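To make the feature step concrete, here is a rough pure-Python sketch for a single pixel: drop cloudy observations, build a monthly median composite, then flatten (month, band) into one feature row. This is not the example UDF and not openEO code. The observations, band names and feature naming scheme are made up for illustration. In openEO the same steps would presumably map to a mask, a monthly temporal aggregation and apply_dimension.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical observations for one pixel: (date, band values, cloud flag).
observations = [
    (date(2021, 5, 3), {"B04": 0.10, "B08": 0.30}, False),
    (date(2021, 5, 18), {"B04": 0.50, "B08": 0.52}, True),  # cloudy, dropped
    (date(2021, 5, 28), {"B04": 0.12, "B08": 0.36}, False),
    (date(2021, 6, 7), {"B04": 0.08, "B08": 0.40}, False),
]


def monthly_composite(obs):
    """Drop cloudy observations, then take the per-band median per month."""
    grouped = defaultdict(lambda: defaultdict(list))
    for day, bands, cloudy in obs:
        if cloudy:
            continue
        for name, value in bands.items():
            grouped[(day.year, day.month)][name].append(value)
    return {
        month: {name: median(values) for name, values in bands.items()}
        for month, bands in sorted(grouped.items())
    }


def compute_features(composites):
    """Flatten the (month, band) composites into a single feature row."""
    return {
        f"{band}_{year}{month:02d}": value
        for (year, month), bands in composites.items()
        for band, value in bands.items()
    }


features = compute_features(monthly_composite(observations))
print(sorted(features))
```

Each sampled ground-truth pixel would yield one such row, and the collected rows are what ends up in the GeoParquet file.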

VincentVerelst commented 3 months ago

Feature computation example UDF: https://git.vito.be/projects/APPL/repos/cropclass/browse/src/cropclass/classification.py?at=refs%2Fheads%2Fmain#311

VincentVerelst commented 3 months ago

load_stac returns a cube without metadata in the Python client. Currently working on getting at least the band names: https://github.com/Open-EO/openeo-python-client/issues/527

kvantricht commented 2 months ago

We might want to proceed by first defining a patch using OpenEO collections. Once everything works, we can replace the collection loaders with load_stac.

JeroenVerstraelen commented 2 months ago

This can be split up into smaller issues.

VincentVerelst commented 2 months ago

Moving this issue a few sprints further. Split up into sub-issues for basic (non-automated) feature computation.