Open jdries opened 6 months ago
@VincentVerelst please define subtasks in this issue to split up the work.
Feature computation example UDF: https://git.vito.be/projects/APPL/repos/cropclass/browse/src/cropclass/classification.py?at=refs%2Fheads%2Fmain#311
load_stac results in no metadata in the Python client. Currently working to at least have the band names: https://github.com/Open-EO/openeo-python-client/issues/527
We might want to proceed by first defining a patch using OpenEO collections. Once everything works, we can replace the collection loaders with load_stac.
This can be split up in smaller issues.
Moving this issue a few sprints further. Split up into sub-issues for basic (non-automated) feature computation.
For every reference dataset, we want to create a mirrored set of extracted features. This would typically be a GeoParquet file.
OpenEO can compute this Parquet file, and we can store it online (object storage, Artifactory, ...). Could we then update the RDM with a link to the extractions for a given reference file?
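To make the "mirrored set" idea a bit more concrete, here is a minimal sketch of how an extraction link could be derived from a reference file name. Everything in it is an assumption for illustration: the base URL, the `_features.parquet` suffix, the `ExtractionLink` record and the sample file name are hypothetical, and the actual RDM update call is left out because its API is not described here.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical storage location; where the extractions will live is still open.
EXTRACTIONS_BASE_URL = "https://storage.example.org/worldcereal-extractions"


@dataclass(frozen=True)
class ExtractionLink:
    """Pairs a reference file with the URL of its mirrored feature extraction."""

    reference_file: str
    extraction_url: str


def mirror_extraction_link(
    reference_file: str, base_url: str = EXTRACTIONS_BASE_URL
) -> ExtractionLink:
    # Reuse the reference file's stem so both files are trivially matched.
    stem = PurePosixPath(reference_file).stem
    url = f"{base_url.rstrip('/')}/{stem}_features.parquet"
    return ExtractionLink(reference_file=reference_file, extraction_url=url)


# Made-up reference file name, only to show the naming convention.
link = mirror_extraction_link("2021_BEL_sample.gpkg")
print(link.extraction_url)
```

Keeping the reference file's stem in the extraction name means the link can be recomputed at any time, so the RDM entry would only be a convenience rather than the single source of truth.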
For user-trained models, we propose adding a constraint that the reference dataset should cover a limited area, to keep extraction simple. It might, however, be possible to extend this with other, public extractions?
Use of DuckDB: DuckDB could be interesting for us because it is an in-process database (it can also run fully in memory), which means we don't need to set up a server. If we grow to the point where we do need one, we can still set one up then.
Detailed design: https://confluence.vito.be/display/EP/WorldCereal