Open soxofaan opened 1 year ago
Don't have a good answer to this one. For now, we kind of implement the 'no' approach, but we do try to preserve things like the feature identifier, as this is relevant to keep track of which timeseries belongs to which geometry.
The feature identifier is not part of the (GeoJSON) properties and belongs to the "core metadata" (as it resides at the top-level). That was at least always my "mental" model, based on GeoJSON. I'd think it's probably a good idea to keep track of it anyway. Maybe we need to clarify this?
My aim in 2.0.0 was to clearly communicate whether properties are preserved or not. Maybe it's not clear enough in all processes, but at least aggregate_spatial mentioned in the geometries parameter:
Feature properties are preserved for vector data cubes and all GeoJSON Features.
load_geojson, vector_to_random_points, vector_to_regular_points and vector_buffer similarly say:
Feature properties are preserved.
The feature identifier is not part of the (GeoJSON) properties and belongs to the "core metadata" (as it resides at the top-level).
I assume you are talking here about an "id" member of a Feature object, e.g.
{
  "type": "Feature",
  "id": "abc123",
  "geometry": {...},
  "properties": {...}
}
While this seems to be part of the GeoJSON RFC (If a Feature has a commonly used identifier, that identifier SHOULD be included as a member of the Feature object with the name "id"), I think I've seen quite a few cases in the wild where the "id" is under "properties" instead of at the Feature top level. Maybe this can be fixed by adding an option to the GeoJSON loading processes to promote a given property to the more standard "id".
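To make the "promote a property to id" idea concrete, here is a minimal Python sketch that works on plain GeoJSON-like dictionaries. The function name `promote_property_to_id` and its `key` and `keep_property` parameters are hypothetical and not part of any openEO process; an existing top-level "id" is deliberately left untouched.

```python
import copy


def promote_property_to_id(feature, key, keep_property=True):
    """Return a copy of a GeoJSON Feature with properties[key] as top-level "id".

    Hypothetical helper: an existing top-level "id" always wins, and a missing
    property leaves the feature unchanged.
    """
    result = copy.deepcopy(feature)
    properties = result.get("properties") or {}
    if "id" in result or key not in properties:
        return result
    # Either copy the value or move it out of the properties.
    result["id"] = properties[key] if keep_property else properties.pop(key)
    return result


# Made-up feature that stores its identifier under "properties".
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [5.0, 51.2]},
    "properties": {"id": "abc123", "crop": "maize"},
}
promoted = promote_property_to_id(feature, key="id")
moved = promote_property_to_id(feature, key="id", keep_property=False)
```

Working on a deep copy keeps the input feature unmodified, which makes it easy to compare the result with the original.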
My aim in 2.0.0 was to clearly communicate whether properties are preserved or not. Maybe it's not clear enough in all processes, but at least aggregate_spatial mentioned in the geometries parameter:
Feature properties are preserved for vector data cubes and all GeoJSON Features.
Well, part of the problem I'm trying to raise here is that there is a conflict regarding vector cube design:
When a vector cube is used as geometries in aggregate_spatial, you cannot preserve its original cube values because you are generating (aggregating) new cube values, which cannot be combined, generically, with the original cube values. For example:
You cannot combine the original cube data ["geometry", "property"] with the aggregated cube data ["time", "bands", "geometry"] in a single cube, e.g. because the number of dimensions is different. The dimension types of "property" (type "other"?) and "bands" (type "bands") are probably also not compatible, strictly speaking, but that could be adapted relatively easily I guess.
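The dimension mismatch can be illustrated with plain nested lists in Python. This is only a sketch: the values, the sizes (2 geometries, 3 properties, 4 time steps, 1 band) and the `shape` helper are made up for illustration and do not reflect how any back-end stores vector cubes.

```python
def shape(nested):
    """Return the shape of a regular (non-ragged) nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(dims)


# Original vector cube, dimensions ["geometry", "property"]:
# 2 geometries x 3 properties (crop type, population, ML target).
original = [
    ["maize", 1200, 0.7],
    ["wheat", 800, 0.4],
]

# Aggregated cube, dimensions ["time", "bands", "geometry"]:
# 4 time steps x 1 band x 2 geometries.
aggregated = [
    [[0.1 * t + 0.01 * g for g in range(2)] for _band in range(1)]
    for t in range(4)
]

original_shape = shape(original)      # (2, 3)
aggregated_shape = shape(aggregated)  # (4, 1, 2)

# Only the "geometry" dimension (size 2) is shared. The number of
# dimensions differs, so the values cannot be stacked into one cube as-is.
same_rank = len(original_shape) == len(aggregated_shape)
```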
So what I'm trying to say is this current statement in aggregate_spatial
Feature properties are preserved for vector data cubes and all GeoJSON Features.
is incompatible with the current consensus for vector cube design (store properties as cube values).
Sorry, new to openEO, so please pardon me if I am off on a tangent.
From aggregate_spatial:
Feature properties are preserved for vector data cubes and all GeoJSON Features.
aggregate_spatial, however, somehow isn't preserving the properties or the "id" from my GeoJSON features. I instead get "feature_index", but it's difficult to tie it back to the original feature.
My aggregate_spatial logic looks like:
"aggregate29": {
  "process_id": "aggregate_spatial",
  "arguments": {
    "data": {"from_node": "load2"},
    "geometries": {
      "type": "FeatureCollection",
      "features": [
        {
          "type": "Feature",
          "id": "pp1",
          "geometry": {
            "type": "Point",
            "coordinates": [76.90113420870438, 23.06615990794603]
          },
          "properties": {"pp": "Dinagat Islands", "kk": 10}
        }
      ]
    },
    "reducer": {
      "process_graph": {
        "first1": {
          "process_id": "first",
          "arguments": {
            "data": [
              {"from_parameter": "data"},
              {"from_parameter": "context"}
            ],
            "ignore_nodata": false
          },
          "result": true
        }
      }
    }
  }
},
@pankajdpatil You are talking about a specific implementation. For support you need to contact the provider.
@pankajdpatil this thread is indeed about how to preserve e.g. feature ids, but on a more conceptual and back-end oriented level. I think your support request will be better served on an openEO forum like (depending on the openEO backend you are using): https://forums.openeo.cloud/ or https://forum.dataspace.copernicus.eu/
(Related to use case experiments discussed in https://github.com/Open-EO/openeo-processes/issues/448 https://github.com/Open-EO/openeo-processes/issues/449)
Set up:
vc1, loaded from a GeoJSON feature collection, where each feature is some polygon with some properties, e.g. crop type, population, an ML target value or class, ...
cube with e.g. NDVI data
vc2 = aggregate_spatial(data=cube, geometries=vc1, reducer="mean")
Question: are the original GeoJSON-style properties of vc1 still available in vc2?
Yes: vc2 can directly be used to train an ML model?
No: aggregate_spatial only considers vc1's geometry and ignores any existing cube data? The user then has to take some tedious steps to "join"/merge vc1 and vc2 again in order to use it for ML applications.
I kind of remember vector cube discussions where we wanted preservation of properties (the "Yes" approach), e.g. using aggregate_spatial to "enrich" a vector cube with additional "columns" of aggregation data. However, I think the current design of vector cubes enforces the "No" approach because there are just cube values and you cannot generically/automatically combine pre-existing cube data with new (aggregation) cube data.
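For illustration, the manual "join" step that the "No" approach leaves to the user could look roughly like the Python sketch below. It assumes the aggregation result can be keyed by the original feature id, which is exactly the kind of preservation discussed in this thread and not something every back-end guarantees. The helper name `join_properties`, the `ndvi_mean` column name and all values are made up.

```python
def join_properties(features, aggregated, column="ndvi_mean"):
    """Combine each feature's properties with its aggregated value.

    `features` is a list of GeoJSON-like Feature dicts (the content of vc1),
    `aggregated` maps a feature id to one aggregated value (derived from vc2).
    Returns one flat "row" per feature, e.g. as input for ML training.
    """
    rows = []
    for feature in features:
        feature_id = feature.get("id")
        row = dict(feature.get("properties") or {})
        row["id"] = feature_id
        # None marks features without an aggregated value.
        row[column] = aggregated.get(feature_id)
        rows.append(row)
    return rows


# Made-up stand-ins for vc1 (features with properties) and vc2 (mean NDVI).
vc1_features = [
    {"type": "Feature", "id": "field-1", "geometry": None, "properties": {"crop": "maize"}},
    {"type": "Feature", "id": "field-2", "geometry": None, "properties": {"crop": "wheat"}},
]
vc2_means = {"field-1": 0.62, "field-2": 0.48}

rows = join_properties(vc1_features, vc2_means)
```

If only a positional index comes back from the back-end instead of the feature id, the same join works by keying `aggregated` on the position in the feature list, under the additional assumption that the order of the features is preserved.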