Open-EO / openeo-processes

Interoperable processes for openEO's big Earth observation cloud processing.
https://processes.openeo.org
Apache License 2.0

resample_cube_temporal: behaviour when valid_within is provided #371

Closed: zcernigoj closed this issue 2 years ago

zcernigoj commented 2 years ago

Process ID: resample_cube_temporal

Describe the issue: Checking if I understand the description for valid_within correctly.

Setting this parameter to a numerical value enables that the process searches for valid values within the given period of days before and after the target timestamps. Valid values are determined based on the function is_valid. For example, the limit of 7 for the target timestamps 2020-01-15 12:00:00 looks for a nearest neighbor after 2020-01-08 12:00:00 and before 2020-01-22 12:00:00. If no valid value is found within the given period, the value will be set to no-data (null).

This suggests that, for a given date (timestamp / label) in the target DataCube, the nearest valid values from the data DataCube can come from different dates (timestamps / labels), rather than from a single "alternative" nearest date (timestamp / label). This also seems more correct to me than any other behaviour.

Is this correct?
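To make this reading concrete, here is a minimal Python sketch of a per-pixel search. It is an illustration under stated assumptions, not the openEO implementation: `resample_pixel` and its arguments are hypothetical names, `is_valid` is a simplified stand-in for the openEO process of the same name, and the window bounds are treated as exclusive, reading "after" and "before" in the description literally.

```python
from datetime import datetime, timedelta
import math


def is_valid(x):
    # Simplified stand-in for openEO's is_valid (assumption):
    # rejects no-data (None), NaN and infinite values.
    if x is None:
        return False
    if isinstance(x, float) and not math.isfinite(x):
        return False
    return True


def resample_pixel(series, target, valid_within):
    """Pick the nearest valid value for one pixel.

    series: list of (datetime, value) pairs for a single pixel.
    Returns (source_label, value), or (None, None) if nothing valid
    lies within `valid_within` days of `target`.
    """
    window = timedelta(days=valid_within)
    candidates = []
    for label, value in series:
        distance = abs(label - target)
        # Assumption: bounds are exclusive ("after" ... "before").
        if distance >= window or not is_valid(value):
            continue
        candidates.append((distance, label, value))
    if not candidates:
        return None, None
    # Nearest first; on equal distance the earlier label wins.
    _, label, value = min(candidates, key=lambda c: (c[0], c[1]))
    return label, value


target = datetime(2020, 1, 15, 12)
d1 = datetime(2020, 1, 13, 12)
d2 = datetime(2020, 1, 18, 12)
d3 = datetime(2020, 1, 30, 12)

pixel_a = [(d1, 0.3), (d2, 0.5), (d3, 0.9)]
pixel_b = [(d1, None), (d2, 0.5), (d3, 0.9)]
pixel_c = [(d1, None), (d2, float("nan")), (d3, 0.9)]

result_a = resample_pixel(pixel_a, target, 7)  # value taken from d1
result_b = resample_pixel(pixel_b, target, 7)  # value taken from d2
result_c = resample_pixel(pixel_c, target, 7)  # no valid value in window
```

In this sketch, `pixel_a` and `pixel_b` receive values from different source dates for the same target timestamp, while `pixel_c` ends up as no-data because its only valid value lies outside the 7-day window.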

Proposed solution: Not sure how this should be solved. Mostly just checking if the described behaviour is the correct planned one.

Additional context:

Related issue: https://github.com/Open-EO/openeo-processes/issues/194

Related PR: https://github.com/Open-EO/openeo-processes/pull/244

Related issue / comment in openeo-processes-python: https://github.com/Open-EO/openeo-processes-python/issues/96#issuecomment-1138491467

Related PR in openeo-pg-evalscript-converter: https://github.com/openEOPlatform/openeo-pg-evalscript-converter/pull/77

clausmichele commented 2 years ago

This suggests that, for a given date (timestamp / label) in the target DataCube, the nearest valid values from the data DataCube can come from different dates (timestamps / labels), rather than from a single "alternative" nearest date (timestamp / label). This also seems more correct to me than any other behaviour.

You are referring to a case like this:

input data at time steps: 2020-01-09 12:00:00, 2020-01-21 12:00:00
target time steps: 2020-01-15 12:00:00
valid within: 7 days

and the issue here would be which input time step to choose, right? This scenario is already covered in the process description:

The rare case of ties is resolved by choosing the earlier timestamps.
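Applied to the example with input time steps 2020-01-09 12:00:00 and 2020-01-21 12:00:00, the tie rule can be sketched in a few lines of Python. This is only an illustration; `nearest_label` is a hypothetical helper, not part of any openEO library. Both input time steps are exactly 6 days from the target, so the earlier one is chosen.

```python
from datetime import datetime


def nearest_label(data_labels, target):
    # Sort key: distance to the target first, then the label itself,
    # so that the earlier timestamp wins when distances are equal.
    return min(data_labels, key=lambda label: (abs(label - target), label))


data_labels = [datetime(2020, 1, 9, 12), datetime(2020, 1, 21, 12)]
target = datetime(2020, 1, 15, 12)

chosen = nearest_label(data_labels, target)
print(chosen)  # 2020-01-09 12:00:00
```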

zcernigoj commented 2 years ago

Yes, that was my concern. Thanks.