Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

`date_between` in temporal `filter_labels` does not work for sentinelhub collections #749

Closed soxofaan closed 4 months ago

soxofaan commented 5 months ago

Using date_between in filter_labels along the temporal dimension does not work: the time labels are not filtered.

e.g.

import openeo
import xarray

con = openeo.connect("openeo.vito.be")
con.authenticate_oidc()
raw = con.load_collection(
    "SENTINEL2_L2A",
    temporal_extent=["2023-06-01", "2023-09-30"],
    spatial_extent={"west": 4, "south": 51, "east": 4.01, "north": 51.01},
    bands=["B03"]
)
filtered = raw.filter_labels(
    dimension="t",
    condition=lambda data: openeo.processes.date_between(data, min="2023-06-15", max="2023-06-30")
)
filtered.download("filtered.nc")
ds = xarray.load_dataset("filtered.nc")
ds["B03"]["t"]

gives

      ['2023-06-04T00:00:00.000000000', '2023-06-09T00:00:00.000000000',
       '2023-06-14T00:00:00.000000000', '2023-06-19T00:00:00.000000000',
       '2023-06-24T00:00:00.000000000', '2023-06-29T00:00:00.000000000',
       '2023-07-04T00:00:00.000000000', '2023-07-09T00:00:00.000000000',
       '2023-07-14T00:00:00.000000000', '2023-07-19T00:00:00.000000000',
       '2023-07-24T00:00:00.000000000', '2023-07-29T00:00:00.000000000', ...

while only this is expected:

[
       '2023-06-19T00:00:00.000000000',
       '2023-06-24T00:00:00.000000000', '2023-06-29T00:00:00.000000000',
]
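
For reference, the expected labels can be reproduced locally without a backend. This is only a minimal sketch: `date_between_local` is a hypothetical stand-in for the openEO `date_between` process, and it assumes both bounds are inclusive unless `exclude_max` is set (for the labels above this choice makes no difference). It works on the date part of the labels only.

```python
from datetime import date


# Hypothetical local stand-in for the openEO `date_between` process.
# Assumption: both bounds are inclusive unless exclude_max is set.
def date_between_local(x, lower, upper, exclude_max=False):
    d = date.fromisoformat(x)
    lo, hi = date.fromisoformat(lower), date.fromisoformat(upper)
    return lo <= d < hi if exclude_max else lo <= d <= hi


# Date part of the first labels returned by the backend (taken from the output above).
labels = [
    "2023-06-04", "2023-06-09", "2023-06-14", "2023-06-19",
    "2023-06-24", "2023-06-29", "2023-07-04", "2023-07-09",
]

# Keep only the labels that satisfy the condition, as filter_labels should.
expected = [t for t in labels if date_between_local(t, "2023-06-15", "2023-06-30")]
print(expected)
```

Only the three labels from 2023-06-19 to 2023-06-29 pass this check, which matches the expected list above.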

soxofaan commented 5 months ago

https://github.com/Open-EO/openeo-geopyspark-driver/issues/559 was apparently about adding temporal filter_labels with date_between. I'm not sure whether this is a regression or a use case that was never covered.

VictorVerhaert commented 5 months ago

filter_labels is used with date_between in this example notebook: https://github.com/Open-EO/openeo-community-examples/blob/main/python/ParcelDelineation/Parcel%20delineation.ipynb where I think it worked correctly.

soxofaan commented 5 months ago

Original use case of @hoetmaaiers was even a bit more complicated, conceptually something like:

filtered = raw.filter_labels(
    dimension="t",
    condition=lambda data: openeo.processes.any([
        openeo.processes.date_between(data, min="2023-06-15", max="2023-06-30"),
        openeo.processes.date_between(data, min="2023-08-15", max="2023-08-30"),
    ])
)

but even a single date_between does not work

soxofaan commented 5 months ago

> filter_labels is used with date_between in this example notebook: https://github.com/Open-EO/openeo-community-examples/blob/main/python/ParcelDelineation/Parcel%20delineation.ipynb where I think it worked correctly.

interesting, so it's probably a regression

EmileSonneveld commented 4 months ago

On CDSE prod it works as expected. (Will test on https://openeo.vito.be soon)

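import datetime

import openeo
from openeo.processes import any, process  # note: this `any` shadows the Python built-in
from openeo.rest._datacube import build_child_callback  # assumed import path (internal client module)

url = "https://openeo.dataspace.copernicus.eu/"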
connection = openeo.connect(url).authenticate_oidc()

datacube = connection.load_collection(
    "SENTINEL2_L2A",
    temporal_extent=["2021-01-01", "2021-03-01"],
    spatial_extent={
        "east": 5.08,
        "north": 51.22,
        "south": 51.215,
        "west": 5.07
    },
    bands=["B04"],
)

def build_condition(x):
    conditions = []
    dates = ["2021-01-02", "2021-01-05", "2021-02-01", "2021-02-04"]
    for date in dates:
        min_date = (datetime.datetime.fromisoformat(date)).isoformat() + "Z"
        max_date = (datetime.datetime.fromisoformat(date) + datetime.timedelta(days=1)).isoformat() + "Z"
        conditions.append(process("date_between", x=x, min=min_date, max=max_date))
    return any(conditions)

build_condition(5)
condition = build_child_callback(build_condition, parent_parameters=["value"])

datacube = datacube.process(process_id="filter_labels",
                            arguments={"data": datacube, "condition": condition},
                            dimension="t")

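# datacube.download("filter_labels_example.nc")
# NOTE: custom_execute_batch is not defined in this snippet (presumably a local helper that runs a batch job).
custom_execute_batch(datacube)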

out-2024-05-21_15_43_35.054635$ ls
job_id.txt        openEO_2021-01-02Z.tif  openEO_2021-02-01Z.tif  process_graph.json
job-results.json  openEO_2021-01-05Z.tif  openEO_2021-02-04Z.tif
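
The four result files line up with the four one-day windows that build_condition constructs. A small sketch, using only the standard library, of how those min/max strings come out; `one_day_window` is a hypothetical helper name that mirrors the date arithmetic in the snippet above and does not contact any backend.

```python
import datetime


# Hypothetical helper mirroring the date arithmetic inside build_condition:
# each date becomes a window from midnight of that day to midnight of the next day.
def one_day_window(day):
    start = datetime.datetime.fromisoformat(day)
    end = start + datetime.timedelta(days=1)
    return start.isoformat() + "Z", end.isoformat() + "Z"


dates = ["2021-01-02", "2021-01-05", "2021-02-01", "2021-02-04"]
windows = [one_day_window(d) for d in dates]
for min_date, max_date in windows:
    print(min_date, max_date)
```

Each window covers exactly one of the dates for which a GeoTIFF was written.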

EmileSonneveld commented 4 months ago

Also OK on https://openeo.vito.be, so probably not a regression. I'll look deeper into this later.

EmileSonneveld commented 4 months ago

For 2023 dates, the Terrascope backend uses SENTINEL2_L2A_SENTINELHUB. In that case, filter_labels + date_between does not seem to work.