Open-EO / openeo-processes-dask

Python implementations of many OpenEO processes, dask-friendly by default.
Apache License 2.0

Process graph of load_collection properties fails for more generic process graphs. #239

Open danielFlemstrom opened 6 months ago

danielFlemstrom commented 6 months ago

Not sure if this is intended to work, but if you want to filter by a property in a range, the client outputs a process graph that openeo-processes-dask cannot translate:

Exception <class 'openeo_processes_dask.process_implementations.exceptions.ProcessParameterMissing'> Error: Process Parameter value was missing for process gte

[image]

Process graph for the last cases:

{
  "process_graph": {
    "loadcollection1": {
      "process_id": "load_collection",
      "arguments": {
        "bands": ["chl_nn"],
        "id": "s3_olci_l2wfr",
        "properties": {
          "eo:cloud_cover": {
            "process_graph": {
              "gte1": {
                "process_id": "gte",
                "arguments": {"x": {"from_parameter": "value"}, "y": 5}
              },
              "lte1": {
                "process_id": "lte",
                "arguments": {"x": {"from_parameter": "value"}, "y": 10}
              },
              "and1": {
                "process_id": "and",
                "arguments": {"x": {"from_node": "gte1"}, "y": {"from_node": "lte1"}},
                "result": true
              }
            }
          }
        },
        "spatial_extent": {
          "west": 15.862841471258125,
          "east": 17.52375679097526,
          "south": 59.188299941115375,
          "north": 59.7224489782685
        },
        "temporal_extent": ["2020-07-01T00:00:00Z", "2023-07-31T00:00:00Z"]
      },
      "result": true
    }
  }
}

I would expect to get the process graph in properties in load_collection.
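To make the expectation concrete, here is a minimal sketch of what evaluating such a child process graph involves: the parameter `value` has to be bound to the property of each item before `gte`, `lte` and `and` can run. The `evaluate` helper and the `PROCESSES` table are hypothetical and written only for this illustration; this is not how openeo-processes-dask or the pg-parser is implemented.

```python
import operator

# Hypothetical mini-evaluator, NOT the openeo-processes-dask implementation.
# Only the three processes that appear in this issue are supported.
PROCESSES = {
    "gte": operator.ge,
    "lte": operator.le,
    "and": lambda x, y: bool(x) and bool(y),
}


def evaluate(process_graph, parameters):
    """Evaluate a flat openEO process graph with the given parameter values."""
    cache = {}

    def resolve(arg):
        # {"from_parameter": name} is replaced by the bound parameter value.
        if isinstance(arg, dict) and "from_parameter" in arg:
            name = arg["from_parameter"]
            if name not in parameters:
                raise KeyError(f"Process parameter {name!r} was missing")
            return parameters[name]
        # {"from_node": id} is replaced by the result of that node.
        if isinstance(arg, dict) and "from_node" in arg:
            return run(arg["from_node"])
        return arg

    def run(node_id):
        if node_id not in cache:
            node = process_graph[node_id]
            args = {k: resolve(v) for k, v in node["arguments"].items()}
            cache[node_id] = PROCESSES[node["process_id"]](args["x"], args["y"])
        return cache[node_id]

    # The node flagged with "result": true is the output of the graph.
    result_node = next(k for k, n in process_graph.items() if n.get("result"))
    return run(result_node)


# The child process graph of "eo:cloud_cover" from the report above.
cloud_cover_filter = {
    "gte1": {"process_id": "gte",
             "arguments": {"x": {"from_parameter": "value"}, "y": 5}},
    "lte1": {"process_id": "lte",
             "arguments": {"x": {"from_parameter": "value"}, "y": 10}},
    "and1": {"process_id": "and",
             "arguments": {"x": {"from_node": "gte1"}, "y": {"from_node": "lte1"}},
             "result": True},
}

print(evaluate(cloud_cover_filter, {"value": 7}))   # True
print(evaluate(cloud_cover_filter, {"value": 12}))  # False
```

If `value` is never bound, the sketch fails with a missing-parameter error at the first comparison node, which resembles the exception reported above.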

clausmichele commented 5 months ago

@danielFlemstrom just a note, it's better if you copy and paste the code instead of taking a screenshot, so that we can easily reproduce it!

danielFlemstrom commented 5 months ago


import openeo
from openeo.processes import process

# `conn` (the openEO connection) and `s2` (collection ids) are not shown in this snippet.

def prop_range(pval):
    lte1 = process("lte", x=pval, y=10)
    gte1 = process("gte", x=pval, y=5)
    and1 = process("and", x=gte1, y=lte1)
    return and1

def simple_prop(pval):
    lte1 = process("lte", x=pval, y=10)
    return lte1

res = conn.load_collection(
    collection_id=s2.s2_msi_l2a,
    spatial_extent={"west": 15.862841471258125,
                    "east": 17.52375679097526,
                    "south": 59.188299941115375,
                    "north": 59.7224489782685},
    temporal_extent=["2020-07-01T00:00:00Z", "2023-07-31T00:00:00Z"],
    bands=["b04"],
    properties={"eo:cloud_cover": lambda val: val < 10},  # Try 5 for 2 images, and 10 for 14 images
    # OK: properties={"eo:cloud_cover": simple_prop}
    # Below is a workaround:
    # properties={"eo:cloud_cover": lambda pval: process("lte", x=pval, y=10),
    #             "cloud_cover": lambda pval: process("gte", x=pval, y=10)}
    # NO: properties={"eo:cloud_cover": lambda val: openeo.processes.and_(val > 22, val < 10)}
    # NO: properties={"eo:cloud_cover": prop_range}
)
res.flat_graph()

clausmichele commented 4 months ago

@ValentinaHutter do you at EODC have collections where you can use properties for filtering?

ValentinaHutter commented 4 months ago

I have now checked our internal implementation of load_collection: we also just expect the input properties to be a dictionary and build the STAC client search from that dictionary. So far, we have also only used one property. We will let you know when we manage to update it in the pg-parser.

danielFlemstrom commented 4 months ago

A dictionary is fine for me as long as it gets passed on by the pg-parser. We parse the dictionary and create a filtering expression that is passed to the odc.load function (since we use Open Data Cube for loading collections), if that is of any help.
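As a rough illustration of that last step, the sketch below turns a `properties` dictionary (as it appears in the process graph) into simple per-property constraints. The function `properties_to_query` is hypothetical and not taken from any of the projects mentioned here. It assumes every comparison has the property as `x` and a constant as `y`, and that all comparisons are combined with `and`; anything else is rejected. How the result is handed to the loader is deliberately left out.

```python
def properties_to_query(properties):
    """Translate load_collection `properties` into {name: {operator: constant}}.

    Hypothetical helper for illustration only. Assumes comparisons of the form
    op(x=<value parameter>, y=<constant>) that are all joined by `and`.
    """
    comparisons = {"eq", "neq", "lt", "lte", "gt", "gte"}
    query = {}
    for name, child in properties.items():
        constraints = {}
        for node in child["process_graph"].values():
            pid = node["process_id"]
            args = node["arguments"]
            if pid == "and":
                continue  # conjunction is implied by collecting all constraints
            if pid not in comparisons:
                raise ValueError(f"Unsupported process {pid!r} for {name!r}")
            if args.get("x") != {"from_parameter": "value"}:
                raise ValueError(f"Expected the property as 'x' in {pid!r}")
            if isinstance(args.get("y"), dict):
                raise ValueError(f"Expected a constant as 'y' in {pid!r}")
            if pid in constraints:
                raise ValueError(f"Process {pid!r} used twice for {name!r}")
            constraints[pid] = args["y"]
        query[name] = constraints
    return query


# The `properties` argument from the process graph reported in this issue.
properties = {
    "eo:cloud_cover": {
        "process_graph": {
            "gte1": {"process_id": "gte",
                     "arguments": {"x": {"from_parameter": "value"}, "y": 5}},
            "lte1": {"process_id": "lte",
                     "arguments": {"x": {"from_parameter": "value"}, "y": 10}},
            "and1": {"process_id": "and",
                     "arguments": {"x": {"from_node": "gte1"},
                                   "y": {"from_node": "lte1"}},
                     "result": True},
        }
    }
}

print(properties_to_query(properties))
# {'eo:cloud_cover': {'gte': 5, 'lte': 10}}
```

The output uses the same operator names (`gte`, `lte`, ...) as the STAC API Query extension, so a dictionary of this shape could plausibly also serve a STAC-based search; that mapping is an assumption, not something confirmed in this thread.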