Open-EO / openeo-opensearch-client

Simple opensearch client for openeo.
Apache License 2.0

support tileId filter with wildcard on CDSE #25

Closed (jdries closed 1 year ago)

jdries commented 1 year ago

Specifically for Sentinel-2 layers on CDSE, we want to filter on the tile ID of a feature. An exact match currently works in load_collection: properties=dict(tileId=lambda x: x == "31UFS")

But now we need to filter on multiple tile IDs, which Cloudferro does not support. My proposal is to allow a wildcard match such as 31*.

We would need to implement that by filtering the catalog results on our side instead of sending the property filter to the catalog.
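A minimal sketch of what such filtering on our side could look like, assuming each catalog feature is available as a dict with a "tileId" entry (the feature layout and the function names here are hypothetical, not the client's actual code). It uses Python's standard fnmatch module, which also treats ? and [...] as special characters; that is harmless for Sentinel-2 tile IDs.

```python
from fnmatch import fnmatchcase


def matches_tile_id(tile_id: str, pattern: str) -> bool:
    """Return True if tile_id matches pattern; '*' matches any run of characters."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.fnmatch.
    return fnmatchcase(tile_id, pattern)


def filter_features(features, pattern):
    """Keep only features whose (hypothetical) 'tileId' entry matches the pattern."""
    return [f for f in features if matches_tile_id(f["tileId"], pattern)]


# Illustrative features; the "id" and "tileId" keys are assumptions.
features = [
    {"id": "a", "tileId": "31UFS"},
    {"id": "b", "tileId": "31UES"},
    {"id": "c", "tileId": "30SWE"},
]
kept = filter_features(features, "31*")
```

A pattern without a wildcard, such as 31UFS, degrades to an exact match, so existing filters would keep behaving the same way under this approach.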

bossie commented 1 year ago

Some environment variables need to be set in order to test this locally (required by our own code or by GDAL):

* The S3 endpoint data.cloudferro.com (see the AWS_S3_ENDPOINT environment variable of the driver pods on CDSE) is only accessible from within Cloudferro itself; use the endpoint https://eodata.cloudferro.com instead.

bossie commented 1 year ago

Available on CDSE-staging.

Example process graph:

{
  "process_graph": {
    "load1": {
      "process_id": "load_collection",
      "arguments": {
        "id": "SENTINEL2_L2A",
        "spatial_extent": {
          "west": 4.912844218500582,
          "east": 4.918160603369832,
          "south": 51.02816932187383,
          "north": 51.029815337603594
        },
        "temporal_extent": [
          "2023-09-24T00:00:00Z",
          "2023-09-25T00:00:00Z"
        ],
        "bands": [
          "B04",
          "B03",
          "B02"
        ],
        "properties": {
          "tileId": {
            "process_graph": {
              "eq1": {
                "process_id": "eq",
                "arguments": {
                  "x": {
                    "from_parameter": "value"
                  },
                  "y": "31*"
                },
                "result": true
              }
            }
          }
        }
      }
    },
    "save2": {
      "process_id": "save_result",
      "arguments": {
        "data": {
          "from_node": "load1"
        },
        "format": "GTIFF"
      },
      "result": true
    }
  },
  "parameters": []
}
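For testing with other patterns, the "properties" part of the process graph above can be generated rather than edited by hand. This is only a sketch in plain Python: the helper name tile_id_filter is made up for illustration, and the structure simply mirrors the JSON shown above.

```python
import json


def tile_id_filter(pattern: str) -> dict:
    """Build the 'properties' argument for load_collection with a tileId filter."""
    return {
        "tileId": {
            "process_graph": {
                "eq1": {
                    "process_id": "eq",
                    "arguments": {
                        # "value" is the parameter that carries the property value.
                        "x": {"from_parameter": "value"},
                        "y": pattern,
                    },
                    "result": True,
                }
            }
        }
    }


# Serialize to JSON to paste into a process graph.
print(json.dumps(tile_id_filter("31*"), indent=2))
```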

Debug logs show which features are retained or filtered out client side, e.g.:

retaining feature /eodata/Sentinel-2/MSI/L2A/2023/09/24/S2B_MSIL2A_20230924T103659_N0509_R008_T31UFS_20230924T132849.SAFE with tileId 31UFS

or

omitting feature /eodata/Sentinel-2/MSI/L2A/2023/09/24/S2B_MSIL2A_20230924T103659_N0509_R008_T31UFS_20230924T132849.SAFE with tileId 31UFS
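When going through these debug lines, it can help to parse them and cross-check the logged tileId against the tile field in the product name (the T31UFS part of the .SAFE name, per the Sentinel-2 naming convention). The sketch below assumes the message text has exactly the shape of the two examples above; any prefix such as a timestamp or log level is ignored, and the function name is hypothetical.

```python
import re

# Message shape taken from the example log lines; an assumption, not a documented format.
LOG_LINE = re.compile(r"(retaining|omitting) feature (\S+) with tileId (\S+)")
# Tile field in a Sentinel-2 product name, e.g. "_T31UFS_".
TILE_IN_NAME = re.compile(r"_T(\d{2}[A-Z]{3})_")


def parse_filter_log(line: str):
    """Return action, path and tile IDs for a filter debug line, or None if it does not match."""
    m = LOG_LINE.search(line)
    if m is None:
        return None
    action, path, tile_id = m.groups()
    name_match = TILE_IN_NAME.search(path)
    return {
        "action": action,
        "path": path,
        "tileId": tile_id,
        "tile_in_name": name_match.group(1) if name_match else None,
    }


line = (
    "retaining feature /eodata/Sentinel-2/MSI/L2A/2023/09/24/"
    "S2B_MSIL2A_20230924T103659_N0509_R008_T31UFS_20230924T132849.SAFE "
    "with tileId 31UFS"
)
parsed = parse_filter_log(line)
```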

DeRooBert commented 1 year ago

I tried the following jobs, but they fail when trying to start:

* j-8d12cd5f41cc44688cdf34dde8ff0725, filter: 30SU*
* j-026a0ae8d9e14ba69ee798055f07704d, filter: 30SWE

(These kinds of filters do run on openeo-3-1.openeo-vlcc-prod.vgt.vito.be.)

bossie commented 1 year ago

What's the error? I can't tell from the logs.

bossie commented 1 year ago

There were some issues with starting batch jobs, so please try again.

DeRooBert commented 1 year ago

* finished, j-f5c4b9fa566d4df4a060f52b2166f8aa, filter: 30SU*
* finished, j-d2d6ffb18ae748ac8115e6db5c11a821, filter: 30SWE

Seems to work now. No visible artefacts.