Open-EO / openeo-geopyspark-driver

OpenEO driver for GeoPySpark (Geotrellis)
Apache License 2.0

regression: load_stac from MS Planetary Computer STAC API fails #784

Closed: bossie closed this issue 1 month ago

bossie commented 1 month ago

Terrascope integration test test_load_stac_from_planetary_computer_stac_api started failing:

Traceback (most recent call last):
  File "/opt/venv/lib64/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/venv/lib64/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/users/auth.py", line 95, in decorated
    return f(*args, **kwargs)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/views.py", line 673, in result
    result = backend_implementation.processing.evaluate(process_graph=process_graph, env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 301, in evaluate
    return evaluate(process_graph=process_graph, env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 377, in evaluate
    result = convert_node(result_node, env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 402, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 416, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 402, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 416, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 402, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 416, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 402, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 416, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 402, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1611, in apply_process
    return process_function(args=ProcessArgs(args, process_id=process_id), env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 2232, in load_stac
    return env.backend_implementation.load_stac(url=url, load_params=load_params, env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/backend.py", line 760, in load_stac
    return load_stac.load_stac(url, load_params, env, layer_properties={}, batch_jobs=self.batch_jobs)
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/load_stac.py", line 299, in load_stac
    for itm in intersecting_items:
  File "/opt/venv/lib64/python3.8/site-packages/pystac_client/item_search.py", line 694, in items
    yield Item.from_dict(item, root=self.client, preserve_dict=False)
  File "/opt/venv/lib64/python3.8/site-packages/pystac/item.py", line 488, in from_dict
    if not cls.matches_object_type(d):
  File "/opt/venv/lib64/python3.8/site-packages/pystac/item.py", line 563, in matches_object_type
    raise pystac.STACTypeError(d, cls, f"'{field}' is missing.")
pystac.errors.STACTypeError: JSON (id = unknown) does not represent a Item instance. 'type' is missing.

bossie commented 1 month ago

Underlying STAC API request:

https://planetarycomputer.microsoft.com/api/stac/v1/search?limit=20&bbox=3.143622080824514%2C51.30768529127022%2C3.272047105221418%2C51.365902618479595&datetime=2010-04-06T00%3A00%3A00Z%2F2010-04-06T23%3A59%3A59.999000Z&collections=landsat-c2-l2&fields=%2Bproperties.proj%3Aepsg%2C%2Bproperties.proj%3Ashape%2C%2Bproperties.proj%3Abbox

returns

{
  "type": "FeatureCollection",
  "features": [
    {
      "properties": {
        "proj:shape": [
          7371,
          8191
        ],
        "proj:epsg": 32631
      },
      "links": [
        {
          "rel": "collection",
          "type": "application/json",
          "href": "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2"
        },
        {
          "rel": "parent",
          "type": "application/json",
          "href": "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2"
        },
        {
          "rel": "root",
          "type": "application/json",
          "href": "https://planetarycomputer.microsoft.com/api/stac/v1/"
        },
        {
          "rel": "self",
          "type": "application/geo+json",
          "href": "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2/items/LT05_L2SP_199024_20100406_02_T1"
        }
      ]
    }
  ],
  "links": [
    {
      "rel": "root",
      "type": "application/json",
      "href": "https://planetarycomputer.microsoft.com/api/stac/v1/"
    },
    {
      "rel": "self",
      "type": "application/json",
      "href": "https://planetarycomputer.microsoft.com/api/stac/v1/search?limit=20&bbox=3.143622080824514,51.30768529127022,3.272047105221418,51.365902618479595&datetime=2010-04-06T00:00:00Z/2010-04-06T23:59:59.999000Z&collections=landsat-c2-l2&fields=+properties.proj:epsg,+properties.proj:shape,+properties.proj:bbox"
    }
  ]
}

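which is indeed not a valid STAC Item: the only feature contains just "properties" and "links", and lacks required top-level fields such as "type", "id", "geometry" and "assets".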
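To make the problem concrete, here is a minimal sketch that checks a feature against the top-level fields the STAC Item spec requires. The helper name `missing_item_fields` is hypothetical and this is not pystac's own validation logic; the sample feature is abbreviated from the response above.

```python
# Top-level fields that the STAC Item specification requires.
REQUIRED_ITEM_FIELDS = (
    "type", "stac_version", "id", "geometry", "properties", "links", "assets",
)


def missing_item_fields(feature: dict) -> list:
    """Return the required top-level Item fields absent from `feature` (hypothetical helper)."""
    return [field for field in REQUIRED_ITEM_FIELDS if field not in feature]


# Abbreviated copy of the single feature in the response above.
feature = {
    "properties": {"proj:shape": [7371, 8191], "proj:epsg": 32631},
    "links": [
        {
            "rel": "self",
            "type": "application/geo+json",
            "href": "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2/items/LT05_L2SP_199024_20100406_02_T1",
        }
    ],
}

missing = missing_item_fields(feature)
print(missing)  # ['type', 'stac_version', 'id', 'geometry', 'assets']
```

"type" is the first missing field, which matches the one reported in the STACTypeError above.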

bossie commented 1 month ago

Regression was most likely introduced in https://github.com/Open-EO/openeo-geopyspark-driver/issues/781.

bossie commented 1 month ago

Fixed by recognizing that explicitly requesting proj: metadata (#781) from the MSPC STAC API makes it drop other properties in the response, so we no longer do that for this STAC API (which was already special-cased in the code).
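The idea behind the fix can be sketched roughly as follows. This is an illustration only, not the driver's actual code: the function name `fields_for_stac_api`, the constants and the host comparison are all assumptions; only the field names are taken from the failing request above.

```python
from typing import List, Optional
from urllib.parse import urlparse

# The "fields" values from the failing search request, URL-decoded.
PROJ_FIELDS = [
    "+properties.proj:epsg",
    "+properties.proj:shape",
    "+properties.proj:bbox",
]

# Assumed way of recognizing the MS Planetary Computer STAC API.
MSPC_HOST = "planetarycomputer.microsoft.com"


def fields_for_stac_api(url: str) -> Optional[List[str]]:
    """Return the "fields" to request, or None to omit the parameter (hypothetical helper)."""
    host = urlparse(url).hostname or ""
    if host == MSPC_HOST:
        # Requesting proj: fields here makes the API drop the rest of the item,
        # so do not send a "fields" parameter at all.
        return None
    return PROJ_FIELDS


print(fields_for_stac_api("https://planetarycomputer.microsoft.com/api/stac/v1"))
print(fields_for_stac_api("https://stac.example.org/api"))
```

The first call yields None (no "fields" parameter is sent), while any other STAC API still gets the explicit proj: fields; `stac.example.org` is a placeholder host.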