Open-EO / openeo-geotrellis-extensions

Java/Scala extensions for Geotrellis, for use with the OpenEO GeoPySpark backend.
Apache License 2.0

regression: load_stac of netCDF job results broke #282

Closed: bossie closed this issue 2 months ago

bossie commented 3 months ago

This used to work:

{
  "process": {
    "process_graph": {
      "loadstac1": {
        "process_id": "load_stac",
        "arguments": {
          "url": "https://openeo-staging.dataspace.copernicus.eu/openeo/1.2/jobs/j-240412a8e491496fadf086f87e73f162/results/ZGY3ZWE0NWQtZWNjNC00NTNmLThhZjktZGU4Y2ZiMTA1OGIx/f8601651e17632f8e7fa09e8403d772d?expires=1713519264"
        }
      },
      "saveresult1": {
        "process_id": "save_result",
        "arguments": {
          "data": {
            "from_node": "loadstac1"
          },
          "format": "GTiff",
          "options": {}
        },
        "result": true
      }
    }
  }
}
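
For repeated testing, the request body above can be generated for any job-results URL. A minimal sketch using only the Python standard library; the function name `load_stac_request` is hypothetical, and the code only builds the JSON payload without sending anything:

```python
import json


def load_stac_request(stac_url: str, fmt: str = "GTiff") -> dict:
    """Build the load_stac -> save_result request body shown above."""
    return {
        "process": {
            "process_graph": {
                "loadstac1": {
                    "process_id": "load_stac",
                    "arguments": {"url": stac_url},
                },
                "saveresult1": {
                    "process_id": "save_result",
                    "arguments": {
                        "data": {"from_node": "loadstac1"},
                        "format": fmt,
                        "options": {},
                    },
                    "result": True,
                },
            }
        }
    }


if __name__ == "__main__":
    # Placeholder URL; substitute a signed job-results URL that has not expired.
    payload = load_stac_request("https://example.org/results?expires=0")
    print(json.dumps(payload, indent=2))
```

The resulting payload should be usable as the body of a synchronous processing request, given a valid bearer token.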

Now only the path component of the asset href is taken into account instead of the full URL, so the scheme and host are lost.
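
To illustrate the difference, here is a small Python sketch using `urllib.parse`. The href is a made-up example on a placeholder host, and `path_only_name` is a hypothetical helper that mimics the suspected behaviour; it is not the actual code in `FileLayerProvider`.

```python
from urllib.parse import urlsplit

# Hypothetical asset href; host, job id and query string are placeholders.
HREF = (
    "https://example.org/openeo/1.2/jobs/j-123/results/assets/abc/"
    "openEO_0.nc?expires=1713519264"
)


def path_only_name(href: str, band: str) -> str:
    """Mimic the suspected bug: keep only the path component of the href."""
    return f"NETCDF:{urlsplit(href).path}:{band}"


def dropped_components(href: str) -> dict:
    """Return the URL parts that are lost when only the path is kept."""
    parts = urlsplit(href)
    return {"scheme": parts.scheme, "netloc": parts.netloc, "query": parts.query}


if __name__ == "__main__":
    print(path_only_name(HREF, "B04"))
    print(dropped_components(HREF))
```

The path-only name has the same shape as the asset name in the exception message in this report: it looks like a local file path, which presumably is why the data type cannot be determined.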

Exception while determining data type of asset NETCDF:/openeo/1.2/jobs/j-240412a8e491496fadf086f87e73f162/results/assets/

Traceback (most recent call last):
  File "/opt/openeo/lib/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/openeo/lib/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/users/auth.py", line 88, in decorated
    return f(*args, **kwargs)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/views.py", line 655, in result
    result = backend_implementation.processing.evaluate(process_graph=process_graph, env=env)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 301, in evaluate
    return evaluate(process_graph=process_graph, env=env)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 377, in evaluate
    result = convert_node(result_node, env=env)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 402, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in apply_process
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1579, in <dictcomp>
    args = {name: convert_node(expr, env=env) for (name, expr) in sorted(args.items())}
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 416, in convert_node
    return convert_node(processGraph['node'], env=env)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 402, in convert_node
    process_result = apply_process(process_id=process_id, args=processGraph.get('arguments', {}),
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1611, in apply_process
    return process_function(args=ProcessArgs(args, process_id=process_id), env=env)
  File "/opt/openeo/lib/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 2232, in load_stac
    return env.backend_implementation.load_stac(url=url, load_params=load_params, env=env)
  File "/opt/openeo/lib/python3.8/site-packages/openeogeotrellis/backend.py", line 1123, in load_stac
    pyramid = pyramid_factory.datacube_seq(projected_polygons, from_date.isoformat(), to_date.isoformat(),
  File "/usr/local/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
    return_value = get_return_value(
  File "/usr/local/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o555.datacube_seq.
: java.io.IOException: Exception while determining data type of asset NETCDF:/openeo/1.2/jobs/j-240412a8e491496fadf086f87e73f162/results/assets/ZGY3ZWE0NWQtZWNjNC00NTNmLThhZjktZGU4Y2ZiMTA1OGIx/bb6abf3c7d7e401302cc632e083882aa/openEO_0.nc:B04 in collection https://openeo-staging.dataspace.copernicus.eu/openeo/1.2/jobs/j-240412a8e491496fadf086f87e73f162/results/ZGY3ZWE0NWQtZWNjNC00NTNmLThhZjktZGU4Y2ZiMTA1OGIx/f8601651e17632f8e7fa09e8403d772d?expires=1713519264. Detailed message: Unable to determine NoData value. GDAL Exception Code: 4
    at org.openeo.geotrellis.layers.FileLayerProvider.determineCelltype(FileLayerProvider.scala:848)
    at org.openeo.geotrellis.layers.FileLayerProvider.readKeysToRasterSources(FileLayerProvider.scala:876)
    at org.openeo.geotrellis.layers.FileLayerProvider.readMultibandTileLayer(FileLayerProvider.scala:1075)
    at org.openeo.geotrellis.file.PyramidFactory.datacube(PyramidFactory.scala:134)
    at org.openeo.geotrellis.file.PyramidFactory.datacube_seq(PyramidFactory.scala:91)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: geotrellis.raster.gdal.MalformedDataTypeException: Unable to determine NoData value. GDAL Exception Code: 4
    at geotrellis.raster.gdal.GDALDataset$.$anonfun$noDataValue$1(GDALDataset.scala:313)
    at geotrellis.raster.gdal.GDALDataset$.$anonfun$noDataValue$1$adapted(GDALDataset.scala:310)
    at geotrellis.raster.gdal.GDALDataset$.errorHandler$extension(GDALDataset.scala:422)
    at geotrellis.raster.gdal.GDALDataset$.noDataValue$extension1(GDALDataset.scala:310)
    at geotrellis.raster.gdal.GDALDataset$.cellType$extension1(GDALDataset.scala:366)
    at geotrellis.raster.gdal.GDALDataset$.cellType$extension0(GDALDataset.scala:361)
    at geotrellis.raster.gdal.GDALRasterSource.$anonfun$cellType$1(GDALRasterSource.scala:91)
    at scala.Option.getOrElse(Option.scala:189)
    at geotrellis.raster.gdal.GDALRasterSource.cellType$lzycompute(GDALRasterSource.scala:91)
    at geotrellis.raster.gdal.GDALRasterSource.cellType(GDALRasterSource.scala:91)
    at org.openeo.geotrellis.layers.BandCompositeRasterSource.$anonfun$cellType$1(FileLayerProvider.scala:93)
    at cats.data.NonEmptyList.map(NonEmptyList.scala:87)
    at org.openeo.geotrellis.layers.BandCompositeRasterSource.cellType(FileLayerProvider.scala:93)
    at org.openeo.geotrellis.layers.FileLayerProvider.determineCelltype(FileLayerProvider.scala:845)
    ... 16 more

Probably introduced in https://github.com/Open-EO/openeo-geotrellis-extensions/commit/742d1d8ad66ce9d732d4a3ee525fc5d6ce17917f.

jdries commented 2 months ago

The wrong URL handling should be fixed now, but I then also found out that the STAC item metadata was not yet ideal, which resulted in more errors.

jdries commented 2 months ago

I made a variety of improvements, but I'm now stuck again on this one: https://github.com/Open-EO/openeo-geopyspark-driver/issues/762. I'm going to close this issue and defer the solution for netCDF loading to a following sprint.

bossie commented 2 months ago

A bit unclear, but this issue has been fixed. :+1: