Open-EO / openeo-gfmap

Generic framework for EO mapping applications building on openEO
Apache License 2.0

failing test tests.test_openeo_gfmap.test_cloud_masking.test_bap_quintad[Backend.TERRASCOPE] #31

Closed: soxofaan closed this issue 2 weeks ago

soxofaan commented 5 months ago

tests.test_openeo_gfmap.test_cloud_masking.test_bap_quintad[Backend.TERRASCOPE] currently fails with this in the error logs:

  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1584, in apply_process
    return process_function(args=ProcessArgs(args, process_id=process_id), env=env)
  File "/opt/venv/lib64/python3.8/site-packages/openeo_driver/ProcessGraphDeserializer.py", line 1009, in aggregate_temporal
    return data_cube.aggregate_temporal(
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 846, in aggregate_temporal
    return self._apply_to_levels_geotrellis_rdd(
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 186, in _apply_to_levels_geotrellis_rdd
    pyramid = Pyramid({
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 187, in <dictcomp>
    k: self._create_tilelayer(func(l.srdd.rdd(), k), l.layer_type if target_type==None else target_type , k)
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 847, in <lambda>
    lambda rdd, level: pysc._jvm.org.openeo.geotrellis.OpenEOProcesses().aggregateTemporal(rdd,
  File "/opt/spark3_4_0/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
    return_value = get_return_value(
  File "/opt/spark3_4_0/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o3178.aggregateTemporal.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 21.0 failed 4 times, most recent failure: Lost task 0.3 in stage 21.0 (TID 171) (epod052.vgt.vito.be executor 6): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/opt/spark3_4_0/python/lib/pyspark.zip/pyspark/worker.py", line 830, in main
    process()
  File "/opt/spark3_4_0/python/lib/pyspark.zip/pyspark/worker.py", line 822, in process
    serializer.dump_stream(out_iter, outfile)
  File "/opt/spark3_4_0/python/lib/pyspark.zip/pyspark/serializers.py", line 146, in dump_stream
    for obj in iterator:
  File "/opt/spark3_4_0/python/lib/pyspark.zip/pyspark/util.py", line 81, in wrapper
    return f(*args, **kwargs)
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/utils.py", line 54, in memory_logging_wrapper
    return function(*args, **kwargs)
  File "/opt/venv/lib64/python3.8/site-packages/epsel.py", line 44, in wrapper
    return _FUNCTION_POINTERS[key](*args, **kwargs)
  File "/opt/venv/lib64/python3.8/site-packages/epsel.py", line 37, in first_time
    return f(*args, **kwargs)
  File "/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 582, in tile_function
    result_array = result_array.transpose(*('t' ,'bands','y', 'x'))
  File "/opt/venv/lib64/python3.8/site-packages/xarray/core/dataarray.py", line 2154, in transpose
    dims = tuple(utils.infix_dims(dims, self.dims))
  File "/opt/venv/lib64/python3.8/site-packages/xarray/core/utils.py", line 726, in infix_dims
    raise ValueError(
ValueError: ('t', 'bands', 'y', 'x') must be a permuted list of ('t', 'y', 'x'), unless `...` is included
soxofaan commented 5 months ago

Did some initial digging, and I think this is what is going on: there is a UDF, https://github.com/Open-EO/openeo-gfmap/blob/main/src/openeo_gfmap/preprocessing/udf_rank.py, used in apply_neighborhood that takes a t-bands-y-x array and returns a t-y-x array (eliminating the bands dimension). Our backend code seems to expect an array that still has a bands dimension.
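
To make this concrete, here is a small standalone xarray sketch (dummy shapes and an illustrative band label, not the actual gfmap data) of how a scalar band selection drops the bands dimension and makes the backend's subsequent transpose fail:

    import numpy as np
    import xarray as xr

    # Toy stand-in for the cube handed to the UDF: dimensions (t, bands, y, x)
    array = xr.DataArray(
        np.random.rand(2, 1, 4, 4),
        dims=("t", "bands", "y", "x"),
        coords={"bands": ["S2-L2A-BAPSCORE"]},
    )

    # Selecting a single band with a scalar label drops the "bands" dimension.
    bap_score = array.sel(bands="S2-L2A-BAPSCORE")
    print(bap_score.dims)  # ('t', 'y', 'x')

    # The backend then tries to transpose back to (t, bands, y, x), which fails:
    try:
        bap_score.transpose(*("t", "bands", "y", "x"))
    except ValueError as e:
        print(e)  # ('t', 'bands', 'y', 'x') must be a permuted list of ('t', 'y', 'x'), ...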

GriffinBabe commented 2 months ago

@soxofaan @jdries This seems like an openEO issue: if I add the band dimension manually to the DataCube before running the UDF, I get this error:

openeo.metadata.DimensionAlreadyExistsException: Dimension with name 'bands' already exists

Also, the following line is executed before the UDF runs, and the job doesn't complain until it enters the UDF:

score = score.rename_labels("bands", [BAPSCORE_HARMONIZED_NAME])

kvantricht commented 2 months ago

@GriffinBabe, could you provide some more context on the issue (e.g. the full exception)?

GriffinBabe commented 2 months ago

@soxofaan An example can be found on CDSE with job-id: j-240412ff075d4b6880c6f9945298349c

The full error message is:

OpenEO batch job failed: Exception during Spark execution: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 830, in main
    process()
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 822, in process
    serializer.dump_stream(out_iter, outfile)
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 146, in dump_stream
    for obj in iterator:
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/util.py", line 81, in wrapper
    return f(*args, **kwargs)
  File "/opt/openeo/lib/python3.8/site-packages/openeogeotrellis/utils.py", line 56, in memory_logging_wrapper
    return function(*args, **kwargs)
  File "/opt/openeo/lib/python3.8/site-packages/epsel.py", line 44, in wrapper
    return _FUNCTION_POINTERS[key](*args, **kwargs)
  File "/opt/openeo/lib/python3.8/site-packages/epsel.py", line 37, in first_time
    return f(*args, **kwargs)
  File "/opt/openeo/lib/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py", line 524, in tile_function
    result_array = result_array.transpose(*('t' ,'bands','y', 'x'))
  File "/opt/openeo/lib/python3.8/site-packages/xarray/core/dataarray.py", line 2154, in transpose
    dims = tuple(utils.infix_dims(dims, self.dims))
  File "/opt/openeo/lib/python3.8/site-packages/xarray/core/utils.py", line 726, in infix_dims
    raise ValueError(
ValueError: ('t', 'bands', 'y', 'x') must be a permuted list of ('t', 'y', 'x'), unless '...' is included

Basically, my datacube is expected to have the "bands" dimension when entering the UDF, but it doesn't. It's a strange result, as I even tried loading more than one band (SCL and B04) as a first attempt to fix the issue, but that didn't work.

I then tried to add a dimension: https://github.com/Open-EO/openeo-gfmap/blob/main/src/openeo_gfmap/preprocessing/cloudmasking.py#L186

But then I get the following error from the Python client:

openeo.metadata.DimensionAlreadyExistsException: Dimension with name 'bands' already exists

It's already strange to have no bands dimension at the entrance of the UDF, but on top of that there seems to be a mismatch between the backend's and the client's representation of the datacube.
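
As far as I can tell, that client exception comes purely from the client's own metadata tracking: after rename_labels the client-side cube metadata still lists a "bands" dimension, so a later add_dimension("bands", ...) is rejected before anything is sent to the backend. A rough sketch of that check, assuming the openeo Python client's CollectionMetadata behaviour (the metadata dict here is illustrative, not the actual gfmap cube):

    from openeo.metadata import CollectionMetadata

    # Client-side view of the cube after rename_labels: the metadata still
    # lists a "bands" dimension, even though the array that actually arrives
    # in the UDF has none.
    metadata = CollectionMetadata({
        "cube:dimensions": {
            "bands": {"type": "bands", "values": ["S2-L2A-BAPSCORE"]},
        },
    })

    # add_dimension is validated against this client-side metadata alone,
    # before any process graph reaches the backend.
    try:
        metadata.add_dimension(name="bands", label="S2-L2A-BAPSCORE", type="bands")
    except Exception as e:
        print(type(e).__name__, e)
        # DimensionAlreadyExistsException: Dimension with name 'bands' already exists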

JeroenVerstraelen commented 2 months ago

@GriffinBabe The error that you show comes from this line in the openEO backend code: result_array = result_array.transpose(*('t', 'bands', 'y', 'x')), where result_array refers to the output of your udf_rank.py UDF.

apply_neighborhood expects your output to have the dimensions ('t', 'bands', 'y', 'x'), but this line in the UDF creates a DataArray with only ('t', 'y', 'x') dimensions: bap_score = array.sel(bands="S2-L2A-BAPSCORE")

To solve it, you can use this line instead: bap_score = array.sel(bands=["S2-L2A-BAPSCORE"])
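
In plain xarray terms (standalone sketch with dummy data), selecting with a list keeps bands as a length-1 dimension, so the transpose that apply_neighborhood applies to the UDF output succeeds:

    import numpy as np
    import xarray as xr

    array = xr.DataArray(
        np.random.rand(2, 1, 4, 4),
        dims=("t", "bands", "y", "x"),
        coords={"bands": ["S2-L2A-BAPSCORE"]},
    )

    # List-based selection keeps "bands" as a length-1 dimension instead of dropping it.
    bap_score = array.sel(bands=["S2-L2A-BAPSCORE"])
    print(bap_score.dims)   # ('t', 'bands', 'y', 'x')

    # The transpose the backend applies to the UDF output now works.
    result = bap_score.transpose(*("t", "bands", "y", "x"))
    print(result.shape)     # (2, 1, 4, 4)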