climet-eu / compression-lab-notebooks

Notebooks for the Online Compression Laboratory for Climate Science and Meteorology
https://compression.lab.climet.eu
Creative Commons Attribution 4.0 International

Zfp compressor: WASM guest runtime error #8

Closed: SF-N closed this issue 2 months ago

SF-N commented 2 months ago

Running the Zfp compressor on the example data from here leads to the following error:

ds = utils.open_dataset(dataset_path)
da = []
da.append(ds["t"].sel(dict(step="00:00:00")).drop_vars("valid_time"))

transform_compressor = []
da_transform = {}
stats_transform = {}

for i, d in enumerate(da):
    transform_compressor.append(
        [
            fcbench.codecs.Log(),
            fcbench.codecs.Zfp(mode="fixed-accuracy", tolerance=1e-3),
        ]
    )
    da_transform[d.name], stats_transform[d.name] = (
        fcbench.compressor.compute_dataarray_compress_decompress(
            d, transform_compressor[i]
        )
    )
    print(f"{da[i].long_name}" + ":")
    display(
        utils.format_compress_stats(transform_compressor[i], stats_transform[d.name])
    )
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[14], line 13
      5 for i, d in enumerate(da):
      6     transform_compressor.append(
      7         [
      8             fcbench.codecs.Log(),
      9             fcbench.codecs.Zfp(mode="fixed-accuracy", tolerance=1e-3),
     10         ]
     11     )
     12     da_transform[d.name], stats_transform[d.name] = (
---> 13         fcbench.compressor.compute_dataarray_compress_decompress(
     14             d, transform_compressor[i]
     15         )
     16     )
     17     print(f"{da[i].long_name}" + ":")
     18     display(
     19         utils.format_compress_stats(transform_compressor[i], stats_transform[d.name])
     20     )

File /lib/python3.11/site-packages/xarray/core/dataarray.py:1163, in DataArray.compute(self, **kwargs)
   1144 """Manually trigger loading of this array's data from disk or a
   1145 remote source into memory and return a new array. The original is
   1146 left unaltered.
   (...)
   1160 dask.compute
   1161 """
   1162 new = self.copy(deep=False)
-> 1163 return new.load(**kwargs)

File /lib/python3.11/site-packages/xarray/core/dataarray.py:1137, in DataArray.load(self, **kwargs)
   1119 def load(self, **kwargs) -> Self:
   1120     """Manually trigger loading of this array's data from disk or a
   1121     remote source into memory and return this array.
   1122 
   (...)
   1135     dask.compute
   1136     """
-> 1137     ds = self._to_temp_dataset().load(**kwargs)
   1138     new = self._from_temp_dataset(ds)
   1139     self._variable = new._variable

File /lib/python3.11/site-packages/xarray/core/dataset.py:853, in Dataset.load(self, **kwargs)
    850 chunkmanager = get_chunked_array_type(*lazy_data.values())
    852 # evaluate all the chunked arrays simultaneously
--> 853 evaluated_data = chunkmanager.compute(*lazy_data.values(), **kwargs)
    855 for k, data in zip(lazy_data, evaluated_data):
    856     self.variables[k].data = data

File /lib/python3.11/site-packages/xarray/core/daskmanager.py:70, in DaskManager.compute(self, *data, **kwargs)
     67 def compute(self, *data: DaskArray, **kwargs) -> tuple[np.ndarray, ...]:
     68     from dask.array import compute
---> 70     return compute(*data, **kwargs)

File /lib/python3.11/site-packages/dask/base.py:661, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    658     postcomputes.append(x.__dask_postcompute__())
    660 with shorten_traceback():
--> 661     results = schedule(dsk, keys, **kwargs)
    663 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File /lib/python3.11/site-packages/xarray/core/parallel.py:269, in map_blocks.<locals>._wrapper(func, args, kwargs, arg_is_array, expected)
    259 """
    260 Wrapper function that receives datasets in args; converts to dataarrays when necessary;
    261 passes these to the user function `func` and checks returned objects for expected shapes/sizes/etc.
    262 """
    264 converted_args = [
    265     dataset_to_dataarray(arg) if is_array else arg
    266     for is_array, arg in zip(arg_is_array, args)
    267 ]
--> 269 result = func(*converted_args, **kwargs)
    271 # check all dims are present
    272 missing_dimensions = set(expected["shapes"]) - set(result.sizes)

RuntimeError: WASM guest raised an error

The data is available at https://faubox.rrze.uni-erlangen.de/dl/fiVNDKPexNmtq3rdC4rXDf/t_n1_ml91.grib.

juntyr commented 2 months ago

I'm working on exposing the inner errors at the moment; next week we should have a more informative message.

juntyr commented 2 months ago

The new error is:

Exception: Log does not support non-positive (negative or zero) floating point data

The above exception was the direct cause of the following exception:
Exception: WASM guest raised an error

The temperature data you load in has negative values, but you then apply a ln(x) transform to it.
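
A quick way to confirm this (a sketch, assuming the `da` list from the reproduction snippet above):

# minimum value of the temperature array; a non-positive result explains the Log failure
print(float(da[0].min()))
# alternatively, check directly whether any value is <= 0
print(bool((da[0] <= 0).any()))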

We can instead first convert the temperature to Kelvin and use:

[
    # transform Celsius to Kelvin
    fcbench.codecs.FixedOffsetScale(offset=-273.15, scale=1.0),
    fcbench.codecs.Log(),
    fcbench.codecs.Zfp(mode="fixed-accuracy", tolerance=1e-3),
]
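
Plugged into the loop from the reproduction above, this would look roughly as follows (a minimal sketch; everything except the extra FixedOffsetScale codec is unchanged from the original snippet):

for i, d in enumerate(da):
    transform_compressor.append(
        [
            # transform Celsius to Kelvin so the data is strictly positive before Log
            fcbench.codecs.FixedOffsetScale(offset=-273.15, scale=1.0),
            fcbench.codecs.Log(),
            fcbench.codecs.Zfp(mode="fixed-accuracy", tolerance=1e-3),
        ]
    )
    # compress and decompress the array, collecting statistics as before
    da_transform[d.name], stats_transform[d.name] = (
        fcbench.compressor.compute_dataarray_compress_decompress(
            d, transform_compressor[i]
        )
    )
    print(f"{da[i].long_name}" + ":")
    display(
        utils.format_compress_stats(transform_compressor[i], stats_transform[d.name])
    )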