OpenDataCubePipelines / ard-pipeline

Processing satellite imagery into Analysis Ready Data (ARD)
Apache License 2.0

Failure & coredump `check-environment.sh` using `blosc` filter #59

Open truth-quark opened 1 day ago

truth-quark commented 1 day ago

On my local system, after installing conda & building the environment as per the README, running deployment/check-environment.sh failed here:

https://github.com/OpenDataCubePipelines/ard-pipeline/blob/ea84164ab0cd4fb13188984bfd4332d7eb6b59e1/deployment/check-environment.sh#L55

The output:

# ... other tests passing

Attempting hdf5 blosc compression...python3: /home/conda/feedstock_root/build_artifacts/libnetcdf_1702228817425/work/plugins/H5Zblosc.c:160: blosc_set_local: Assertion `nelements == 7' failed.
./ard-pipeline/deployment/check-environment.sh: line 63: 121505 Aborted                 (core dumped) python3 <<EOF

# rest of embedded python code printed to terminal

Modifying check-environment.sh's embedded Python to use hdf5plugin directly solves the core dump:

# The previous import of wagl should have initialised the filters.
print("Attempting hdf5 blosc compression...", end='', flush=True)
import h5py
import hdf5plugin  # for filter plugins
import tempfile

f = h5py.File(tempfile.mktemp('-test.h5'), 'w')

# FIX: pass a plugin object as the compression arg instead of a raw filter id (int)
dset = f.create_dataset("myData", (100, 100), compression=hdf5plugin.Blosc())
f.close()
truth-quark commented 1 day ago

@uchchwhash @jeremyh are you aware of any reasons against using hdf5plugin for the compression arg?

jeremyh commented 1 day ago

hdf5plugin is what we use already — the comment higher up mentions it indirectly: # The previous import of wagl should have initialised the filters.

In wagl we import it here: https://github.com/OpenDataCubePipelines/ard-pipeline/blob/e841eb271a1bf5050394ca6365abb4302ecb8cc8/wagl/__init__.py#L3

So somehow it is no longer being included implicitly by the wagl import?
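
[Editor's note] One way to check whether the wagl import actually leaves the filter registered is to query h5py's low-level filter registry. A minimal sketch (assuming blosc's registered HDF5 filter id of 32001; this check is not part of the repo):

import h5py
import wagl  # imports hdf5plugin at module level, which should register the plugins

BLOSC_FILTER_ID = 32001  # The HDF Group's registered filter id for blosc
print("blosc filter registered:", h5py.h5z.filter_avail(BLOSC_FILTER_ID))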

jeremyh commented 1 day ago

Ah, I see — it's the compression id number that's the problem, not the import itself. Perhaps something has changed in a library upgrade and the hardcoded number is no longer accepted?

I never liked the "import to register plugins automatically" method, so that does sound better to me in theory .... but if a raw number doesn't work, does wagl itself work? I can see the blosc int used here: https://github.com/OpenDataCubePipelines/ard-pipeline/blob/b53ca81476f3cf4e96769a462fce95b3d5bd5383/wagl/hdf5/compression.py#L258
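
[Editor's note] For context, a minimal sketch of the two calling styles (assumptions: h5py >= 3.0, which accepts a filter object for the compression argument; blosc's registered filter id of 32001; and a seven-element compression_opts tuple, going by the `nelements == 7` assertion above):

import tempfile

import h5py
import hdf5plugin  # registers the blosc filter plugin on import

with h5py.File(tempfile.mktemp('-blosc-test.h5'), 'w') as f:
    # Raw-integer style (as in wagl/hdf5/compression.py): the caller supplies the
    # filter id plus the full cd_values tuple; a short or missing tuple is a
    # plausible way to hit the assertion in H5Zblosc.c.
    f.create_dataset(
        "raw_int", (100, 100),
        compression=32001,
        compression_opts=(0, 0, 0, 0, 4, 1, 1),  # reserved x4, level, shuffle, compressor
    )
    # Plugin-object style: hdf5plugin builds and validates those options itself.
    f.create_dataset("plugin_obj", (100, 100), compression=hdf5plugin.Blosc())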

truth-quark commented 1 day ago

So somehow it is no longer being included implicitly by the wagl import?

I'm not sure of the specifics - it's buried somewhere & I haven't delved into it.

Ah, I see — it's the number that's the problem, not the import itself.

Removing the int code & compression arg allowed the test to pass. Also, using hdf5plugin for the compression arg brought in more error handling: deliberately breaking the code with compression=hdf5plugin and compression=hdf5plugin.Blosc (no function call) raised exceptions, which is better than a core dump (some failures crashed the conda env).
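
[Editor's note] As an illustration of that point, a hedged sketch of how the embedded check could surface such failures as a clean non-zero exit (argument errors are raised at the Python level before reaching the C filter, so they are catchable; an abort inside the filter itself is not):

import sys
import tempfile

import h5py
import hdf5plugin

print("Attempting hdf5 blosc compression...", end='', flush=True)
try:
    with h5py.File(tempfile.mktemp('-test.h5'), 'w') as f:
        f.create_dataset("myData", (100, 100), compression=hdf5plugin.Blosc())
except Exception as exc:
    print(f"failed: {exc!r}")
    sys.exit(1)
print("ok")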

truth-quark commented 7 hours ago

.... but if a raw number doesn't work, does wagl itself work? I can see the blosc int used here:

https://github.com/OpenDataCubePipelines/ard-pipeline/blob/b53ca81476f3cf4e96769a462fce95b3d5bd5383/wagl/hdf5/compression.py#L258

I haven't run wagl yet to find out definitively.

In the meantime, I'll experiment with changing compression=<plugin> across the codebase to run that through the GitHub actions.

truth-quark commented 3 hours ago

.... but if a raw number doesn't work, does wagl itself work? I can see the blosc int used here: https://github.com/OpenDataCubePipelines/ard-pipeline/blob/b53ca81476f3cf4e96769a462fce95b3d5bd5383/wagl/hdf5/compression.py#L258

... I'll experiment with changing compression=<plugin> across the codebase ...

After some grep work, I can't find any evidence in the code that the blosc filter is actually being used; compression mostly defaults to the LZF filter.

Is blosc compression used in any production runs?

If not, is there any issue with modifying the check-environment script to use the default LZF?
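
[Editor's note] If that's acceptable, the embedded check could look roughly like this (a sketch using h5py's built-in LZF filter, which ships with h5py and needs no external plugin):

print("Attempting hdf5 LZF compression...", end='', flush=True)
import tempfile
import h5py

with h5py.File(tempfile.mktemp('-test.h5'), 'w') as f:
    f.create_dataset("myData", (100, 100), compression="lzf")
print("ok")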