truth-quark opened 1 day ago
@uchchwhash @jeremyh are you aware of any reasons against using `hdf5plugin` for the `compression` arg?
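For context, a minimal sketch of what that would look like with h5py (the Blosc filter here is just an example, not necessarily what the pipeline uses everywhere):

```python
import h5py
import hdf5plugin
import numpy as np

with h5py.File("example.h5", "w") as f:
    # hdf5plugin's filter classes expand to the compression /
    # compression_opts keyword pair, so no raw filter id needs
    # to be hardcoded anywhere.
    f.create_dataset("data", data=np.arange(1000), **hdf5plugin.Blosc())
```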
`hdf5plugin` is what we use already; the comment higher up mentions it indirectly: `# The previous import of wagl should have initialised the filters.`
In `wagl` we import it here: https://github.com/OpenDataCubePipelines/ard-pipeline/blob/e841eb271a1bf5050394ca6365abb4302ecb8cc8/wagl/__init__.py#L3
So somehow it is no longer being included implicitly by the `wagl` import?
Ah, I see — it's the compression id number that's the problem, not the import itself. Perhaps something has changed in a library upgrade and the hardcoded number is no longer accepted?
I never liked the "import to register plugins automatically" method, so that does sound better to me in theory .... but if a raw number doesn't work, does `wagl` itself work? I can see the blosc int used here: https://github.com/OpenDataCubePipelines/ard-pipeline/blob/b53ca81476f3cf4e96769a462fce95b3d5bd5383/wagl/hdf5/compression.py#L258
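For reference, a minimal sketch of the raw-number approach (assuming blosc's id in the HDF Group filter registry, 32001); it only works if something, here the `wagl` import, has already registered the filter with libhdf5:

```python
import h5py
import numpy as np
import hdf5plugin  # noqa: F401 - imported only for its registration side effect

BLOSC_FILTER_ID = 32001  # blosc's id in the HDF Group filter registry

with h5py.File("raw-id.h5", "w") as f:
    # h5py treats an unrecognised integer as a dynamically loaded filter id;
    # if nothing has registered blosc, this is where things fall over.
    f.create_dataset("data", data=np.arange(1000),
                     compression=BLOSC_FILTER_ID)
```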
> So somehow it is no longer being included implicitly by the `wagl` import?
I'm not sure of the specifics - it's buried somewhere & I haven't delved into it.
> Ah, I see: it's the number that's the problem, not the import itself.
Removing the `int` code & `compression` arg allowed the test to pass. Also, using `hdf5plugin` for the compression arg brought in more error handling. Deliberately breaking the code with `compression=hdf5plugin` & `compression=hdf5plugin.Blosc` (no func call) raised exceptions, so better than a core dump (some failures crashed the conda env).
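A small reproduction of those failure modes, as I understand them (the exact exception types will depend on the h5py version):

```python
import h5py
import hdf5plugin
import numpy as np

data = np.arange(1000)

with h5py.File("demo.h5", "w") as f:
    # The working form: an *instantiated* filter object.
    f.create_dataset("ok", data=data, **hdf5plugin.Blosc())

    # The deliberately broken forms fail with a Python exception
    # rather than a core dump.
    for bad in (hdf5plugin, hdf5plugin.Blosc):
        try:
            f.create_dataset("bad", data=data, compression=bad)
        except Exception as exc:
            print(f"{bad!r} -> {type(exc).__name__}: {exc}")
```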
> .... but if a raw number doesn't work, does `wagl` itself work? I can see the blosc int used here:
I haven't run `wagl` yet to find out definitively. In the meantime, I'll experiment with changing `compression=<plugin>` across the codebase to run that through the GitHub actions.
> .... but if a raw number doesn't work, does `wagl` itself work? I can see the blosc int used here: https://github.com/OpenDataCubePipelines/ard-pipeline/blob/b53ca81476f3cf4e96769a462fce95b3d5bd5383/wagl/hdf5/compression.py#L258
> ... I'll experiment with changing `compression=<plugin>` across the codebase ...
With some `grep` work, I can't find any evidence in the code that the `blosc` filter is being used. Compression mostly defaults to `LZF` filters.

Is `blosc` compression used in any production runs? If not, is there any issue with modifying the `check-environment` script to use the default `LZF`?
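For what it's worth, `LZF` ships with h5py itself, so the default path needs no third-party filter registration at all. A minimal check:

```python
import h5py
import numpy as np

# LZF is bundled with h5py, so this works without importing hdf5plugin
# or registering any external filter.
with h5py.File("lzf-demo.h5", "w") as f:
    f.create_dataset("data", data=np.arange(1000), compression="lzf")
```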
On my local system, after installing `conda` & building the environment as per the README, running `deployment/check-environment.sh` failed here: https://github.com/OpenDataCubePipelines/ard-pipeline/blob/ea84164ab0cd4fb13188984bfd4332d7eb6b59e1/deployment/check-environment.sh#L55
The output:
Modifying `check-environment.sh`'s embedded python to use `hdf5plugin` directly solves the core dump:
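Roughly this kind of change; a sketch only, assuming the embedded check writes a blosc-compressed dataset, not the script's actual contents:

```python
import h5py
import hdf5plugin
import numpy as np

# Instead of a hardcoded filter id, let hdf5plugin supply the
# compression keywords; importing it also registers the filter.
with h5py.File("/tmp/check.h5", "w") as f:
    f.create_dataset("check", data=np.arange(100), **hdf5plugin.Blosc())
print("blosc write via hdf5plugin: OK")
```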