Closed: djhoese closed this PR 3 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 96.05%. Comparing base (6860030) to head (5e27be4). Report is 400 commits behind head on main.
| Totals | |
| --- | --- |
| Change from base Build 10404388488: | 0.001% |
| Covered Lines: | 52408 |
| Relevant Lines: | 54506 |
This avoids a bug in dask or cloudpickle that alters the state of the pyhdf `SDS` object in some way, making it unusable. The dask PR that started triggering this was https://github.com/dask/dask/pull/11320. My guess is that the repeated pickling/serialization loses some hidden state in the pyhdf `SDS` object, after which pyhdf or HDF-C no longer knows how to work with it.
We could register `SDS` with `normalize_token` in dask, or we could just do what I do in this PR and come up with the token/name ourselves. Note this is the name for the dask array/task. This is similar to work done in the past by @mraspaud for the HDF5 utility package to make sure names are consistent across file variable loads.
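For context, a minimal sketch of what the `normalize_token` registration alternative might look like. This is not the code in this PR, and it assumes the metadata returned by `SDS.info()` is enough to identify the dataset, which may not hold since (as far as I can tell) the `SDS` object does not expose its source file path:

```python
# Hypothetical sketch: teach dask how to tokenize pyhdf SDS objects
# without pickling them. Not the approach taken in this PR.
from dask.base import normalize_token
from pyhdf.SD import SDS


@normalize_token.register(SDS)
def _normalize_sds(sds):
    # info() returns (name, rank, dims, type, n_attrs) for the dataset.
    # Without the file path this could collide across files, which is one
    # reason to prefer building the name ourselves from the path.
    return ("pyhdf.SD.SDS", *sds.info())
```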
One alternative to the PR as it is right now would be to copy `from_sds` as a method on the HDF utility class so it knows how to use `self.filename` automatically for the tokenizing. Also, I'm realizing maybe `src_path` shouldn't be required, since it is only used if the `name` kwarg isn't provided. Thoughts @mraspaud @gerritholl ?
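To make that alternative concrete, here is a rough sketch of `from_sds` as a method on the file handler class; the class structure and names are illustrative, not the exact satpy code:

```python
# Rough sketch, not the actual satpy code: a file handler method that
# derives the dask task name from self.filename, so callers no longer
# need to pass src_path themselves.
import dask.array as da
from dask.base import tokenize


class HDF4FileHandler:
    def __init__(self, filename):
        self.filename = filename

    def from_sds(self, var, *args, name=None, **kwargs):
        # (dtype/shape handling from the existing helper omitted for brevity)
        if name is None:
            var_name = var.info()[0]  # variable name stored in the HDF4 file
            # Deterministic name: repeated loads of the same variable from
            # the same file get the same dask task name, and dask never has
            # to pickle the SDS object just to tokenize it.
            name = f"{var_name}-{tokenize(self.filename, var_name)}"
        return da.from_array(var, *args, name=name, **kwargs)
```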