This fixes some issues we've encountered when exporting certain types of data to hdf5. The unit-tests have been scaled down to reproduce the issues with minimal data, but the same issues are encountered with the default_chunk_size on larger datasets.
I suspect this is due to a combination of chunk_size and the number of missing/masked values in a particular chunk.
I do not know the exact origin of the errors at this point, so the tests have rather generic names.
Each of the unit-tests raises a different error, which is why I created separate tests rather than one in which the amount of data is varied.
Exporting the same data under the same conditions (i.e. the same chunk_size) to arrow or parquet format works just fine.
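To make the suspicion about chunk_size and missing values more concrete, here is a minimal, library-independent sketch of how the same data can end up with a different number of missing values per chunk depending on the chunk size. It is not the export code itself: `iter_chunks` and `missing_per_chunk` are hypothetical helpers written for illustration, and `None` stands in for a missing/masked value.

```python
from typing import Iterator, List, Optional, Sequence, Tuple


def iter_chunks(
    values: Sequence[Optional[float]], chunk_size: int
) -> Iterator[Tuple[int, Sequence[Optional[float]]]]:
    """Yield (start_index, chunk) pairs of at most chunk_size items each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(values), chunk_size):
        yield start, values[start:start + chunk_size]


def missing_per_chunk(
    values: Sequence[Optional[float]], chunk_size: int
) -> List[Tuple[int, int, int]]:
    """Return (start_index, chunk_length, missing_count) for every chunk."""
    return [
        (start, len(chunk), sum(v is None for v in chunk))
        for start, chunk in iter_chunks(values, chunk_size)
    ]


# Hypothetical data: None marks a missing/masked value.
data = [1.0, None, None, 4.0, 5.0, None, 7.0]

for chunk_size in (2, 3, 7):
    print(chunk_size, missing_per_chunk(data, chunk_size))
```

With a chunk size of 3 the first chunk is mostly missing, while with a chunk size of 2 every full chunk has exactly one missing value. This is the kind of variation that could explain why each test hits a different error, although that remains an assumption at this point.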
Checklist:
[x] make unit-tests
[x] make tests pass
[x] (optional) rename tests with more informative/relevant names