jreadey closed 1 year ago
My initial test actually showed a slight slowdown when I tried it on the ATL03 file: 25.3 minutes with get_chunk_info, 26.8 minutes with chunk_iter.
Try with the cloud-optimized version of that file.
Are you talking about using a paging strategy with a page size of 4-8 MB? The ATL file uses the H5F_FSPACE_STRATEGY_FSM_AGGR strategy, and the documentation says the strategy and page size are immutable for an already created file. If there are external tools that can alter this, I'm not aware of them.
Implemented in HDFGroup/h5pyd#148
Use H5Dchunk_iter rather than chunk_info in https://github.com/HDFGroup/h5pyd/blob/master/h5pyd/_apps/utillib.py. Testing shows that this can give a speedup of over 500x for datasets with a large number of chunks. Verify the hdf5lib version and fall back to chunk_info if H5Dchunk_iter is not available.
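For illustration, a rough sketch of the fallback described above. This is not the code from utillib.py or from the linked PR. The function name list_chunks is hypothetical, and dsid is assumed to behave like an h5py DatasetID (dset.id). Instead of comparing library version numbers, the sketch simply checks whether a chunk_iter method is present, on the assumption that h5py only exposes it when the underlying HDF5 library provides H5Dchunk_iter.

```python
def list_chunks(dsid):
    """Collect (chunk_offset, byte_offset, size) for every chunk of a dataset.

    `dsid` is assumed to look like an h5py DatasetID (i.e. `dset.id`).
    `list_chunks` is a hypothetical helper name used only for this sketch.
    """
    chunks = []

    if hasattr(dsid, "chunk_iter"):
        # Preferred path: one pass over the chunk index via H5Dchunk_iter.
        def visit(info):
            chunks.append((info.chunk_offset, info.byte_offset, info.size))
            # Returning None lets the iteration continue.

        dsid.chunk_iter(visit)
    else:
        # Fallback path: one chunk-info lookup per chunk index.
        for i in range(dsid.get_num_chunks()):
            info = dsid.get_chunk_info(i)
            chunks.append((info.chunk_offset, info.byte_offset, info.size))

    return chunks
```

With h5py this would be called as list_chunks(dset.id) on an open chunked dataset. Both paths produce the same tuples, so the caller does not need to know which one was used.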