HDFGroup / h5pyd

h5py distributed - Python client library for HDF Rest API

hsload inefficient for zero-filled datasets #35

Open jreadey opened 6 years ago

jreadey commented 6 years ago

hsload doesn't inspect chunks before writing them to the server. As a result, the server needlessly allocates chunks and the file size increases.

hsload should inspect each chunk and skip the write if the chunk is all zeros (or whatever the fill value is).
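
A minimal sketch of the check described above, not hsload's actual code. The names `is_fill_chunk`, `copy_chunks`, and `chunk_rows` are hypothetical, and the sketch assumes the data is copied in blocks along the first axis and that the fill value defaults to zero when none is given. It works on NumPy arrays and should work the same way on h5py-style datasets that support slice reads and writes.

```python
import numpy as np

def is_fill_chunk(arr, fillvalue=None):
    """Return True if every element of arr equals the fill value (assumed 0 if None)."""
    arr = np.asarray(arr)
    if arr.size == 0:
        return True
    if fillvalue is None:
        fillvalue = 0  # assumption: datasets without an explicit fill value use zero
    # NaN never compares equal to itself, so handle a NaN fill value separately.
    if np.issubdtype(arr.dtype, np.floating) and np.isnan(fillvalue):
        return bool(np.all(np.isnan(arr)))
    return bool(np.all(arr == fillvalue))

def copy_chunks(src, dst, chunk_rows, fillvalue=None):
    """Copy src to dst block by block along axis 0, skipping fill-only blocks.

    Returns the number of blocks actually written.
    """
    written = 0
    nrows = src.shape[0]
    for start in range(0, nrows, chunk_rows):
        stop = min(start + chunk_rows, nrows)
        block = src[start:stop]
        if is_fill_chunk(block, fillvalue):
            continue  # nothing to store: the target already reads back as the fill value
        dst[start:stop] = block
        written += 1
    return written
```

For a mostly empty source, only the blocks that contain real data are written, so the server would have no reason to allocate the rest.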