HDFGroup / hdf5

Official HDF5® Library Repository
https://www.hdfgroup.org/

Chunk preallocation outside of dataset resize #3229

Open · chacha21 opened this issue 1 year ago

chacha21 commented 1 year ago

I use a chunked dataset for incremental I/O. When I need to append data, I enlarge the dataset. Chunks are automatically allocated as needed.

For performance, since I approximately know how much data will be appended, I would like to pre-allocate chunks so that enlarging the dataset will be (hopefully) cheaper.

But I could not find any API to pre-allocate chunks; would extending the dataset and then shrinking it back at least keep the extra chunks allocated?

mattjala commented 1 year ago

There isn't currently any way to pre-allocate chunks before expanding a dataset. Extending and then shrinking the dataset won't leave anything allocated - if its extent is decreased, chunks outside the new boundaries are cleaned up at the time the dataset is resized.

Although if you're using the default settings for a chunked dataset, then space should be allocated incrementally upon writes, not upon the resize itself (see H5Pset_alloc_time) and so H5Dset_extent shouldn't take very long, at least not for that reason.
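
For reference, a minimal sketch of checking and changing the allocation time on a dataset creation property list (identifiers such as `dcpl` and the 4 MB chunk size are illustrative):

```c
#include "hdf5.h"

void show_alloc_time(void)
{
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    hsize_t chunk_dims[1] = {4 * 1024 * 1024};   /* e.g. 4 MB chunks of bytes */
    H5Pset_chunk(dcpl, 1, chunk_dims);

    /* Default for chunked datasets: incremental allocation, i.e. a chunk is
     * allocated when it is first written, not when H5Dset_extent is called. */
    H5D_alloc_time_t alloc_time;
    H5Pget_alloc_time(dcpl, &alloc_time);        /* should be H5D_ALLOC_TIME_INCR */

    /* Allocation can instead be forced to dataset-creation time, but that
     * only covers the chunks within the *initial* dims: */
    H5Pset_alloc_time(dcpl, H5D_ALLOC_TIME_EARLY);

    H5Pclose(dcpl);
}
```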

If allocating more data is taking too long, one workaround would be to create the dataset's dataspace with its initial dims equal to maxdims, and keep track of the current extent yourself. This would perform all the allocation at dataset creation, instead of doing additional allocations as more data is written.
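
A minimal sketch of that workaround for a 1-D byte dataset (the name "backing_store" and the use of H5D_ALLOC_TIME_EARLY / H5D_FILL_TIME_NEVER are illustrative choices here, not required parts of the suggestion):

```c
#include "hdf5.h"

hid_t create_preallocated_dataset(hid_t file, hsize_t max_bytes)
{
    hsize_t dims[1]    = {max_bytes};              /* initial dims == maxdims  */
    hsize_t maxdims[1] = {max_bytes};              /* no room to grow later    */
    hsize_t chunk[1]   = {4 * 1024 * 1024};

    hid_t space = H5Screate_simple(1, dims, maxdims);

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk);
    H5Pset_alloc_time(dcpl, H5D_ALLOC_TIME_EARLY); /* allocate all chunks now   */
    H5Pset_fill_time(dcpl, H5D_FILL_TIME_NEVER);   /* skip writing fill values  */

    hid_t dset = H5Dcreate2(file, "backing_store", H5T_NATIVE_UCHAR, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Pclose(dcpl);
    H5Sclose(space);
    return dset;  /* the caller tracks the logical end-of-data offset itself */
}
```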

chacha21 commented 1 year ago

Thanks for the clarification

> There isn't currently any way to pre-allocate chunks before expanding a dataset. Extending and then shrinking the dataset won't leave anything allocated - if its extent is decreased, chunks outside the new boundaries are cleaned up at the time the dataset is resized.

That's what I thought; I don't know where it should be documented, but I did not find it.

> Although if you're using the default settings for a chunked dataset, then space should be allocated incrementally upon writes, not upon the resize itself (see H5Pset_alloc_time) and so H5Dset_extent shouldn't take very long, at least not for that reason.

Good catch, and the doc seems pretty clear. In my case I see that chunk allocation is deferred to the actual writing. But during a bulk write session it makes no difference, because the writes always occur just after the H5Dset_extent, so new chunks are still being allocated at a high rate. My bulk writes can be hundreds of MB, so using a larger chunk size (I already boosted it to 4 MB) is not a solution. For other uses of my datasets, an even larger chunk size would be sub-optimal.
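
For context, the append pattern in question looks roughly like this sketch for a 1-D byte dataset (helper name and error handling are illustrative):

```c
#include "hdf5.h"

static herr_t append_bytes(hid_t dset, hsize_t cur_size,
                           const void *buf, hsize_t nbytes)
{
    /* Grow the dataset to make room for the new data. */
    hsize_t new_size[1] = {cur_size + nbytes};
    if (H5Dset_extent(dset, new_size) < 0)
        return -1;

    /* Select the newly added region and write into it. With the default
     * incremental allocation, the new chunks are allocated here, i.e.
     * immediately after the resize. */
    hid_t   filespace = H5Dget_space(dset);
    hsize_t start[1]  = {cur_size};
    hsize_t count[1]  = {nbytes};
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);

    hid_t  memspace = H5Screate_simple(1, count, NULL);
    herr_t status   = H5Dwrite(dset, H5T_NATIVE_UCHAR, memspace, filespace,
                               H5P_DEFAULT, buf);

    H5Sclose(memspace);
    H5Sclose(filespace);
    return status;
}
```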

> If allocating more data is taking too long, one workaround would be to create the dataset's dataspace with its initial dims equal to maxdims, and keep track of the current extent yourself.

Yes, but I use an unlimited maxdims (H5S_UNLIMITED), because even if I roughly know how much will be written during one bulk session, my dataset has no strict boundary and can grow again later. So I cannot resort to this trick.

You could argue that even with an unbounded dataset D of current size S, I could prepare a bulk session of approximately B bytes by extending D to S+B, then use a custom size indicator during the bulk, then truncate once the bulk is finished if I did not exceed B bytes. But my scenario won't allow that so easily. In fact, I use the HDF5 dataset as the storage backend of an SQLite database through the SQLite "VFS" feature, which works through callbacks. The dataset must behave like a file, and introducing a "custom size indicator" during bulk sessions (big SQL inserts) would be a lot of code, easy to get wrong. Since SQLite has no feature to preallocate storage space, I wanted to rely on HDF5 chunk preallocation instead.
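
For completeness, the "reserve ahead, truncate after" idea above might look like the following sketch (hypothetical helpers; note that with the default incremental allocation the reserve only grows the logical extent, and shrinking frees any chunks beyond the new boundary, as discussed earlier):

```c
#include "hdf5.h"

/* Reserve room for an expected bulk of ~expected_bytes on top of the current
 * logical size, and return the reserved extent. */
hsize_t bulk_begin(hid_t dset, hsize_t logical_size, hsize_t expected_bytes)
{
    hsize_t reserved[1] = {logical_size + expected_bytes};
    H5Dset_extent(dset, reserved);          /* grow to S + B up front */
    return reserved[0];
}

/* Once the bulk is finished, shrink the dataset back to the bytes actually
 * written; the caller must have tracked logical_size itself meanwhile. */
void bulk_end(hid_t dset, hsize_t logical_size)
{
    hsize_t final_size[1] = {logical_size};
    H5Dset_extent(dset, final_size);
}
```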

[edit] It may or may not be related to this thread, but my "bulk big writes" scenario also clashes with the chunk cache system and involves useless copies in memory (to the cache) before committing to storage (because the cache stays full during the bulk). A "write-through cache" option might be useful, and could be related to advanced chunk management (different from H5D_WRITE_CHUNK, H5D_READ_CHUNK): https://github.com/HDFGroup/hdf5/issues/3230
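
Not the write-through mode suggested above, but one existing per-dataset knob is H5Pset_chunk_cache on the dataset access property list. A sketch, assuming a 0-byte cache makes every chunk too large to cache so accesses should go directly to the file:

```c
#include "hdf5.h"

hid_t open_without_chunk_cache(hid_t file, const char *name)
{
    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);

    /* 0 bytes of chunk cache for this dataset: chunks never fit in the
     * cache, so reads/writes should bypass it entirely. */
    H5Pset_chunk_cache(dapl, H5D_CHUNK_CACHE_NSLOTS_DEFAULT, 0,
                       H5D_CHUNK_CACHE_W0_DEFAULT);

    hid_t dset = H5Dopen2(file, name, dapl);
    H5Pclose(dapl);
    return dset;
}
```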

mattjala commented 1 year ago

Chunk preallocation fits within the design of HDF5. I'll leave this open for now as an indication of interest in the feature, though it's not planned for implementation in the immediate future.