BlueBrain / HighFive

HighFive - Header-only C++ HDF5 interface
https://bluebrain.github.io/HighFive/
Boost Software License 1.0

feature: support Multi-dataset I/O #625

Open roblatham00 opened 2 years ago

roblatham00 commented 2 years ago

Is your feature request related to a problem? Please describe.

The HDF5 library is close to releasing (in HDF5-1.13.3) "multi-dataset I/O", a feature that lets users describe or schedule accesses to several datasets, then fire them all off at once.

The interface is described in the "RFC: New HDF5 API Routines for HPC Applications Read/Write Multiple Datasets in an HDF5 file" document, and has changed quite a bit over the years (it's been in the "proposed" state for close to a decade). The read routine currently looks like this:

 herr_t H5Dread_multi(size_t count, 
    hid_t dset_id[], 
    hid_t mem_type_id[], 
    hid_t mem_space_id[],
    hid_t file_space_id[], 
    hid_t dxpl_id,
    void *buf[] /*out*/);

(The write routine looks similar, except that 'buf' is const and is not an output parameter.)

Describe the solution you'd like

A way to collect multiple slices and send them to a "write_multi" routine in a single call.
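To make the request a bit more concrete, here is a minimal sketch of what "collect, then fire" could look like. It is not HighFive API and does not call HDF5: `Id`, `MultiWriteFn` and `WriteBatch` are hypothetical names invented for illustration, and `Id` is a stand-in for `hid_t` so that the snippet compiles without the HDF5 headers. The callback has the same parameter layout as the multi-dataset write routine described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for HDF5's hid_t (assumption: a 64-bit integer handle).
using Id = std::int64_t;

// Hypothetical alias with the same parameter layout as the multi-dataset
// write routine: parallel arrays of ids, one shared transfer property list.
using MultiWriteFn = std::function<int(std::size_t count,
                                       Id dset_id[],
                                       Id mem_type_id[],
                                       Id mem_space_id[],
                                       Id file_space_id[],
                                       Id dxpl_id,
                                       const void* buf[])>;

// Hypothetical helper: queues one entry per dataset slice and then
// submits all queued entries with a single call.
class WriteBatch {
public:
    // Queue one write. The caller keeps ownership of `data` and of all ids,
    // and must keep them valid until flush() has returned.
    void add(Id dset, Id mem_type, Id mem_space, Id file_space,
             const void* data) {
        dsets_.push_back(dset);
        mem_types_.push_back(mem_type);
        mem_spaces_.push_back(mem_space);
        file_spaces_.push_back(file_space);
        bufs_.push_back(data);
    }

    std::size_t size() const { return dsets_.size(); }

    // Pass the parallel arrays to `write_multi`. The queue is cleared only
    // if the callback reports success (a non-negative return value).
    int flush(Id dxpl, const MultiWriteFn& write_multi) {
        assert(bufs_.size() == dsets_.size());
        if (dsets_.empty()) {
            return 0;  // nothing queued, nothing to do
        }
        const int status = write_multi(dsets_.size(), dsets_.data(),
                                       mem_types_.data(), mem_spaces_.data(),
                                       file_spaces_.data(), dxpl,
                                       bufs_.data());
        if (status >= 0) {
            dsets_.clear();
            mem_types_.clear();
            mem_spaces_.clear();
            file_spaces_.clear();
            bufs_.clear();
        }
        return status;
    }

private:
    std::vector<Id> dsets_, mem_types_, mem_spaces_, file_spaces_;
    std::vector<const void*> bufs_;
};
```

In a real build the callback would presumably be a thin lambda around the HDF5 write routine, with the ids taken from the corresponding HighFive objects. Whether that can be done cleanly is the question raised under the alternatives.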

Describe alternatives you've considered

Is it possible to extract the necessary bits of information from HighFive objects so that the C API can be called directly alongside HighFive?

Additional context

The HDF Group has an RFC describing the benefits of this approach: https://www.hdfgroup.org/wp-content/uploads/2022/08/H5HPC_MultiDset_RW_IO_RFC_v7_20220523.pdf

The 1.13 series is classified as "experimental", and the interface might change (again) before stabilizing in the 1.14 release (currently scheduled for January 2023, but I am guessing that will slip).

1uc commented 2 years ago

Thank you, this looks very interesting. It is also relevant to a use case we have.