HDFGroup / vol-rest

HDF5 REST VOL Connector

Support for H5D_ReadMulti/WriteMulti #38

Closed jreadey closed 11 months ago

jreadey commented 1 year ago

Create a design plan for implementing this functionality in the REST VOL. Things to check:

mattjala commented 1 year ago

Does the VOL interface support this?

Yes, read/write multi requests from the HDF5 API have all their information passed down to the VOL in a way that enables it to perform the multi operations. Right now, the VOL just fails with an "unsupported" message if more than one dataset is provided.

Are there synchronization locks that would force the call to be as slow as a set of serial requests?

The DAOS (Distributed Asynchronous Object Storage) VOL exists and works fine, so it is at least possible for VOLs to implement asynchronous operations, despite any locks that might exist in the HDF5 library.

Can curl be invoked multiple times from one process?

Yes, curl multi handles allow multiple transfers from a single thread.

What is the parallelization strategy? Thread pool, async curl requests, etc.

Since the main bottleneck for performance with the REST VOL is almost always going to be waiting for requests to the server, having the requests be performed asynchronously with curl should be sufficient for a pretty major speed increase.

The current plan is to set up a curl multi handle upon the invocation of a read_multi/write_multi request, give it the information about the individual read/write requests to each dataset, start performing all the requests, and wait for them all to finish before proceeding.

mattjala commented 11 months ago

Implemented in #39