Closed 1uc closed 9 months ago
Interesting design. As I understand you are implementing a collective reader already, so in Neurodamus we would need basically to call make_collective_reader
and use it as argument in calls to the config which return storage. I think this is fine.
Nice! If I get this correct, we can just pull out the collective read interface into it's own library?
Yes, almost certainly, because mpi4py is a build dependency; and one can't have optional build dependencies (nicely).
This PR has been heavily reworked since the previous round of discussions. Eventually it will only include the Hdf5Reader
API. The two commits that refactor libsonata have been split into their own PRs. The implementation of an MPI-IO reader has been moved to a separate PR (until it can be moved to its own repository).
The questions for review are:
getAttribute
and similar? Currently, I can only think of collective
which would be used to specify if the call is collective. Things like block size, I fell would go into the reader.Hdf5Reader
.The next review would include question like:
libsonata[mpi]
? The advantage is that libsonata
can control the version of libsonata_mpi
it's compatible with, which is useful if the API of Hdf5Reader
needs to change.libsonata_mpi.make_collective_reader
in libsonata
? The advantage is that one could then always do libsonata.make_collective_reader
which would always succeed. If libsonata_mpi
is available it will return a collective reader, otherwise it'll return the default reader.With the above, code written to use collective IO, should automatically run even if the user can't install mpi4py
or libsonata_mpi
. Therefore, users can avoid the pain of installing the difficult dependencies; and devs don't need to worry about making it easy for their users.
Last chance for comments @matz-e / @ferdonline; otherwise I'll approve this at the end of the day.
@mgeplf we're in a reasonably good place to hold off on merging this. With all the preliminary work out of the way, we'll not suffer too much from merge conflicts and we can work on multiple other things (like cleaning up a bit more after #319).
We only need this in once we have it fully working and tested in neurodamus. Until then retaining the freedom to change details in might be nice.
The idea is that one injects call-backs for reading selections from datasets. These callbacks implement:
The default calls
_readSelection
. Separate readers can implement variants suitable for their purpose, e.g. MPI-IO. The advantage is that the reader has control over both the required collective semantics; and setting HDF5 properties. This allows some readers to have non-collective behaviour, such as short-circuiting for empty selections. While others implement strictly collective reading of datasets.Additionally, we can create a suite of readers for different usecases, e.g. MPI-IO for neurodamus, aggregating reader for serial workloads with GPFS (and a separate one for LUSTER if need be), the default implementation because it minimizes number of bytes read.