Closed DerNils-git closed 1 year ago
In my opinion
MPI_Comm
is a resource to be treated based on RAII.
I don't agree on this point. Resource management should be symmetrical. Since MPI is initialized outside of openPMD, it must be managed and destroyed outside of openPMD. openPMD is generally used as one of many MPI-aware components in an MPI code. In such workflows, it is not openPMD's job to manage MPI, but the application's job. This is the typical behavior of application libraries that are MPI aware, HDF5 and ADIOS2 both behave in a similar way.
Note that openPMD is an application library and not a framework, and as such it is not its job to govern resources outside of what it allocates and needs for itself.
What is the behavior if the communicator that was given to the series is released before
series.flush()
is called?
Undefined, generally a segmentation fault. Many more equivalent problems would be caused if openPMD silently finalized MPI upon destruction.
The situation might be a bit different if the MPI_Communicator
itself were a RAII-managed resource that could be properly moved into openPMD. But it does not have these semantics (which I am subjectively glad about, as RAII for nontrivial tasks often causes many problems with respect to error handling) and I don't want to try outsmarting the user / MPI here.
I agree. OpenPMD should definitely not finalize or initialize MPI itself. Thats the applications responsibility.
The situation might be a bit different if the
MPI_Communicator
itself were a RAII-managed resource that could be properly moved into openPMD. But it does not have these semantics (which I am subjectively glad about, as RAII for nontrivial tasks often causes many problems with respect to error handling) and I don't want to try outsmarting the user / MPI here.
That is actually what I meant. E.g. it would be possible to duplicate the communicator on construction of the series and free the duplicated communicator on destruction of the series. This still leaves move and copy assignments to be managed. But it is a fair point trying to follow the behavior of libraries like HDF5 and ADIOS2.
If openPMD is build using MPI a
Series
can be created with aMPI_Comm
.In my opinion
MPI_Comm
is a resource to be treated based on RAII. Looking into the HDF5 implementationMPI_Comm
is simply stored withinstd::optional<MPI_Comm>
. Therefore I guess no ownership of the given communicator is taken.What is the behavior if the communicator that was given to the series is released before
series.flush()
is called? Could this lead to hard to trace error messages or is this behavior captured somehow?This is merely a detail but could get important if a implementors are not careful about used resources.