Open hppritcha opened 5 years ago
Previously I was describing how we might want to abort "all connected processes" for a session. However, we have not found a good way to define that for MPI processes that have references to multiple sessions.
Note: the error handler MPI_ERRORS_ABORT now acts only on the local MPI process (see issue #12).
By symmetry, should this mythical MPI_Session_abort function act only on the local MPI process? That would make it equivalent to MPI_ABORT(MPI_COMM_SELF), except that we don't have MPI_COMM_SELF in the Sessions model, so we'd have to get the user to create one by creating a comm from the "mpi://SELF" process set. Thus, it could be a useful short-hand for:
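int MPI_Session_abort(MPI_Session session, int errorCode) {
    MPI_Group myTempGroup;
    MPI_Comm mySelfComm;
    /* MPI 4.0 names: MPI_Group_from_session_pset and the "mpi://SELF" pset. */
    MPI_Group_from_session_pset(session, "mpi://SELF", &myTempGroup);
    /* The string tag "session_abort" is an arbitrary placeholder. */
    MPI_Comm_create_from_group(myTempGroup, "session_abort", MPI_INFO_NULL,
                               MPI_ERRORS_RETURN, &mySelfComm);
    MPI_Group_free(&myTempGroup);
    MPI_Abort(mySelfComm, errorCode);
    return errorCode;  /* only reached if MPI_Abort returns */
}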
Suggestion: we should drop this - at least for now.
Discussed in the Sessions WG 7/25/22 meeting. Dan recalled the discussion of scope: group, universe, session (connected processes). We now have a definition of "all connected processes" for sessions, so this functionality may be necessary for FT/Sessions. Keep it if the FT WG wants a session scope for error handling; otherwise we'll drop it.
Put the mpi-5 label back on, per discussion at the 08/01/22 Sessions WG meeting.
At the WG meeting today, Dan described the usefulness of an MPI_Session_abort-like function.