mpiwg-sessions / sessions-issues

Do we need a Sessions-world equivalent of MPI_Abort? #11

Open hppritcha opened 5 years ago

hppritcha commented 5 years ago

At the WG meeting today, Dan described the usefulness of an MPI_Sessions_abort-like function.

dholmes-epcc-ed-ac-uk commented 5 years ago

Previously I was describing how we might want to abort "all connected processes" for a session. However, we have not found a good way to define that for MPI processes that have references to multiple sessions.

Note: the error handler MPI_ERRORS_ABORT now acts only on the local MPI process (see issue #12).

By symmetry, should this mythical MPI_Session_abort function act only on the local MPI process? That would make it equivalent to MPI_ABORT(MPI_COMM_SELF), except that we don't have MPI_COMM_SELF in the Sessions model, so we'd have to get the user to create one by creating a comm from the "mpi://SELF" process set. Thus, it could be a useful shorthand for:

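int MPI_Session_abort(MPI_Session session, int errorCode) {
    MPI_Group myTempGroup;
    MPI_Comm mySelfComm;
    /* Build a group holding only the calling process. MPI-4.0 names this
       call MPI_Group_from_session_pset and reserves the pset "mpi://SELF". */
    MPI_Group_from_session_pset(session, "mpi://SELF", &myTempGroup);
    /* MPI-4.0 signature: group, string tag, info, error handler, new comm.
       The tag "session-abort" is an arbitrary, illustrative choice. */
    MPI_Comm_create_from_group(myTempGroup, "session-abort", MPI_INFO_NULL,
                               MPI_ERRORS_RETURN, &mySelfComm);
    MPI_Group_free(&myTempGroup);
    MPI_Abort(mySelfComm, errorCode);
    return errorCode; /* only reached if MPI_Abort returns */
}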

Suggestion: we should drop this - at least for now.

hppritcha commented 2 years ago

Discussed at the Sessions WG meeting on 7/25/22. Dan recalled the earlier discussion of scope: group, universe, session (connected processes). We now have a definition of all connected processes for sessions, so this functionality may be necessary for FT/Sessions. Keep if the FT WG wants a session scope for error handling; otherwise we'll drop it.

hppritcha commented 2 years ago

Put the mpi-5 label back on, per discussion at the 08/01/22 Sessions WG meeting.