devreal commented 1 year ago

Problem

The text for MPI_Abort is not clear on whether a call to MPI_Abort is allowed to return. The text states that:

This routine makes a “best attempt” to abort all MPI processes in the group of comm.

Does that mean that even the calling process is not required to be aborted? Or that some processes not calling MPI_Abort may survive?

This function does not require that the invoking environment take any action with the error code. However, a Unix or POSIX environment should handle this as a return errorcode from the main program.

Not sure if this text gives more information, because what is "the invoking environment" actually?

Curiously, MPI_Abort has an integer return type which suggests that it may return. But what is the application supposed to do then? And what may be returned from MPI_Abort?

As for why this is bad, consider the following example:

int foo(int x) {
  if (x >= 0) {
    return do_something_useful(x);
  }
  // error case, go up in flames
  MPI_Abort(MPI_COMM_WORLD, 1);
  abort(); // make sure there will be flames
}

The abort() call is needed both because we don't know whether MPI_Abort might return and because the compiler will complain about a missing return from the function.

For comparison: C abort returns void and explicitly never returns. Thus, stdlib.h has the following definition for abort:

extern void abort (void) __attribute__ ((__noreturn__));

Proposal

Clearly state whether MPI_Abort may ever return or not.

Changes to the Text

My preference: add a sentence that states

A call to MPI_Abort does not return.

Alternatively, clearly state that MPI_Abort may return, in which cases, and what users are allowed to do then (probably not call any MPI functions anymore).

Impact on Implementations

If we disallow MPI_Abort to ever return, implementations may annotate the function signature with __attribute__ ((__noreturn__)) if supported. Changing the return type will require changes in implementations.

Impact on Users

Users don't have to care about what happens if MPI_Abort ever returns. The additional abort() call in the example is not needed.

References and Pull Requests

jeffhammond commented 1 year ago

Yes, MPI_Abort should be allowed to return. If I call MPI_Abort(MPI_COMM_WORLD, 1) because MPI_Barrier failed due to the network cables being unplugged, MPI_Abort is definitely not going to do its job on MPI_COMM_WORLD.

In this case, MPI_Abort should return an error so that the user can call email_vendor_tech_support("my computer is hosed").

jeffhammond commented 1 year ago

I think after MPI_Abort fails, one should be allowed to call things like MPI_Comm_{rank,size} and MPI_Error_*, as well as to call MPI_Abort again, presumably with a different communicator. Of course, all of these may return errors due to MPI being in a bad state.

devreal commented 1 year ago

I would argue that if I call MPI_Abort I tell MPI that a) I cannot continue because of an application error, and b) I expect MPI to take care of ending my process. Otherwise I would find a more graceful way of shutting down. And if MPI can only kill the calling process then there is nothing I as a user can do differently. Eventually, the other processes will come down when they realize that I am gone (or my node is gone from the network).

I think this is a question for the FT folks (@abouteiller @wesbland).

jeffhammond commented 1 year ago

I object to the suggestion that MPI_Abort is synonymous with erroneous termination of MPI. Yes, that is one of the primary use cases, but not the only one.

I use MPI_Abort for asynchronous termination, including successful asynchronous termination. OpenSHMEM has shmem_global_exit, which allows one PE to stop the application by itself. I implemented this in OSHMPI using MPI_Abort. I did the same in MARPN because it was not practical to synchronize the active processes in order to call MPI_Finalize.

I haven't thought about it much and might be wrong, but if I'm using dynamic processes, and I want a subset of processes to shut down in the same way that I can startup an additional set of processes, how else can I allow a subset to terminate than MPI_Abort? It is the only termination function that takes a communicator argument.

devreal commented 1 year ago

I use MPI_Abort for asynchronous termination, including successful asynchronous termination.

Fair point, I see the use for that. Regardless, the same applies: you expect your process to end (none of the callsites in MARPN checks the return value of MPI_Abort ;))

I tried to find users of MPI_Abort and pretty much every place I found assumes that MPI_Abort does not return. An incomplete list (hand-picked, with obvious class work, test suites, and MPI implementations filtered out): A++, AMREX, HDF5, and LAMMPS (which also calls exit() after MPI_Abort in some cases). This of course is not a comprehensive study, but it suggests that most applications expect MPI_Abort to not return and/or want to kill the current process anyway. At the very least, the standard does not make it clear that MPI_Abort may in fact return (which to me is absolutely counter-intuitive).

I haven't thought about it much and might be wrong, but if I'm using dynamic processes, and I want a subset of processes to shut down in the same way that I can startup an additional set of processes, how else can I allow a subset to terminate than MPI_Abort? It is the only termination function that takes a communicator argument.

If you cannot send them a message asking them to disconnect and exit gracefully then MPI_Abort could be an option. There is no guarantee (in the standard) that this won't take down your whole job though...

RolfRabenseifner commented 1 year ago

The text in MPI-4.0 page 514, lines 55-37 is very clear. Especially the first line:

This routine makes a "best attempt" to abort all MPI processes in the group of comm.

@devreal asks

Curiously, MPI_Abort has an integer return type which suggests that it may return. But what is the application supposed to do then? And what may be returned from MPI_Abort?

The answer is not obvious, but very simple: If the application has an error handler with "errors return" active, and the input argument comm is invalid (MPI_COMM_NULL or a value copy of a former communicator handle that has already been freed with MPI_COMM_FREE), then the routine may return an error code of the error class MPI_ERR_COMM.

I hope that this answers your major question.

In other words, your text proposal

A call to MPI_Abort does not return.

is therefore not correct, but a clarification may be helpful because the discussion here in the MPI forum could not resolve the question within 3 days. A possibility may be to add after MPI-4.0 page 514, line 37:

If the call does not fail (for example due to an invalid \mpiarg{comm} argument), i.e., if the internally determined return code (\mpiarg{ierror} in Fortran) from the routine would be \mpi_const{MPI_SUCCESS}, then the call will not return because the calling process is aborted.

@devreal This answers also the question behind your code example in the description: Yes, the abort() is needed, for the case "errors return" is active and for example MPI_ERR_UNKNOWN would be returned. With default error handling, it is not needed.

abouteiller commented 1 year ago

From all the usages I have ever seen, it is always assumed by end users that the procedure doesn't return. I will go on and verify the actual text; in particular, the case of intercommunicators may be more muddied.

I do agree that abort is not only used for error handling. It is a useful tool for that, but it can be used for a variety of reasons.

The best-effort aspect normally refers to aborting -only- the processes of comm (as opposed to aborting -all- processes), but the text may be unclear about that(?)

Long way to say that what Rolf proposes is reasonable and I am happy to discuss this during the Errh/FT WG and flesh something out accordingly.

jprotze commented 1 year ago

From all the usages I have ever seen, it is always assumed by end users that the procedure doesn't return. I will go on and verify the actual text; in particular, the case of intercommunicators may be more muddied.

Since many codes rely on the default error-handler, this assumption will hold for these codes. MPI_Abort should only ever return in case of an error which will terminate the application with the default error-handler.

Among the codes linked by @devreal only A++ seems to handle return codes from MPI functions. Without looking closer into the codes, I would assume that the other codes rely on the default error-handler.

jeffhammond commented 1 year ago

I agree with this proposal, without the example:

If the call does not fail, i.e., if the internally determined return code (\mpiarg{ierror} in Fortran) from the routine would be \mpi_const{MPI_SUCCESS}, then the call will not return because the calling process is aborted.

I strenuously object to adding C11 _Noreturn to MPI_Abort because it is UB to return from such a function. MPI_Abort has a return code and there is at least one case where it will return an error.

And yes, this means that some codes will write MPI_Abort(..); abort(); or MPI_Abort(..,rc); exit(rc); if they want to adorn an MPI termination function wrapper with _Noreturn.
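The wrapper pattern described above can be sketched in plain C. This is only an illustration under stated assumptions: mock_mpi_abort is a hypothetical stand-in for MPI_Abort(comm, errorcode) that always reports failure, so the fallback path is visible without an MPI installation; die and do_something_useful are likewise made-up names.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for MPI_Abort(comm, errorcode). Like the real
   routine it has an int return type; in this sketch it always "fails"
   and returns a nonzero code instead of terminating the process. */
static int mock_mpi_abort(int errorcode) {
    (void)errorcode;
    return 1; /* assumed error code, not a real MPI error class */
}

/* Termination wrapper. _Noreturn (C11) is safe here only because the
   wrapper itself guarantees termination via abort(), even when the
   abort routine it calls first comes back with an error. */
static _Noreturn void die(int errorcode) {
    int rc = mock_mpi_abort(errorcode);
    fprintf(stderr, "abort attempt returned %d, calling abort()\n", rc);
    abort();
}

/* Placeholder for the useful work in the original example. */
static int do_something_useful(int x) {
    return 2 * x;
}

int foo(int x) {
    if (x >= 0) {
        return do_something_useful(x);
    }
    /* Error case: the compiler knows die() never returns, so it does
       not complain about a missing return value here. */
    die(1);
}
```

With a real MPI library, the call to mock_mpi_abort would be replaced by MPI_Abort with a communicator argument. The _Noreturn annotation sits on the user's wrapper rather than on MPI_Abort itself, which keeps the MPI routine free to return an error code.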

@devreal

none of the callsites in MARPN checks the return value of MPI_Abort ;)

I was aware of the machine-specific behavior of MPI_Abort on Blue Gene/Q, and thus assumed that when I wrote that machine-specific library 😉

If you cannot send them a message asking them to disconnect and exit gracefully then MPI_Abort could be an option. There is no guarantee (in the standard) that this won't take down your whole job though...

I do not want to have to implement an active-message system (i.e. poll for a message containing the terminate command) in order to do asynchronous termination.

Yes, MPI_Abort may abort MPI_COMM_WORLD no matter its argument, as Blue Gene/Q did, but I assume that a high-quality implementation of dynamic processes would include MPI_Abort that only aborts the given communicator.

devreal commented 8 months ago

Closing this. There are good reasons for MPI_ABORT to return.

bosilca commented 8 months ago

No, there are exactly 0 valid reasons for it to return. From a user perspective, calling this function signals an intent to end all processes that are part of the group(s) associated with the communicator (and this always includes the local process). It would be desirable for the function to succeed globally, but the local process should be gone in all cases.

Why are we opening the door to allowing implementations to report NOT_IMPLEMENTED while still claiming full MPI support?

jeffhammond commented 8 months ago

MPI_Abort is not in Table 11.1, so if it is called before initialization when MPI_ERRORS_RETURN is set as the default error handler, the implementation should return the error associated with incorrect usage of an MPI function and not terminate the program.

While this is not a likely scenario, it is as likely as implementations reporting NOT_IMPLEMENTED while still claiming full MPI support.

devreal commented 8 months ago

While this is not a likely scenario, it is as likely as implementations reporting NOT_IMPLEMENTED while still claiming full MPI support.

What is this NOT_IMPLEMENTED you speak of? I don't see that anywhere in the standard.

MPI_Abort is not in Table 11.1, so if it is called before initialization when MPI_ERRORS_RETURN is set as the default error handler, the implementation should return the error associated with incorrect usage of an MPI function and not terminate the program.

That is a choice to make here. If I tell MPI that I have encountered a fatal error then the last thing I want is for MPI to come back to me and say "something is wrong, try again". All that's left is to call system abort(), which the MPI implementation may do as well. The trade-off here is between guarantees for well-formed programs and extra workload for non-compliant applications. If you are non-compliant in telling MPI to abort MPI should still just abort...

bosilca commented 8 months ago

@jeffhammond I see your argument, but calling MPI_Abort before MPI_Init would limit the scope of the call to either NULL, SELF or WORLD communicators (only the predefined). NULL and SELF sound like a bad use case to support, and we can handle them in the same way as WORLD. For WORLD it makes no difference if the MPI_Abort does the abort on WORLD or if it returns an error that will trigger the default handler (which will also abort the entire WORLD). So, overall it makes sense to add MPI_ABORT to the list of functions that can be called before MPI_INIT, in which case it will abort the WORLD.

@devreal any error that can be returned, such as MPI_ERR_ARG or MPI_ERR_INTERN.

mpi-forum / mpi-issues

Is MPI_Abort allowed to return? #670
