mpi-forum / mpi-issues

Tickets for the MPI Forum
http://www.mpi-forum.org/

persistent vector collectives and MPI_Request_free are confusing or contain a contradiction #688

Closed jeffhammond closed 1 year ago

jeffhammond commented 1 year ago

Problem

As best I can tell from the following, it is legal to call MPI_Request_free - a local operation - on an active persistent all-to-all-V/W operation, and then free the vector arguments.

I do not see how this is implementable without copying all the vector arguments, but avoiding exactly that is the whole point of the requirement that they not be freed until the operation is in a state to not need them anymore.

§3.7

For a request representing a nonblocking point-to-point or a persistent point-to-point operation, it is permitted (although strongly discouraged) to call MPI_REQUEST_FREE when the request is active. In this special case, MPI_REQUEST_FREE will only mark the request for freeing and MPI will actually do the freeing stage of the associated operation later.

Takeaway: _it is permitted to call MPI_REQUEST_FREE on the request associated with a persistent point-to-point operation._
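
To make the quoted "mark for freeing" behavior concrete, here is a minimal sketch in plain C. It is a toy model, not MPI: `toy_request`, `toy_request_start`, `toy_request_free`, `toy_progress_complete`, and `toy_released` are hypothetical names invented for illustration, and nothing here claims to reflect how any real implementation is written. It only shows the shape of the rule: the free call is local, it nulls the user's handle, and the actual release waits until the operation completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy request; this is NOT an MPI type. */
typedef struct {
    bool active;          /* operation started but not yet complete */
    bool free_requested;  /* user asked for the free while it was active */
} toy_request;

/* Counts requests that have actually been released (illustration only). */
static int toy_released = 0;

/* Create a request that is already active, as after a start call. */
static toy_request *toy_request_start(void) {
    toy_request *r = malloc(sizeof *r);
    assert(r != NULL);
    r->active = true;
    r->free_requested = false;
    return r;
}

/* Local call: never waits. On an active request it only marks it. */
static void toy_request_free(toy_request **handle) {
    toy_request *r = *handle;
    *handle = NULL;                /* analogous to setting MPI_REQUEST_NULL */
    if (r->active) {
        r->free_requested = true;  /* the freeing stage is deferred */
    } else {
        free(r);
        toy_released++;
    }
}

/* Called by the (toy) library once the operation finally completes. */
static void toy_progress_complete(toy_request *r) {
    r->active = false;
    if (r->free_requested) {
        free(r);                   /* the deferred freeing stage happens here */
        toy_released++;
    }
}
```

In this model the library still owns, and may still read, everything the operation depends on between `toy_request_free` and `toy_progress_complete`. That window is what matters for the vector arguments discussed below.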

§6.12

It is erroneous to call MPI_REQUEST_FREE or MPI_CANCEL for a request associated with a nonblocking collective operation. Nonblocking collective requests created using the APIs described in this section are not persistent. However, persistent collective requests can be created using persistent collective operations described in Sections 6.13 and 8.8.

§6.13

Initialization calls for MPI persistent collective operations are non-local and follow all the existing rules for collective operations, in particular ordering; programs that do not conform to these restrictions are erroneous. After initialization, all arrays associated with input arguments (such as arrays of counts, displacements, and datatypes in the vector versions of the collectives) must not be modified until the corresponding persistent request is freed with MPI_REQUEST_FREE.

Takeaway: _MPI_REQUEST_FREE must occur before the vector arguments of a persistent collective can be freed._

Here is example code that I believe to be legal. It frees the vector arguments while the operation is still active on the odd ranks, and it seems destined to fail if the implementation has not copied the vector arguments: the odd ranks will no longer have access to them when the even ranks get around to making progress on the operation (assuming the common active-target progress model).

...
MPI_Datatype *stypes = ..;
MPI_Datatype *rtypes = ..;
MPI_Alltoallw_init(sendbuf, sendcounts, sdispls, stypes, recvbuf, recvcounts, rdispls, rtypes, comm, MPI_INFO_NULL, &request);
MPI_Start(&request);
if (rank % 2 == 0) {
  sleep(60);
  MPI_Wait(&request, MPI_STATUS_IGNORE); // even ranks complete the operation; the persistent request becomes inactive
}
MPI_Request_free(&request); // does not complete the operation, because it is local; the request is still active on odd ranks
free(stypes);
free(rtypes);

Proposal

It is a very bad idea to allow MPI_REQUEST_FREE on an active nonblocking or persistent collective. We should disallow it.

§3.7 is the place that needs to change.

Changes to the Text

TODO

Impact on Implementations

I have no idea how implementations are supporting the above, unless they are copying all of the vector arguments, which they are loath to do.
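
To illustrate what "copying the vector arguments" would mean, here is a minimal sketch in plain C. It is a toy model under stated assumptions, not MPI: `toy_coll`, `toy_coll_init`, `toy_coll_total`, and `toy_coll_destroy` are hypothetical names, and the single `counts` array stands in for the counts, displacements, and datatype arrays of a vector collective. The `copy` flag selects between borrowing the user's array and taking a private copy at initialization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a persistent vector collective; NOT an MPI type. */
typedef struct {
    const int *counts;  /* what the operation reads while it makes progress */
    int *owned;         /* non-NULL only if a private copy was taken */
    int n;
} toy_coll;

/* Initialize the toy operation, either borrowing or copying the array. */
static toy_coll toy_coll_init(const int *counts, int n, bool copy) {
    toy_coll op = { counts, NULL, n };
    if (copy) {
        op.owned = malloc((size_t)n * sizeof *op.owned);
        assert(op.owned != NULL);
        memcpy(op.owned, counts, (size_t)n * sizeof *op.owned);
        op.counts = op.owned;  /* later progress reads the private copy */
    }
    return op;
}

/* Late progress: sum the counts the operation would use right now. */
static int toy_coll_total(const toy_coll *op) {
    int total = 0;
    for (int i = 0; i < op->n; i++) {
        total += op->counts[i];
    }
    return total;
}

/* Release the private copy, if there is one. */
static void toy_coll_destroy(toy_coll *op) {
    free(op->owned);
    op->owned = NULL;
    op->counts = NULL;
}
```

If the caller overwrites its array after initialization (a well-defined stand-in for freeing it), the borrowing variant sees the overwritten values on late progress, while the copying variant still sees the originals. The price of the copying variant is one extra allocation and copy per array per initialization, which is the cost the text in §6.13 appears designed to avoid.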

Impact on Users

Fixing this the right way is backwards incompatible, but persistent collectives are not widely implemented and are therefore unlikely to be used much. Furthermore, it is likely that users of these features are not calling MPI_REQUEST_FREE on active requests.

References and Pull Requests