mpi-forum / mpi-forum-historic

Migration of old MPI Forum Trac Tickets to GitHub. New issues belong on mpi-forum/mpi-issues.
http://www.mpi-forum.org

Persistent Collective Operations for MPI 4.0 #466

Open mpiforumbot opened 8 years ago

mpiforumbot commented 8 years ago

Originally by tony on 2014-12-10 19:35:07 -0600


This ticket proposes non-blocking persistent collective operations be added to the standard.

For each API that currently supports a request as an out argument of a non-blocking collective operation, MPI will support an _init version.

Example: MPI_Bcast() [blocking, MPI-1] -> MPI_Ibcast(...,request[,ierr]) [MPI 3.x] -> MPI_Bcast_init(...,request[,ierr]) [new]

The attached text is prototypical standards draft text.

Some quick notes:

MPI persistent collective operation _inits are initialization calls and follow all the existing ordering rules for posting nonblocking collective operations (this implies _init operations are local and non-blocking). Remember that the point-to-point init functions disallow communication, whereas the persistent collective inits may communicate but must not rely on the state of other MPI processes.

The request argument is an out argument in each process of the calling group of the communicator that can be used zero or more times to start a collective operation. Each such request must be started in all the processes of the underlying group of the communicator (they are deemed active after this). Each request must be completed (making them inactive) before another start is permitted. Each process in the group of the communicator must complete the operation.

We require that starts across a group for persistent collective operations serialize to the same order. This makes using MPI_STARTALL erroneous in certain situations. We hope to remove this restriction in the future with a separate ticket.

MPI_Test/MPI_Wait operations (all) give completion information without destroying the persistent request. MPI_Request_free() is used as with persistent point-to-point operations to free up an inactive persistent collective operation. This is the same behavior as for persistent point-to-point requests.
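
The request lifecycle described in these notes can be sketched as a small state machine. This is only a toy model in plain C, not MPI code: the type `toy_request` and the functions `toy_init`, `toy_start`, `toy_wait`, and `toy_free` are hypothetical names invented for illustration, and the model ignores communication and the cross-process ordering rules entirely. It assumes only what the notes state: a request is created inactive, a start makes it active, completion makes it inactive again without destroying it, and only an inactive request is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states for a toy persistent request; these are not MPI types. */
typedef enum { REQ_INACTIVE, REQ_ACTIVE, REQ_FREED } req_state;

typedef struct {
    req_state state;
    int starts; /* how many times the operation has been started */
} toy_request;

/* Models an _init call: the request is created inactive. */
toy_request toy_init(void) {
    toy_request r = { REQ_INACTIVE, 0 };
    return r;
}

/* Models a start: permitted only while the request is inactive. */
bool toy_start(toy_request *r) {
    assert(r != NULL);
    if (r->state != REQ_INACTIVE) return false;
    r->state = REQ_ACTIVE;
    r->starts++;
    return true;
}

/* Models completion via wait/test: the request becomes inactive
   again but is not destroyed, so it can be restarted. */
bool toy_wait(toy_request *r) {
    assert(r != NULL);
    if (r->state == REQ_FREED) return false;
    r->state = REQ_INACTIVE;
    return true;
}

/* Models freeing the request: in this toy model only an inactive
   request may be freed. */
bool toy_free(toy_request *r) {
    assert(r != NULL);
    if (r->state != REQ_INACTIVE) return false;
    r->state = REQ_FREED;
    return true;
}
```

In this model a second `toy_start` on an active request is rejected until `toy_wait` has returned, which mirrors the rule that each request must be completed before another start is permitted.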

mpiforumbot commented 8 years ago

Originally by tony on 2014-12-10 19:36:49 -0600


Attachment added: coll-text-10dec14.pdf (536.0 KiB) This is the initial standards text for persistent collective

mpiforumbot commented 8 years ago

Originally by tony on 2015-05-09 10:52:03 -0500


Small update to text to reflect the "init operations are local" logic.

mpiforumbot commented 8 years ago

Originally by tony on 2015-05-09 10:55:31 -0500


Attachment added: coll-text-03mar15.pdf (541.1 KiB) March 3 update of the text

mpiforumbot commented 8 years ago

Originally by dholmes on 2015-05-09 16:20:46 -0500


Attachment added: coll - 9 May 2015.pdf (537.7 KiB) Ready for formal reading at June 2015 meeting

mpiforumbot commented 8 years ago

Originally by jhammond on 2015-05-11 16:42:12 -0500


Minor:

Major:

mpiforumbot commented 8 years ago

Originally by dholmes on 2015-05-18 06:35:12 -0500


Vector arguments will be "captured" in the same manner as for non-blocking collective operations. This is explained in MPI 3.0 on page 197 lines 12-15:

"Once initiated, all associated send buffers and buffers associated with input arguments (such as arrays of counts, displacements, or datatypes in the vector versions of the collectives) should not be modified, and all associated receive buffers should not be accessed, until the collective operation completes."

This implies that the MPI library will store the pointer to the vector arguments for later use and the user will not free (or even access) the associated memory until the operation is complete. In the case of a persistent operation "complete" means that the request is freed by the user via MPI_REQUEST_FREE (or implicitly freed by MPI during MPI_FINALIZE). No copying of arguments by MPI, with associated memory management in MPI, is needed.

We should be explicit about this intention in the proposed text.
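
The pointer-capture intent described in this comment can be illustrated with a toy model in plain C. It is not MPI code and says nothing about how any MPI library is implemented: `toy_vrequest`, `toy_vinit`, and `toy_vstart_total` are hypothetical names made up for this sketch. The only assumption is the one stated above, namely that initialization stores the pointer to the caller's vector argument rather than a copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy request that captures a counts array by pointer. */
typedef struct {
    const int *counts; /* borrowed from the caller, never copied */
    size_t n;
} toy_vrequest;

/* Models an _init call with a vector argument: only the pointer is stored. */
toy_vrequest toy_vinit(const int *counts, size_t n) {
    assert(counts != NULL);
    toy_vrequest r = { counts, n };
    return r;
}

/* Models a start: the counts are read through the stored pointer
   at start time, so the caller's array must still be valid. */
long toy_vstart_total(const toy_vrequest *r) {
    assert(r != NULL);
    long total = 0;
    for (size_t i = 0; i < r->n; i++) {
        total += r->counts[i];
    }
    return total;
}
```

Because nothing is copied, a change the caller makes to the array after `toy_vinit` is visible to every later `toy_vstart_total`. That is the reason the user has to keep the array alive and unmodified until the request is freed.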

mpiforumbot commented 8 years ago

Originally by tony on 2015-05-18 11:46:24 -0500


We have addressed all major concerns. The first minor concern also is addressed. The second minor concern has not been addressed because we copied the headings from non-blocking collective instead of from blocking collective, so we are internally consistent with that set of headings. Further alignment of all three sets of headers is a possible good, but we didn't think we could make that judgement call.

Replying to jhammond:

Minor:

  • Make sure to remember to strike "Nonblocking collective requests are not persistent." from section 5.12 (page 57, line 27 in your latest attachment).
  • "5.13.9 Persistent Reduce-Scatter with Equal Blocks Request" appears to result from copy-and-paste-fail in LaTeX.

Major:

  • How are vector arguments captured? Do we require the implementation to copy the vector itself, store internally, and associate with the request object, or is the interface intended to capture the pointer (I have not thought about this from a Fortran perspective yet) and expect that the user will not free that buffer until the request handle is freed? I think we should think hard about the requirement for the implementation to copy vector arguments internally, although it is likely that this is the only sane way to do it.
  • I will not vote for this unless it includes neighborhood collectives as well. If you're waiting for preliminary feedback on the big picture idea before implementing those changes in LaTeX, then I completely understand. Please just make sure to say that it is your intent to do neighborhood collectives in the final version.

mpiforumbot commented 8 years ago

Originally by tony on 2015-05-18 11:48:30 -0500


Attachment added: coll-18may15.pdf (538.0 KiB) This is the revised collective chapter (addresses Major Hammond issues, see note on ticket)

mpiforumbot commented 8 years ago

Originally by tony on 2015-05-18 11:49:01 -0500


Attachment added: topol-18may15.pdf (374.1 KiB) We also had to modify the Topology Chapter for Neighborhood Collectives

mpiforumbot commented 8 years ago

Originally by tony on 2015-05-18 11:55:49 -0500


We have added two files - coll-18may15.pdf and topol-18may15.pdf, reflecting the changes needed to complete the proposal and address the major concerns from Jeff Hammond (thank you).

Note that we have made additions

NOTE: There is another change that we must make, but did not feel comfortable doing once we prototyped the LaTeX, namely in Section 3.9 of the current standard (Point-to-Point Chapter). On p73, the current text says

"A persistent communication request is created using one of the five following calls. These calls involve no communication." As part of this ticket, we also want to change these sentences to read as follows:

"A persistent point-to-point communication request is created using one of the five following calls (persistent collective communication requests are discussed in Section X.YZ). Persistent point-to-point communication initialization calls involve no communication."

NOTE: This is a cross-reference to new logic that makes no change in the standard with regard to the existing functionality described there, but specifically differentiates between extant point-to-point requests produced by point-to-point initialization functions and the new collective ones. It could be considered a ticket 0 change, but we didn't like how the LaTeX worked and needed to avoid checking in the chapter here.

mpiforumbot commented 8 years ago

Originally by tony on 2015-06-02 08:47:09 -0500


Attachment added: PersistenceReading1-MPI-Forum-2jun15_v3.pptx (50.8 KiB) PPT of the presentation for Ticket read on June 2, 2015 at Chicago Meeting

mpiforumbot commented 8 years ago

Originally by puri on 2015-06-03 16:36:14 -0500


Attachment added: coll-06jun15.pdf (538.5 KiB) Updated document after Forum feedback

mpiforumbot commented 8 years ago

Originally by puri on 2015-06-03 16:50:25 -0500


The introductory text for this ticket is updated following the feedback from the forum at the June 2015 meeting in Chicago. The updated document is also attached to this ticket with the new changes.