mpi-forum / mpi-forum-historic

Migration of old MPI Forum Trac Tickets to GitHub. New issues belong on mpi-forum/mpi-issues.
http://www.mpi-forum.org

RMA Notification #439

Open mpiforumbot opened 8 years ago

mpiforumbot commented 8 years ago

Originally by jdinan on 2014-08-07 09:04:45 -0500


Problem Statement

In passive target mode, notifying the target that data has been transmitted is currently inefficient. It requires sending additional messages after operations that are to be notified have been remotely completed.

Proposed Solution 1: Sync-and-Notify, Window Counter

Addition of new "synchronize-and-notify" routines:

int MPI_Win_flush_notify(int rank, MPI_Win win);
int MPI_Win_unlock_notify(int rank, MPI_Win win);
int MPI_Win_flush_all_notify(MPI_Win win);

int MPI_Win_get_notify(MPI_Win win, long *count);
int MPI_Win_set_notify(MPI_Win win, long count);
int MPI_Win_wait_notify(MPI_Win win, long geq_value);

A notification counter is associated with the window, and is incremented at the target after the given passive target epoch has completed at the target (i.e. data is visible to the target process). Get, set, and wait functions are provided to enable a process to query the number of notifications it has received.
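The counter semantics described above can be sketched without MPI. The following is a minimal single-process model, not an implementation of the proposal: every `s1_` name is hypothetical, puts are queued to stand in for in-flight operations, and a non-blocking `s1_test_notify` replaces the blocking `MPI_Win_wait_notify` because there is only one thread. It only illustrates that the counter moves once per completed epoch, and only after the data is visible at the target.

```c
#include <assert.h>
#include <string.h>

#define S1_WIN_SIZE    16  /* bytes of target memory in the model window */
#define S1_MAX_PENDING 8   /* puts that may be in flight within one epoch */

/* A put that has been issued but is not yet visible at the target. */
typedef struct {
    int  offset;
    int  len;
    char data[S1_WIN_SIZE];
} s1_pending_t;

/* Hypothetical model of a target window; this is not an MPI type. */
typedef struct {
    char         mem[S1_WIN_SIZE];
    long         notify_count;
    s1_pending_t pending[S1_MAX_PENDING];
    int          npending;
} s1_win_t;

void s1_win_init(s1_win_t *w) { memset(w, 0, sizeof *w); }

/* Queue a put. Returns 0 on success, -1 if it does not fit. */
int s1_put(s1_win_t *w, int offset, const char *src, int len)
{
    if (w->npending == S1_MAX_PENDING || offset < 0 || len < 0 ||
        offset + len > S1_WIN_SIZE)
        return -1;
    s1_pending_t *p = &w->pending[w->npending++];
    p->offset = offset;
    p->len = len;
    memcpy(p->data, src, (size_t)len);
    return 0;
}

/* Models the proposed flush-and-notify: make every pending put visible,
 * then increment the window counter exactly once. */
void s1_flush_notify(s1_win_t *w)
{
    for (int i = 0; i < w->npending; i++)
        memcpy(w->mem + w->pending[i].offset, w->pending[i].data,
               (size_t)w->pending[i].len);
    w->npending = 0;
    w->notify_count += 1;
}

long s1_get_notify(const s1_win_t *w)      { return w->notify_count; }
void s1_set_notify(s1_win_t *w, long count) { w->notify_count = count; }

/* Non-blocking stand-in for the proposed wait: true once count >= geq_value. */
int s1_test_notify(const s1_win_t *w, long geq_value)
{
    return w->notify_count >= geq_value;
}

/* Three puts followed by one flush-and-notify yield a single notification. */
long s1_demo(void)
{
    s1_win_t w;
    s1_win_init(&w);
    s1_put(&w, 0, "ab", 2);
    s1_put(&w, 2, "cd", 2);
    s1_put(&w, 4, "ef", 2);
    assert(!s1_test_notify(&w, 1));          /* nothing notified yet */
    assert(w.mem[0] == 0);                   /* and nothing visible yet */
    s1_flush_notify(&w);
    assert(s1_test_notify(&w, 1));
    assert(memcmp(w.mem, "abcdef", 6) == 0); /* data visible before count */
    return s1_get_notify(&w);
}
```

Calling `s1_demo()` from a `main` exercises the model. In this model the target learns that the epoch completed, but not how many operations it contained.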

Criticism: Because the notification is issued separately from the communication operation (in contrast to a fused put-and-notify), a notified transfer can still require two separate operations, which will not improve performance.

Proposed Solution 2: Op-and-Notify, Window Counter

Addition of new "communicate-and-notify" routines:

int MPI_Put_notify(..., MPI_Win win); /* Same arguments as MPI_Put */
int MPI_Get_notify(..., MPI_Win win);
int MPI_Accumulate_notify(..., MPI_Win win);

int MPI_Win_get_notify(MPI_Win win, long *count);
int MPI_Win_set_notify(MPI_Win win, long count);
int MPI_Win_wait_notify(MPI_Win win, long geq_value);

A notification counter is associated with the window, and is incremented at the target after the given RMA operation has completed at the target (i.e. data is visible to the target process). Get, set, and wait functions are provided to enable a process to query the number of notifications it has received.
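As a rough illustration of the per-operation variant, here is a minimal single-process model in plain C. It is a sketch under assumptions, not the proposed MPI interface: the `s2_` names are hypothetical, the put takes effect immediately instead of travelling over a network, and `s2_test_notify` is a non-blocking stand-in for `MPI_Win_wait_notify`.

```c
#include <assert.h>
#include <string.h>

#define S2_WIN_SIZE 16  /* bytes of target memory in the model window */

/* Hypothetical model of a target window with one notification counter. */
typedef struct {
    char mem[S2_WIN_SIZE];
    long notify_count;
} s2_win_t;

/* Models a fused put-and-notify: the data becomes visible first, then the
 * counter is incremented. Returns 0 on success, -1 on a bad range. */
int s2_put_notify(s2_win_t *w, int offset, const char *src, int len)
{
    if (offset < 0 || len < 0 || offset + len > S2_WIN_SIZE)
        return -1;
    memcpy(w->mem + offset, src, (size_t)len);
    w->notify_count += 1;
    return 0;
}

void s2_set_notify(s2_win_t *w, long count) { w->notify_count = count; }

/* Non-blocking stand-in for the proposed wait: true once count >= geq_value. */
int s2_test_notify(const s2_win_t *w, long geq_value)
{
    return w->notify_count >= geq_value;
}

/* Two notified puts produce two notifications; the target then consumes
 * them by resetting the counter. Returns the count seen before the reset. */
long s2_demo(void)
{
    s2_win_t w;
    memset(&w, 0, sizeof w);
    s2_put_notify(&w, 0, "ab", 2);
    s2_put_notify(&w, 8, "yz", 2);
    assert(s2_test_notify(&w, 2));
    assert(!s2_test_notify(&w, 3));
    assert(memcmp(w.mem + 8, "yz", 2) == 0);
    long seen = w.notify_count;  /* says how many ops landed, not which */
    s2_set_notify(&w, 0);
    return seen;
}
```

Each operation carries its own notification here, so no separate synchronization call is modeled.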

Criticism: Only one counter per window, so the target cannot tell which operation or origin produced a given notification.

Proposed Solution 3: Op-and-Notify, Matched Counter

Torsten's proposal. Details to come...

References

Reducing Synchronization Overhead Through Bundled Communication

mpiforumbot commented 8 years ago

Originally by jdinan on 2014-11-13 09:09:54 -0600


An alternative RMA notification interface was presented by Torsten Hoefler at the September 2014 meeting. Slides are posted here: http://meetings.mpi-forum.org/secretary/2014/09/slides/hoefler-blitz.pdf

mpiforumbot commented 8 years ago

Originally by gropp on 2014-12-10 13:03:33 -0600


The WG finds these two ideas interesting and would like to see more on this topic. There is, however, still skepticism about the value of this idea. Specific issues include:

  1. Compelling use cases
  2. Existing practice, for example in OpenSHMEM, including alternatives
  3. Scalability analysis, particularly of flush_all_notify
  4. Implementation and performance issues on unordered networks
  5. Existing and expected support in high performance networks

mpiforumbot commented 8 years ago

Originally by jhammond on 2015-02-10 16:31:16 -0600


Torsten has addressed some of these issues in http://spcl.inf.ethz.ch/Publications/.pdf/notified-access-extending-rma.pdf. His proposed API is not the same, but some of the issues are API-agnostic. In particular, the paper discusses compelling use cases (1), alternative implementations in MPI (2) and the implementation on Cray XC30 (5), which has a network that favors dynamic routing (4).

mpiforumbot commented 8 years ago

Originally by rsthakur on 2015-06-03 15:11:58 -0500


From the June 2015 Forum meeting: Need a clear use case and a discussion of why send-recv is not enough for this.

mpiforumbot commented 8 years ago

Originally by gropp on 2015-09-25 08:38:35 -0500


At the Sept 2015 Forum meeting, we discussed the implementation issues with Torsten's proposal. Many of the issues relate to the processing needed for the event queue. Multiple counters might work as an alternative that may be sufficient for examples such as the Cholesky tasking implementation, and be easier to implement with network HW offload. This leaves the question of how to determine the number of counters, and when they are determined. Some applications may need a number of counters (or other notification objects) that is not known when a window is created.
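To make the multiple-counter idea and its open question concrete, here is a minimal single-process model in plain C. Nothing in it comes from a proposed interface: the `mc_` names, the extra counter-index argument, and the choice to fix the number of counters at window creation are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a window with several notification counters. */
typedef struct {
    char *mem;
    int   size;
    long *counters;
    int   ncounters;  /* fixed at creation in this model */
} mc_win_t;

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
int mc_win_create(mc_win_t *w, int size, int ncounters)
{
    if (size <= 0 || ncounters <= 0)
        return -1;
    w->mem = calloc((size_t)size, 1);
    w->counters = calloc((size_t)ncounters, sizeof *w->counters);
    if (w->mem == NULL || w->counters == NULL) {
        free(w->mem);
        free(w->counters);
        return -1;
    }
    w->size = size;
    w->ncounters = ncounters;
    return 0;
}

void mc_win_free(mc_win_t *w)
{
    free(w->mem);
    free(w->counters);
    w->mem = NULL;
    w->counters = NULL;
}

/* Put that notifies one selected counter after the data is visible.
 * Returns -1 if the counter index or the memory range is invalid. */
int mc_put_notify(mc_win_t *w, int offset, const char *src, int len,
                  int counter)
{
    if (counter < 0 || counter >= w->ncounters)
        return -1;
    if (offset < 0 || len < 0 || offset + len > w->size)
        return -1;
    memcpy(w->mem + offset, src, (size_t)len);
    w->counters[counter] += 1;
    return 0;
}

/* Non-blocking check: true once the selected counter is >= geq_value. */
int mc_test_notify(const mc_win_t *w, int counter, long geq_value)
{
    if (counter < 0 || counter >= w->ncounters)
        return 0;
    return w->counters[counter] >= geq_value;
}

/* Two tasks wait on separate counters; a third counter that was not
 * provisioned at creation is rejected. Returns 1 if all checks hold. */
int mc_demo(void)
{
    mc_win_t w;
    if (mc_win_create(&w, 16, 2) != 0)
        return -1;
    mc_put_notify(&w, 0, "ab", 2, 0);   /* input for task 0 */
    mc_put_notify(&w, 8, "cd", 2, 1);   /* first input for task 1 */
    mc_put_notify(&w, 10, "ef", 2, 1);  /* second input for task 1 */
    int ok = mc_test_notify(&w, 0, 1) && mc_test_notify(&w, 1, 2) &&
             !mc_test_notify(&w, 0, 2);
    int rejected = (mc_put_notify(&w, 4, "zz", 2, 2) == -1);
    assert(memcmp(w.mem + 8, "cdef", 4) == 0);
    mc_win_free(&w);
    return ok && rejected;
}
```

The rejected third counter is the situation raised above: an application that discovers it needs more counters after the window exists has no way to get them in a model where the count is fixed at creation.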

mpiforumbot commented 8 years ago

Originally by jdinan on 2015-09-26 03:17:49 -0500


Attachment added: 09-2015 -- RMA Notified Access Implementation Discussion.pdf (266.7 KiB)