Open mpiforumbot opened 8 years ago
Originally by jdinan on 2014-11-13 09:09:54 -0600
An alternative RMA notification interface was presented by Torsten Hoefler at the September, 2014 meeting. Slides are posted here: http://meetings.mpi-forum.org/secretary/2014/09/slides/hoefler-blitz.pdf
Originally by gropp on 2014-12-10 13:03:33 -0600
The WG finds these two ideas interesting and would like to see more on this topic. There is, however, still skepticism about the value of this idea. Specific issues include:
Originally by jhammond on 2015-02-10 16:31:16 -0600
Torsten has addressed some of these issues in http://spcl.inf.ethz.ch/Publications/.pdf/notified-access-extending-rma.pdf. His proposed API is not the same, but some of the issues are independent of the API. In particular, the paper discusses compelling use cases (1), alternative implementations in MPI (2), and the implementation on Cray XC30 (5), which has a network that favors dynamic routing (4).
Originally by rsthakur on 2015-06-03 15:11:58 -0500
From the June 2015 Forum meeting: Need a clear use case and a discussion of why send-recv is not enough for this.
Originally by gropp on 2015-09-25 08:38:35 -0500
From the Sept 2015 Forum meeting: we discussed the implementation issues with Torsten's proposal. Many of the issues relate to the processing needed for the event queue. Multiple counters might work as an alternative that may be sufficient for examples such as the Cholesky tasking implementation, and may be easier to implement with network HW offload. This leaves the question of how the number of counters is determined, and when. Some applications may need a number of counters (or other notification objects) that is not known when a window is created.
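To make the counter-sizing question concrete, here is a minimal single-process sketch in plain C. It is not MPI and not part of any proposal: `model_win`, `model_win_create`, `model_notify`, and the other `model_*` names are hypothetical, and the only assumption it encodes is that the number of counters is fixed when the window is created.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a window that owns a fixed set of counters. */
typedef struct {
    int *counters;
    size_t ncounters;   /* assumption: fixed at window creation */
} model_win;

/* Create the model window with `ncounters` zeroed counters. */
static int model_win_create(model_win *win, size_t ncounters)
{
    win->counters = calloc(ncounters, sizeof *win->counters);
    if (win->counters == NULL)
        return -1;
    win->ncounters = ncounters;
    return 0;
}

static void model_win_free(model_win *win)
{
    free(win->counters);
    win->counters = NULL;
    win->ncounters = 0;
}

/* Increment counter `id`; fails if no such counter was created. */
static int model_notify(model_win *win, size_t id)
{
    if (id >= win->ncounters)
        return -1;
    win->counters[id]++;
    return 0;
}

/* Two counters cover two kinds of event; a third kind has nowhere to go.
 * Returns 1 if the out-of-range notification was rejected. */
static int model_multi_counter_demo(void)
{
    model_win win;
    int rejected;

    if (model_win_create(&win, 2) != 0)
        return -1;

    model_notify(&win, 0);   /* e.g. one input of task 0 arrived */
    model_notify(&win, 1);   /* e.g. two inputs of task 1 arrived */
    model_notify(&win, 1);
    assert(win.counters[0] == 1 && win.counters[1] == 2);

    rejected = (model_notify(&win, 2) != 0);
    model_win_free(&win);
    return rejected;
}
```

In this model an application that discovers a third kind of event after window creation has no counter to target, which is the sizing problem raised above.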
Originally by jdinan on 2015-09-26 03:17:49 -0500
Attachment added: 09-2015 -- RMA Notified Access Implementation Discussion.pdf
(266.7 KiB)
Originally by jdinan on 2014-08-07 09:04:45 -0500
Problem Statement
In passive target mode, notifying the target that data has been transmitted is currently inefficient. It requires sending additional messages after operations that are to be notified have been remotely completed.
Proposed Solution 1: Sync-and-Notify, Window Counter
Addition of new "synchronize-and-notify" routines:
A notification counter is associated with the window, and is incremented at the target after the given passive target epoch has completed at the target (i.e. data is visible to the target process). Get, set, and wait functions are provided to enable a process to query the number of notifications it has received.
Criticism: Since the notification is separate from the communication operations (in contrast to, e.g., a combined put-and-notify), this can still require two separate operations, which will not improve performance.
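The sync-and-notify semantics described above can be sketched as a single-process model in plain C. This is only an illustration: it is not MPI, the proposed routines are not reproduced here, and every name (`model_win`, `model_epoch`, `model_put`, `model_flush_notify`) is hypothetical. Puts are queued at the origin, and the synchronization step makes them visible at the target before incrementing the counter once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WIN_SIZE 8      /* hypothetical window size, in ints */
#define MAX_PENDING 8   /* hypothetical cap on queued puts */

/* Target-side state: window memory plus one notification counter. */
typedef struct {
    int mem[WIN_SIZE];
    int notify_count;
} model_win;

/* Origin-side state: puts issued but not yet completed at the target. */
typedef struct {
    size_t offset[MAX_PENDING];
    int value[MAX_PENDING];
    size_t n;
} model_epoch;

/* Queue a put; nothing is visible at the target yet. */
static int model_put(model_epoch *ep, size_t offset, int value)
{
    if (ep->n >= MAX_PENDING || offset >= WIN_SIZE)
        return -1;
    ep->offset[ep->n] = offset;
    ep->value[ep->n] = value;
    ep->n++;
    return 0;
}

/* Complete all queued puts at the target, then bump the counter once. */
static void model_flush_notify(model_epoch *ep, model_win *win)
{
    for (size_t i = 0; i < ep->n; i++)
        win->mem[ep->offset[i]] = ep->value[i];
    ep->n = 0;
    win->notify_count++;   /* only after the data is visible */
}

/* Three puts followed by one sync-and-notify; returns the counter. */
static int model_sync_notify_demo(void)
{
    model_win win;
    model_epoch ep;
    memset(&win, 0, sizeof win);
    memset(&ep, 0, sizeof ep);

    model_put(&ep, 0, 10);
    model_put(&ep, 1, 20);
    model_put(&ep, 2, 30);
    assert(win.notify_count == 0 && win.mem[0] == 0);   /* nothing visible yet */

    model_flush_notify(&ep, &win);
    assert(win.mem[0] == 10 && win.mem[1] == 20 && win.mem[2] == 30);
    return win.notify_count;
}
```

In the model, one notification covers the whole batch of puts, but it is still a second step issued after the puts themselves, which is the point of the criticism.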
Proposed Solution 2: Op-and-Notify, Window Counter
Addition of new "communicate-and-notify" routines:
A notification counter is associated with the window, and is incremented at the target after the given RMA operation has completed at the target (i.e. data is visible to the target process). Get, set, and wait functions are provided to enable a process to query the number of notifications it has received.
Criticism: Only one counter per window.
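A comparable single-process sketch for op-and-notify, again in plain C rather than MPI. All names (`model_put_notify`, `model_notify_get`, `model_notify_set`, `model_notify_test`) are hypothetical stand-ins, and the blocking wait is modeled as a non-blocking check because there is only one process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WIN_SIZE 8   /* hypothetical window size, in ints */

/* Target-side state: window memory plus the single per-window counter. */
typedef struct {
    int mem[WIN_SIZE];
    int notify_count;
} model_win;

/* Put one int and notify: the data lands first, then the counter is bumped. */
static int model_put_notify(model_win *win, size_t offset, int value)
{
    if (offset >= WIN_SIZE)
        return -1;
    win->mem[offset] = value;
    win->notify_count++;
    return 0;
}

static int model_notify_get(const model_win *win)
{
    return win->notify_count;
}

static void model_notify_set(model_win *win, int value)
{
    win->notify_count = value;
}

/* Stand-in for a blocking wait: true once `expected` notifications arrived. */
static bool model_notify_test(const model_win *win, int expected)
{
    return win->notify_count >= expected;
}

/* Two notified puts; returns the counter value seen by the target. */
static int model_op_notify_demo(void)
{
    model_win win = {0};
    model_notify_set(&win, 0);

    model_put_notify(&win, 0, 10);   /* e.g. issued by one origin */
    assert(!model_notify_test(&win, 2));
    model_put_notify(&win, 4, 40);   /* e.g. issued by another origin */
    assert(model_notify_test(&win, 2));

    /* The data is visible, but the counter alone does not say which
     * puts arrived or who issued them. */
    assert(win.mem[0] == 10 && win.mem[4] == 40);
    return model_notify_get(&win);
}
```

Here each operation carries its own notification, so no second step is needed, but the target can only learn how many notified operations have arrived on the window as a whole.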
Proposed Solution 3: Op-and-Notify, Matched Counter
Torsten's proposal. Details to come...
References
Reducing Synchronization Overhead Through Bundled Communication