Open hjelmn opened 6 years ago
I should add that I don't know whether this is the right approach in the long run; lock_all, flush_all, etc. might be sufficient. This proposal is up for discussion of the relative merits of making the neighborhood locking explicit.
@hjelmn
While I agree with the proposal in spirit, I wonder if this can be done without additional function calls. For example, flush_neighbors is really a performance optimization compared with flush_all. So, would an info argument that tells the MPI implementation "I will only communicate with my neighbors" not be sufficient? Then all the existing functions can be used directly, and an MPI implementation that respects the new info argument would simply perform better.
@pavanbalaji I have exactly what you suggest in a prototype. This proposal was suggested as an alternative to an info key by another person at the forum so I thought I would put it up for discussion.
I will open the proposal for the info key as well. I don't know whether the info key is better placed on the window or on the topology communicator creation. We are proposing to add MPI_Cart_create_with_info, as there was some interest in having a similar optimization available for two-sided communication. This came out of the discussion of why a topology communicator allows communication with any process in the original communicator. I would like to fix it so that MPI_Send, MPI_Put, etc. to a process that is not a neighbor are disallowed if the info key is set on the communicator itself.
Problem
The motivation for this proposal is to give implementations the opportunity to optimize for cases where the communicator used to create the window has an attached topology. This proposal adds passive-target synchronization methods that target just the local neighborhood. In the case of MPI_Graph (soon to be deprecated 🤞) and MPI_Cart, the lock will be taken on all neighbors. In the case of MPI_Dist_graph, the lock will be taken on any neighbor that was specified in the destinations array.
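To make "the lock will be taken on all neighbors" concrete for the Cartesian case, here is a minimal sketch in plain C (no MPI calls) of which ranks a neighborhood lock would cover. It assumes the row-major rank ordering that MPI uses for Cartesian communicators. The names cart_neighbors, coords_to_rank, rank_to_coords and MAX_DIMS are hypothetical helpers for illustration only; they are not part of MPI or of this proposal. In a real program the same information would come from MPI_Cart_shift.

```c
#include <assert.h>

#define MAX_DIMS 8 /* hypothetical upper bound, chosen for this sketch */

/* Row-major coordinates -> rank (last dimension varies fastest). */
static int coords_to_rank(int ndims, const int dims[], const int coords[])
{
    int rank = 0;
    for (int d = 0; d < ndims; d++)
        rank = rank * dims[d] + coords[d];
    return rank;
}

/* Inverse of coords_to_rank. */
static void rank_to_coords(int ndims, const int dims[], int rank, int coords[])
{
    for (int d = ndims - 1; d >= 0; d--) {
        coords[d] = rank % dims[d];
        rank /= dims[d];
    }
}

/*
 * Fill out[] with the ranks one step away from `rank` in each dimension,
 * in the order: dimension 0 (-1, +1), dimension 1 (-1, +1), ...
 * A step off the edge of a non-periodic dimension yields no neighbor.
 * out[] must have room for 2 * ndims entries. Returns the number written.
 * Duplicates are not removed (e.g. a periodic dimension of size 2), so a
 * lock implementation built on this would still need to deduplicate.
 */
int cart_neighbors(int ndims, const int dims[], const int periods[],
                   int rank, int out[])
{
    int coords[MAX_DIMS];
    int count = 0;

    assert(ndims > 0 && ndims <= MAX_DIMS);
    rank_to_coords(ndims, dims, rank, coords);

    for (int d = 0; d < ndims; d++) {
        for (int step = -1; step <= 1; step += 2) {
            int saved = coords[d];
            int c = saved + step;

            if (c < 0 || c >= dims[d]) {
                if (!periods[d])
                    continue; /* off the edge, no neighbor here */
                c = (c + dims[d]) % dims[d]; /* wrap around */
            }
            coords[d] = c;
            out[count++] = coords_to_rank(ndims, dims, coords);
            coords[d] = saved;
        }
    }
    return count;
}
```

On a non-periodic 3x3 grid, the center rank 4 has four neighbors (1, 7, 3, 5) while the corner rank 0 has only two (3, 1). That gap between the neighbor count and the full communicator size is what an implementation could exploit relative to lock_all.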
I see this proposal as a replacement for the Post-Start-Complete-Wait synchronization method. It is not dynamic like PSCW but there may be ways we can work around that.
I should also note that the end goal of some of the work I am doing in the coll working group is to add support for limiting the peers that can be communicated with on a topology communicator. If that succeeds, it will allow for additional optimization.
Proposal
Introduce the following functions:
Changes to the Text
Add the above functions and their descriptions.
Impact on Implementations
Implementations need to implement these functions.
Impact on Users
None. These are new functions that would be added to the standard.