Open mpiforumbot opened 8 years ago
Originally by jdinan on 2013-07-12 15:43:24 -0500
Attachment added: Endpoints Proposal 3-13-13.pptx
(229.1 KiB)
Endpoints presentation from March, 2013 meeting
Originally by jdinan on 2013-07-15 14:16:13 -0500
Updates from 7/15 WG meeting. Cleaned up typos and added advice to users.
Originally by jdinan on 2013-07-15 15:22:28 -0500
Info hint moved to tt #381.
Added new error class for endpoints-related errors.
Originally by jdinan on 2013-07-29 16:01:49 -0500
Updates from 7/29 WG meeting.
Originally by jdinan on 2013-07-29 17:33:05 -0500
We discussed several potential mechanisms that would allow implementations to limit the number of endpoints:
Originally by jdinan on 2013-08-19 15:07:00 -0500
Integrated feedback from Rajeev.
Originally by jdinan on 2013-09-14 12:20:50 -0500
Marked attribute as "to be removed." The WG discussed this interface and identified a race between checking the attribute value and requesting endpoints. The preferred mechanism is to rely on MPI errors when endpoints communicator creation fails.
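The race described above can be illustrated with a small single-process model. This is only a sketch: `mock_impl`, `mock_create_endpoints`, `try_create_endpoints` and the `MOCK_*` codes are hypothetical stand-ins and are not part of MPI or of this proposal. The model shows why a value read from an attribute can be stale by the time endpoints are requested, and why checking the result of the creation call itself is more robust.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from MPI or the proposal. */
#define MOCK_SUCCESS 0
#define MOCK_ERR_ENDPOINTS 1 /* models an endpoints-related error class */

struct mock_impl {
    int available; /* endpoints the mock implementation can still provide */
};

/* Models endpoint creation: either grants the whole request or fails
 * without consuming anything. */
int mock_create_endpoints(struct mock_impl *impl, int requested)
{
    assert(impl != NULL);
    if (requested < 0 || requested > impl->available)
        return MOCK_ERR_ENDPOINTS;
    impl->available -= requested;
    return MOCK_SUCCESS;
}

/* Error-based pattern: no prior check of an attribute value. The caller
 * attempts the creation and inspects the result.
 * Returns the number of endpoints obtained (0 on failure). */
int try_create_endpoints(struct mock_impl *impl, int wanted)
{
    if (mock_create_endpoints(impl, wanted) != MOCK_SUCCESS)
        return 0; /* caller falls back, e.g. to the parent communicator */
    return wanted;
}
```

In this model, a caller that reads `available` as 4 and then requests 3 endpoints still fails if a competing request for 2 is granted in between. Only the return code of the creation call reflects the state at the time of the request.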
Originally by jdinan on 2013-10-03 08:29:34 -0500
Attachment added: Endpoints - EuroMPI 2013.pptx
(3709.5 KiB)
Slide from EuroMPI '13 presentation
Originally by jdinan on 2013-10-03 08:29:49 -0500
Attachment added: endpoints.pdf
(176.7 KiB)
EuroMPI '13 paper
Originally by jdinan on 2013-12-12 09:34:44 -0600
Attachment added: EP Plenary -- 12-11-2013.pptx
(744.9 KiB)
Endpoints plenary presentation - 12/11/2013
Originally by jdinan on 2014-02-10 12:28:05 -0600
Attachment added: mpi-report.pdf
(2699.9 KiB)
Formal proposal for ticket #380 (SVN located at trunk/working-groups/mpi31/ticket-380)
Originally by balaji on 2014-03-04 18:41:18 -0600
The Forum requested that we redo this ticket with the following changes:
Originally by balaji on 2014-03-04 22:46:37 -0600
Uploaded new pdf with fixes to Fortran voodoo, as suggested by JeffS.
Originally by jsquyres on 2014-03-05 08:47:57 -0600
New PDF looks good.
Originally by jdinan on 2014-03-05 11:27:41 -0600
Attachment added: mpi-report.4.pdf
(2700.3 KiB)
Updated to include missing change markers for the comm_compare advice to users.
Originally by dholmes on 2014-05-19 13:42:45 -0500
We need a way to resolve ambiguities introduced by having multiple end-points for situations where the MPI library must choose a single end-point from several that could "match". Two examples follow.
const int my_num_ep = 2; // NB: could be any non-negative integer
MPI_COMM parent, children[my_num_ep];
MPI_GROUP parentGroup, childGroup;
int ranks1[1], ranks2[1];
parent = <your-favourite-communicator>; // NB: could be an end-points communicator handle!
MPI_COMM_CREATE_ENDPOINTS(parent, my_num_ep, MPI_INFO_NULL, &children);
MPI_COMM_GROUP(parent, &parentGroup);
MPI_COMM_GROUP(children[0], &childGroup);
ranks1[0] = 0;
MPI_GROUP_TRANSLATE_RANKS(parentGroup, 1, ranks1, childGroup, &ranks2);
// ranks2[0] could take any value between 0 and (my_num_ep-1)
// proposal 1: ranks2[0] should be set to MPI_PROC_NULL because there is no obvious correspondence
// proposal 2: ranks2[0] should be set to MPI_UNDEFINED because there are multiple correct answers
// proposal 3: ranks2[0] should be set to MPI_AMBIGUOUS because there are multiple correct answers
// proposal 4: ranks2[0] should be set to the unique rank that retained the identity of the parent
const int my_num_ep = 2; // NB: could be any non-negative integer
MPI_COMM parent, children[my_num_ep];
MPI_GROUP parentGroup, childGroup, unionGroup;
int ranks1[1], ranks2[1];
parent = <your-favourite-communicator>; // NB: could be an end-points communicator handle!
MPI_COMM_CREATE_ENDPOINTS(parent, my_num_ep, MPI_INFO_NULL, &children);
MPI_COMM_GROUP(parent, &parentGroup);
MPI_COMM_GROUP(children[0], &childGroup);
MPI_GROUP_UNION(parentGroup, childGroup, &unionGroup);
// unionGroup could contain 1 rank, my_num_ep ranks or (my_num_ep+1) ranks
// is an end-point "the same" as its parent?
// the identity of a group member is not defined clearly enough to answer this sort of question
// proposal 1: end-points all retain the identity of their parent despite introducing ambiguity
// proposal 2: end-points all have unique identities, which are distinct even from their parent's
// proposal 3: one end-point retains the identity of its parent, the rest get unique identities
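The three possible union sizes can be reproduced with a small model in plain C. This is a sketch under stated assumptions, not an MPI implementation: group members are modelled as integer identities, the parent group has a single member with identity 0, and the union follows the MPI_GROUP_UNION rule (all members of the first group, then members of the second group that are not in the first). The names `identity_rule`, `child_identities` and `union_size_with_parent` are hypothetical.

```c
#include <assert.h>

#define MAX_EP 16

/* The three identity proposals from the second example. */
enum identity_rule {
    ALL_RETAIN_PARENT = 1,    /* proposal 1 */
    ALL_UNIQUE = 2,           /* proposal 2 */
    FIRST_RETAINS_PARENT = 3  /* proposal 3 */
};

/* Fills ids[0..num_ep-1] with the modelled identity of each end-point.
 * The single parent member is assumed to have identity 0. */
void child_identities(enum identity_rule rule, int num_ep, int *ids)
{
    for (int i = 0; i < num_ep; i++) {
        switch (rule) {
        case ALL_RETAIN_PARENT:    ids[i] = 0;     break; /* all equal the parent */
        case ALL_UNIQUE:           ids[i] = i + 1; break; /* none equals the parent */
        case FIRST_RETAINS_PARENT: ids[i] = i;     break; /* only index 0 equals the parent */
        }
    }
}

/* Size of union(parentGroup, childGroup) in this model: the parent member,
 * plus every child member whose identity differs from the parent's. */
int union_size_with_parent(enum identity_rule rule, int num_ep)
{
    int ids[MAX_EP];
    assert(num_ep >= 0 && num_ep <= MAX_EP);
    child_identities(rule, num_ep, ids);

    int size = 1; /* the single parent member, identity 0 */
    for (int i = 0; i < num_ep; i++)
        if (ids[i] != 0)
            size++;
    return size;
}
```

For `my_num_ep = 2`, the model gives 1 member under proposal 1, 3 under proposal 2 and 2 under proposal 3, matching the "1 rank, my_num_ep ranks or (my_num_ep+1) ranks" cases listed in the comment.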
The simplest resolution to this seems to be to designate one of the new end-point communicator handles returned by each call to MPI_COMM_CREATE_ENDPOINTS as special, in that it retains the identity of the parent (possibly end-point) communicator handle.
Here is an initial suggestion for text to add to the MPI Standard as part of the end-points proposal. Between lines 8 and 9 on page 245:
The group associated with new_comm is a superset of the group associated with parent_comm. The communicator handle with an index of 0 in the new_comm array of handles represents the same group member as the parent_comm communicator handle. All communicator handles with an index in new_comm greater than 0 represent new group members.
Rationale: this defines unambiguous responses for operations that compare the constituents of groups, such as MPI_GROUP_TRANSLATE_RANKS, MPI_GROUP_UNION and MPI_INTERCOMM_MERGE.
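To illustrate how the suggested text makes rank translation unambiguous, here is a minimal model in plain C. It is a sketch only: `struct member`, `build_endpoints_group`, `translate_parent_rank` and `UNDEFINED_RANK` are hypothetical names, and the model assumes that end-points are ordered in the new group by parent rank and then by handle index, which the suggested text itself does not state.

```c
#include <assert.h>

#define UNDEFINED_RANK (-1) /* stand-in for MPI_UNDEFINED in this model */

/* A group member is identified by its parent process and its index in the
 * new_comm array. Index 0 models "the same group member as the parent". */
struct member {
    int process;
    int endpoint;
};

static int same_member(struct member a, struct member b)
{
    return a.process == b.process && a.endpoint == b.endpoint;
}

/* Builds the modelled end-points group; num_ep[p] is the number of
 * end-points requested by parent rank p. Returns the group size. */
int build_endpoints_group(const int *num_ep, int num_parents, struct member *out)
{
    int n = 0;
    for (int p = 0; p < num_parents; p++) {
        assert(num_ep[p] >= 0);
        for (int i = 0; i < num_ep[p]; i++) {
            out[n].process = p;
            out[n].endpoint = i;
            n++;
        }
    }
    return n;
}

/* Models translating one parent rank into the end-points group: the answer
 * is the rank of the handle with index 0 created by that parent. */
int translate_parent_rank(int parent_rank, const struct member *child, int child_size)
{
    struct member parent = { parent_rank, 0 };
    for (int r = 0; r < child_size; r++)
        if (same_member(parent, child[r]))
            return r;
    return UNDEFINED_RANK;
}
```

With three parents requesting 2, 3 and 4 end-points, parent ranks 0, 1 and 2 translate to ranks 0, 2 and 5 in this model, so each parent rank has exactly one answer.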
Originally by jdinan on 2014-05-19 20:55:40 -0500
Attachment added: mpi-report.5.pdf
(2699.8 KiB)
Ticket #380 proposal: Updated communicator comparison text
Originally by jdinan on 2014-08-11 11:12:28 -0500
Attachment added: mpi-report.6.pdf
(2699.6 KiB)
Ticket #380 proposal
Originally by jdinan on 2014-08-11 11:45:46 -0500
Attachment added: EP Plenary -- 05-2014.pptx
(497.2 KiB)
Endpoints plenary slides from June, 2014 meeting.
Originally by jdinan on 2014-08-11 11:46:32 -0500
Attachment added: EP Plenary -- 05-2014.pdf
(541.0 KiB)
Endpoints plenary slides from June, 2014 meeting. (PDF)
Originally by jdinan on 2014-08-11 12:20:29 -0500
Attachment added: mpi-report.7.pdf
(2699.6 KiB)
Updated proposal. Sentences were reordered to merge inter/intracommunicator text into the same paragraph and improve readability.
Originally by jdinan on 2014-09-02 17:05:13 -0500
Attachment added: mpi-report.8.pdf
(2699.6 KiB)
Updated proposal for vote at September, 2014 meeting
Originally by jdinan on 2014-11-17 11:30:17 -0600
Attachment added: mpi-report.9.pdf
(2699.6 KiB)
Updated with feedback from September '14 meeting: s/return error/raise an exception/
Originally by RolfRabenseifner on 2014-12-10 08:43:56 -0600
I would not put MPI_COMM_CREATE_ENDPOINTS into the middle of Section 6.4. It would be better to put it at the end of Section 6.4, but best would be to have a new Section 8.9 "Additional Endpoints", i.e., after the MPI "Startup" sections 8.7 and 8.8.
With such a new section, it is usual to write an intro-text.
Here is my proposal:
With the startup methods such as mpiexec (see Section 8.8) or MPI_COMM_SPAWN, both together with MPI_INIT, each execution stream (abbreviated with OS process) is also one MPI process. A group or communicator is represented with a handle within each OS process, which represents information about a group of processes and a communication context, and additionally within each OS process, the information which rank in the group of processes is the associated own rank.
MPI_COMM_CREATE_ENDPOINTS can create within each OS process additional MPI processes. MPI_COMM_CREATE_ENDPOINTS does not start any additional application OS processes or threads. The calling MPI process together with the newly started MPI processes are named endpoints, and abbreviated as ranks. The created set of communicator handles represent the same communicator, which consists of the whole set of endpoints defined by all processes within the group of a parent communicator, but each communicator handle represents another associated own rank.
These communicator handles can be used, for example, by several operating system threads to identify each thread with an own rank.
Additionally, I would add an example with MPI+OpenMP:
Example 8.16 Using MPI_COMM_CREATE_ENDPOINTS together with OpenMP
MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &provided);
if (provided < MPI_THREAD_MULTIPLE) MPI_Abort(....);
MPI_Comm_rank(MPI_COMM_WORLD, &my_parent_rank);
my_num_ep = ... /* for the output below, "my_parent_rank+2" is used */
MPI_Comm_create_endpoints(MPI_COMM_WORLD, my_num_ep, MPI_INFO_NULL, new_comm_handles);
#pragma omp parallel num_threads(my_num_ep)
{
  /* declared inside the parallel region so that each thread has its own copy */
  int thread_rank = omp_get_thread_num();
  int new_comm_index = thread_rank;
  int my_rank;
  MPI_Comm_rank(new_comm_handles[new_comm_index], &my_rank);
  printf("my_parent_rank=%d my_num_ep=%d new_comm_index=%d my_rank=%d (thread_rank=%d)\n",
         my_parent_rank, my_num_ep, new_comm_index, my_rank, thread_rank);
}
If started on 3 processes with my_num_ep values 2, 3, 4 and sorted by my_rank, the following output would be expected:
my_parent_rank=0 my_num_ep=2 new_comm_index=0 my_rank=0 (thread_rank=0)
my_parent_rank=0 my_num_ep=2 new_comm_index=1 my_rank=1 (thread_rank=1)
my_parent_rank=1 my_num_ep=3 new_comm_index=0 my_rank=2 (thread_rank=0)
my_parent_rank=1 my_num_ep=3 new_comm_index=1 my_rank=3 (thread_rank=1)
my_parent_rank=1 my_num_ep=3 new_comm_index=2 my_rank=4 (thread_rank=2)
my_parent_rank=2 my_num_ep=4 new_comm_index=0 my_rank=5 (thread_rank=0)
my_parent_rank=2 my_num_ep=4 new_comm_index=1 my_rank=6 (thread_rank=1)
my_parent_rank=2 my_num_ep=4 new_comm_index=2 my_rank=7 (thread_rank=2)
my_parent_rank=2 my_num_ep=4 new_comm_index=3 my_rank=8 (thread_rank=3)
The relation of my_parent_rank, my_num_ep, new_comm_index and my_rank is given through the definition of MPI_COMM_CREATE_ENDPOINTS. The relation of new_comm_index and thread_rank is defined through the application code in this example.
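The relation between my_parent_rank, my_num_ep, new_comm_index and my_rank shown in the expected output can be written as an exclusive prefix sum. The following plain C sketch assumes the ordering visible in that output (end-points ordered by parent rank, then by index in the handle array). `endpoint_rank` and `total_endpoints` are hypothetical helper names and are not part of MPI or of the proposal.

```c
#include <assert.h>

/* Rank of an end-point in the end-points communicator, assuming ranks are
 * assigned in parent-rank order and then by index in the handle array.
 * num_ep[p] is the my_num_ep value passed by parent rank p. */
int endpoint_rank(const int *num_ep, int parent_rank, int comm_index)
{
    assert(parent_rank >= 0);
    assert(comm_index >= 0 && comm_index < num_ep[parent_rank]);

    int base = 0;
    /* Exclusive prefix sum: end-points requested by all lower parent ranks. */
    for (int p = 0; p < parent_rank; p++)
        base += num_ep[p];
    return base + comm_index;
}

/* Size of the end-points communicator: the sum of all my_num_ep values. */
int total_endpoints(const int *num_ep, int num_parents)
{
    int total = 0;
    for (int p = 0; p < num_parents; p++)
        total += num_ep[p];
    return total;
}
```

For the my_num_ep values 2, 3, 4 used above, `endpoint_rank` gives 2 for parent rank 1 with index 0 and 8 for parent rank 2 with index 3, and `total_endpoints` gives 9, in agreement with the listed output.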
Originally by jdinan on 2014-12-15 11:08:31 -0600
Attachment added: EP Feedback - Dec 2014.pdf
(2997.8 KiB)
Formal proposal marked with feedback gathered during the Dec. 2014 meeting
Originally by jhammond on 2015-01-15 17:55:15 -0600
Replying to RolfRabenseifner:
With the startup methods such as mpiexec (see Section 8.8) or MPI_COMM_SPAWN, both together with MPI_INIT, each execution stream (abbreviated with OS process) is also one MPI process. A group or communicator is represented with a handle within each OS process, which represents information about a group of processes and a communication context, and additionally within each OS process, the information which rank in the group of processes is the associated own rank.
This is overly specific about a particular manner of implementation and I don't think it is appropriate to include in the standard.
MPI_COMM_CREATE_ENDPOINTS can create within each OS process additional MPI processes.
No. It creates MPI endpoints within an MPI process.
MPI_COMM_CREATE_ENDPOINTS does not start any additional application OS processes or threads.
This should be obvious and need not be said. In the pedantic limit, perhaps an "advice to users" can allude to it.
The calling MPI process together with the newly started MPI processes are named endpoints, and abbreviated as ranks.
Is the notion that processes are abbreviated as ranks present anywhere else in the standard? It is not necessary or appropriate to introduce it here.
The created set of communicator handles represent the same communicator, which consists of the whole set of endpoints defined by all processes within the group of a parent communicator, but each communicator handle represents another associated own rank.
I have no opinion on this text.
These communicator handles can be used, for example, by several operating system threads to identify each thread with an own rank.
This sort of implementation prescription is inappropriate except, perhaps, in any "advice to..."
Originally by jdinan on 2015-01-26 09:49:13 -0600
I don't share Jeff's negative opinion of these changes. I think there are aspects of MPI_COMM_CREATE_ENDPOINTS that are obvious to us today, but may not be clear to someone reading the specification ten years from now. There is certainly text like what Rolf suggested, that was added in MPI-1 and has been valuable to folks like me who became involved much later.
My only suggestion is that we could pursue this as a separate change/vote to the Forum, so that we can move ahead with the main body of the endpoints proposal.
Replying to jhammond:
Replying to RolfRabenseifner:
With the startup methods such as mpiexec (see Section 8.8) or MPI_COMM_SPAWN, both together with MPI_INIT, each execution stream (abbreviated with OS process) is also one MPI process. A group or communicator is represented with a handle within each OS process, which represents information about a group of processes and a communication context, and additionally within each OS process, the information which rank in the group of processes is the associated own rank.
This is overly specific about a particular manner of implementation and I don't think it is appropriate to include in the standard.
Originally by jdinan on 2015-02-27 17:19:50 -0600
Attachment added: mpi-report.10.pdf
(2703.0 KiB)
Draft for feedback at March, 2015 meeting
Originally by dholmes on 2015-06-29 08:20:20 -0500
Attachment added: Outstanding issues with endpoints - May 2015.pdf
(248.4 KiB)
This ticket was migrated to: mpi-forum/mpi-issues#56
Originally by jdinan on 2013-07-12 15:20:28 -0500
Overview
This proposal introduces a new communicator creation function that can be used to create additional ranks, or endpoints, at an existing MPI process. These new endpoints behave the same as processes and can be associated with threads, allowing threads to fully participate in MPI operations. In contrast to this approach, ticket #288 proposed a static interface, where endpoints were generated when the MPI execution was launched.
Proposed New Function
See attached PDF for updated proposal.