mpi-forum / mpi-issues

Tickets for the MPI Forum
http://www.mpi-forum.org/

deprecate or fix MPI_COMM_JOIN #13

Open jeffhammond opened 8 years ago

jeffhammond commented 8 years ago

This was https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/301. Since there is no obvious working group for this, I am putting it here.

Background

At the October 2011 meeting (10/26/2011), Bill said that interaction with an explicit externally specified environment like POSIX was totally inappropriate for the MPI standard. On this basis, and the lack of demonstrated necessity for such a routine, we should deprecate MPI_COMM_JOIN.

MPI_COMM_JOIN explicitly refers to the externally specified protocol known as Berkeley Sockets. It is not compatible with other implementations of sockets due to the choice of type for the socket file descriptor (integer); for example, MPI_COMM_JOIN is incompatible with Windows Sockets (source: Fab Tillier). I do not see how a *nix-specific routine is any less inappropriate for the MPI standard than a POSIX-oriented MPI_FILE_STAT one.

Here is the relevant excerpt from http://www.lam-mpi.org/MailArchives/lam/2001/09/3315.php as to why this routine is probably superfluous (see paragraph 2):

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2001-09-21 08:24:50

> I am trying to understand the command MPI_Comm_join.

This is a fairly specialized function that was added to MPI-2 for whacky 
configurations that the MPI Forum couldn't predict. It is intended to 
take a file descriptor as an argument that represents another MPI process 
(where that file descriptor can be a pipe or a socket or some other IPC 
mechanism), and create a communicator between the two processes. You will 
need to create this file descriptor yourself -- it is intended to *only* 
be used by the MPI_Comm_join call. 

What do you need to use MPI_Comm_join for? You may wish to explore 
MPI_Comm_connect and MPI_Comm_accept instead -- they may provide an easier 
way to connect to previously unrelated MPI programs since there's no need 
for anything outside of the scope of MPI (i.e., a file descriptor). 
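
To make "create this file descriptor yourself" concrete, here is a minimal sketch of obtaining a connected stream socket using only POSIX calls. The helper name `make_connected_stream_pair` is hypothetical, and `socketpair` is used purely for brevity: in a real use of MPI_Comm_join the two ends would live in two different MPI processes (for example after a TCP connect/accept), and each process would pass its own end to the call.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of MPI): create two connected
 * SOCK_STREAM endpoints. Returns 0 on success, -1 on failure,
 * exactly as socketpair() does. */
int make_connected_stream_pair(int fds[2])
{
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
}

/* Each MPI process holding one end of such a connection would then
 * call, with its own descriptor:
 *
 *     MPI_Comm intercomm;
 *     MPI_Comm_join(fd, &intercomm);
 *
 * The MPI call is left as a comment so that this sketch compiles
 * without an MPI installation. */
```

The descriptor has to be connected and otherwise idle when it is handed to MPI, which is why nothing is sent over it here.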

Proposal

We propose to redefine the current interface so that it does not break backwards compatibility on platforms that currently support this function, while enabling platforms that currently cannot support it to do so in the future. Backwards compatibility on currently supporting platforms requires MPI_Socket to be defined as an integer file descriptor. On systems such as Windows that currently do not support this function, MPI_Socket can be defined to be the appropriate object without introducing a regression.
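
A rough sketch of what this could look like in an implementation's header follows. The `_WIN32` branch and the use of Winsock's `SOCKET` type are assumptions about how a Windows implementation might choose to define MPI_Socket; the proposal itself only requires the integer definition on platforms that already support the function. The helper `legacy_fd_roundtrip` is hypothetical and exists only to show that existing callers passing a plain `int` keep compiling.

```c
/* Sketch only: one possible per-platform definition of MPI_Socket. */
#if defined(_WIN32)
#include <winsock2.h>
typedef SOCKET MPI_Socket;  /* assumption: Windows Sockets handle */
#else
typedef int MPI_Socket;     /* Berkeley sockets file descriptor */
#endif

/* Proposed prototype, shown as a comment because MPI_Comm comes
 * from mpi.h:
 *
 *     int MPI_Comm_join(MPI_Socket fd, MPI_Comm *intercomm);
 */

/* Hypothetical helper: pass an existing int descriptor through the
 * new type and back, as legacy code would do implicitly. */
static int legacy_fd_roundtrip(int fd)
{
    MPI_Socket s = fd;  /* what an existing caller passes today */
    return (int)s;
}
```

With the integer definition, source code written against the current `int fd` signature needs no changes.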

MPI_COMM_JOIN(fd, intercomm) 
   IN     fd                    socket file descriptor
   OUT    intercomm             new intercommunicator (handle)

int MPI_Comm_join(MPI_Socket fd, MPI_Comm *intercomm)

MPI_Comm_join(fd, intercomm, ierror) BIND(C)
    TYPE(MPI_Socket), INTENT(IN) ::  fd
    TYPE(MPI_Comm), INTENT(OUT) ::  intercomm
    INTEGER, OPTIONAL, INTENT(OUT) ::  ierror

MPI_COMM_JOIN(FD, INTERCOMM, IERROR)
    INTEGER FD, INTERCOMM, IERROR

Advice to implementers: In order to preserve backwards compatibility, 
MPI_Socket should be an integer file descriptor on platforms that currently 
support this function.  On platforms that do not support this function, there 
is no backwards compatibility issue, and MPI_Socket can be defined to be anything.

schulzm commented 6 years ago

June 2018 meeting in Austin: the Forum decided to take this up again, for a reading in Barcelona (BCN).

tonyskjellum commented 6 years ago

We've decided to review, update, and read this ticket at the Barcelona meeting.

jeffhammond commented 6 years ago

@tonyskjellum In favor of deprecation or fixing? I think fixing is the better path, since the only implementation that needs to do anything besides typedef int MPI_Socket is MS-MPI.

tonyskjellum commented 5 years ago

Hi, I'd like to get this for reading in December...

@schulzm : what is needed for a deprecation request and reading?

dholmes-epcc-ed-ac-uk commented 5 years ago

Thought (not fully formed): is MPI_Socket actually a port? In the sense of MPI_OPEN_PORT. That would harmonise connect/accept with join. It also suggests we should discover and clearly state what the differences are between these semantics, if any.

The implementation could still use a socket internally but just present a string name to the user.

The MPI_OPEN_PORT function takes an INFO object, so the user could hint that they would like the port to be backed by a socket, and even provide an IP address for the intended target. OTOH, the user could use the INFO to assert that they will use the port in MPI_COMM_JOIN_X(port, intercomm). MPI can respond to that by failing to provide a port if join is not supported, or by creating a socket (if that is needed by its chosen implementation of join), or by preparing any other communication mechanism (if it can support join in a different/better way than via a socket).

Benefits: remove direct reliance on Berkeley Sockets from the MPI Standard; harmonise the API for connect/accept and join; permit different/better implementations of join.

Problems: more work for implementors (many of whom are not convinced of the need for join in the first place); does not fix the existing API (creates a new API to avoid breaking all the non-existent things that use the current join API).

Difference to connect/accept: join implicitly uses MPI_COMM_SELF as the local group, whereas connect/accept use the communicator provided as an argument.

tonyskjellum commented 5 years ago

Ok we will read it in December

On Jun 13, 2018, at 8:12 PM, Martin Schulz notifications@github.com wrote:

Assigned #13 to @tonyskjellum.

tonyskjellum commented 5 years ago

Let’s explore this :-)

hppritcha commented 3 years ago

This topic was again discussed briefly at the 9/28/20 (v) MPI Forum meeting during the reading of mpi-forum/mpi-standard#269