mpi-forum / mpi-issues

Tickets for the MPI Forum
http://www.mpi-forum.org/

clarify what is (and is not) duplicated by MPI_COMM_DUP #690

Status: Closed (closed by jeffhammond 1 year ago)

jeffhammond commented 1 year ago

Problem

During the March 2023 meeting, it was suggested that we clarify exactly what is and is not duplicated by MPI_COMM_DUP. For example, the standard does not say that error handlers are not copied; it simply never states that they are.

Proposal

The chapter committee will make the text changes to clarify this situation.

Changes to the Text

Impact on Implementations

None.

Impact on Users

The standard is easier to understand.

References and Pull Requests

jeffhammond commented 1 year ago

@GuillaumeMercier @schulzm

GuillaumeMercier commented 1 year ago

Is this really a 4.1 item or a 5.0 one?

jeffhammond commented 1 year ago

It is a chapter committee change that can be made in 4.1 because it can be done without voting.

GuillaumeMercier commented 1 year ago

The current text explicitly states that: "MPI_COMM_DUP duplicates the existing communicator comm with associated key values and topology information. For each key value, the respective copy callback function determines the attribute value associated with this key in the new communicator; one particular action that a copy callback may take is to delete the attribute from the new communicator. MPI_COMM_DUP returns in newcomm a new communicator with the same group or groups, same topology, and any copied cached information, but a new context [...]"

So this is pretty clear.

As for error handlers, the Advice to users (AtoU) on page 462 (lines 3--6) states that: "A newly created communicator inherits the error handler that is associated with the “parent” communicator. In particular, the user can specify a “global” error handler for all communicators by associating this handler with the communicator MPI_COMM_WORLD immediately after initialization."

My reading is thus that error handlers are copied when MPI_COMM_DUP is invoked.

However, this is an AtoU and therefore non-binding text.

For the sake of clarity, I'm ok with explicitly adding a line in the description of MPI_COMM_DUP. But I think that this statement should be normative text rather than an AtoU.

GuillaumeMercier commented 1 year ago

@jeffhammond @wgropp @tonyskjellum @pavanbalaji: I'd like your input on this matter.

jeffhammond commented 1 year ago

I have no opinion on this. I merely transcribed the issue to GitHub from the meeting I attended.

wgropp commented 1 year ago

I agree that error handlers should be copied when the communicator is duplicated.

GuillaumeMercier commented 1 year ago

How should we (the Chapter Committee) proceed?

1. Add a sentence to the description of MPI_COMM_DUP, and/or
2. Promote the AtoU to normative text.

The second point doesn't seem to be a mere Chapter Committee Change to me, though.

wgropp commented 1 year ago

I'd add a sentence to the description of MPI_COMM_DUP.

bosilca commented 1 year ago

This sentence should also cover MPI_COMM_DUP_WITH_INFO.

wgropp commented 1 year ago

Agreed.

GuillaumeMercier commented 1 year ago

On second thoughts, I disagree because of this sentence:

"MPI_COMM_DUP_WITH_INFO behaves exactly as MPI_COMM_DUP except that the hints provided by the argument info are associated with the output communicator newcomm."

Therefore, there is no need to repeat the sentence about error handlers for MPI_COMM_DUP_WITH_INFO. More precisely, if we add this sentence, then why not write something about topology, attributes, etc. too?

Thus I advocate leaving the text for MPI_COMM_DUP_WITH_INFO as it is now.

wgropp commented 1 year ago

Ah, good point. I prefer this (not adding more text) - all too often, we've added unnecessary text that duplicates information elsewhere in the standard.