Originally by RolfRabenseifner on 2014-08-13 02:51:17 -0500
This Fortran ticket contains part of #427, which was split into tickets #440 - #446.
It should be accepted by the Fortran / Language Binding chapter committee / WG before the Sep. 2014 meeting in Japan, or voted on there with a single errata vote.
Originally by jsquyres on 2014-09-05 10:19:18 -0500
"They may, however, be used in blocking MPI operations." -- is that guaranteed to always be correct?
I.e., a compiler does not have to copy to a contiguous buffer that MPI can handle (in a blocking MPI call), right?
In practice, I think most (all?) compilers do this. But I'm not sure we should make a blanket statement like this.
Originally by RolfRabenseifner on 2014-09-05 11:06:21 -0500
Replying to jsquyres:
"They may, however, be used in blocking MPI operations." -- is that guaranteed to always be correct?
Yes, because a blocking operation is finished when the call returns. The compiler may copy the non-contiguous data into a contiguous scratch buffer before executing the routine, copy the scratch buffer back to the original memory when the routine returns, and then remove the scratch buffer. Blocking routines have therefore been without problems since MPI-1.0. By the way, the functionality of the nonblocking version when MPI_SUBARRAYS_SUPPORTED equals .TRUE. is defined exactly based on the outcome of the blocking version.
There is one exception: split-collective ..._START routines may block, but they are treated as nonblocking everywhere because the I/O operation is not yet finished. Ticket #451 finally makes this explicit.
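The copy-in/copy-out behavior described above can be sketched in plain Fortran, without MPI. In this minimal example, `fill_buffer` is a hypothetical stand-in for a blocking MPI routine called through an implicit interface; it is not part of MPI. Whether the compiler really creates a scratch buffer is an implementation detail, but the visible result after the call returns should be the same either way.

```fortran
! Minimal sketch: a strided section passed to an assumed-size dummy
! argument through an implicit interface.
! fill_buffer is a hypothetical stand-in for a blocking MPI routine.
program blocking_copy_in_out
  implicit none
  real :: a(10)
  external :: fill_buffer

  a = 0.0

  ! The callee sees a contiguous buffer of 5 reals. A compiler will
  ! typically copy a(1:10:2) into a scratch buffer before the call and
  ! copy it back to a(1), a(3), ..., a(9) when the call returns.
  call fill_buffer(a(1:10:2), 5)

  print *, a

  ! Self-check: only the odd-indexed elements were written.
  if (any(a(1:10:2) /= 1.0) .or. any(a(2:10:2) /= 0.0)) then
    error stop 'unexpected result'
  end if
end program blocking_copy_in_out

! External procedure with an assumed-size dummy, as in mpif.h-style
! bindings when MPI_SUBARRAYS_SUPPORTED equals .FALSE.
subroutine fill_buffer(buf, n)
  implicit none
  integer :: n
  real :: buf(*)

  buf(1:n) = 1.0
end subroutine fill_buffer
```

Because the work is finished when `fill_buffer` returns, the copy back happens after all writes to the buffer. This is the property that a nonblocking routine lacks.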
Originally by jsquyres on 2014-09-05 11:46:24 -0500
No -- I wasn't asking about the existence of the buffer during the call. I was asking about whether the buffer was guaranteed to be contiguous, because compilers can copy to a contiguous buffer, but they don't have to.
Originally by RolfRabenseifner on 2014-09-06 00:46:38 -0500
A few things are coming together:
MPI-1.1 and later never disallowed the handover of strided subarrays as buffers or other array arguments to blocking MPI routines.
If the MPI library is written without explicit interfaces (and inlining), then the compiler must do this copy, because with implicit interfaces only "assumed size" arrays can be expressed.
Therefore, in blocking MPI routines, handing over a strided subarray was never a problem and was never forbidden by the MPI standard. It was never intended to change this.
I'll check where the Fortran standard guarantees that strided arrays are allowed for assumed-size dummy arguments of implicit interfaces.
Originally by RolfRabenseifner on 2014-09-06 15:15:22 -0500
The question is: where in the Fortran 2008 standard is it guaranteed that strided arrays (with triplet or vector subscripts) can be an actual argument to assumed-size (i.e., declared with DIMENSION(*)) dummy arguments in an implicit interface?
Section 12.5 of the [Fortran 2008] standard specifies the form of a procedure reference; in particular R1223 in that section specifies what the form of an actual argument must be. The base rules do not depend on whether the interface is explicit or implicit. Section 12.5.2 then has additional rules that assure correct argument correspondence (e.g. the actual corresponding to an ALLOCATABLE dummy must also be ALLOCATABLE etc.); again these rules do not depend on whether the interface is explicit or not.
So my answer to your first question is that it is permitted for a non-contiguous object (strided array or a vector-subscripted object) to appear as an actual argument corresponding to a dummy in an implicit interface procedure.
How is it "guaranteed" that the compiler must copy such an actual strided array into a contiguous scratch array prior to the call and copy the scratch data back to the original actual argument at the end of the call?
I think this guarantee is implied by the fact that an assumed size or explicit shape array is a simply contiguous array designator and therefore a contiguous object (6.5.4, esp. para 2). Therefore a corresponding non-contiguous actual must be compactified during argument association. Note that the standard does not specify how argument association is done, but it is hard for me to see how to get around doing this compactification.
As an aside: The converse situation of needing to avoid copy-in/out in certain situations is presently not really well defined. This applies to both ASYNCHRONOUS dummy arguments and argument association for coarrays. However, since the intent is reasonably clear I expect that these issues will be resolved in the next iteration of
If a future compiler does not implement the passing of strided subarrays to blocking routines in mpif.h with such scratch buffer copying, together with call-by-reference to this scratch buffer, then such a compiler will not be an accompanying compiler to MPI according to the rules in MPI-3.0 Section 17.1.7.
Originally by jsquyres on 2014-12-05 09:13:46 -0600
I'd make one trivial change:
which also includes persistent request, and split collectives
Remove the comma (there's no need for a comma in a list of 2 things).
Originally by jsquyres on 2014-12-09 16:22:28 -0600
Fortran WG: Comment 9 is a ticket 0 change and can be applied after the ticket is voted in.
Originally by jsquyres on 2014-12-09 17:03:32 -0600
Fortran WG: the "or together with MPI_BOTTOM" phrase is meaningless. We removed it.
Originally by jhammond on 2014-12-09 17:10:06 -0600
"subscript triplets" is not standard Fortran terminology.
Originally by RolfRabenseifner on 2014-12-09 17:14:52 -0600
Replying to jhammond:
"subscript triplets" is not standard Fortran terminology.
The wording comes directly from the Fortran standard, e.g. Fortran2008_Final_Draft_international_Standard_N1830.pdf, e.g. on page 96:
"The upper bound shall not be omitted from a subscript triplet in the last dimension."
Page 119:
"The rank of a part-ref that has a section subscript list is the number of subscript triplets and vector subscripts in the list."
Page 121:
"R621 subscript-triplet is [ subscript ] : [ subscript ] [ : stride ]"
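To make the quoted terminology concrete, here is a small illustrative program (the array names are made up for this example). It shows a subscript triplet, a vector subscript, the rank rule quoted from page 119, and the Fortran 2008 intrinsic `is_contiguous`, assuming a compiler that supports that intrinsic.

```fortran
! Illustrative sketch of section subscripts; a and b are example arrays.
program section_subscripts
  implicit none
  real :: a(10), b(4, 6, 8)
  integer :: i

  a = [(real(i), i = 1, 10)]
  b = 0.0

  ! Subscript triplet [subscript] : [subscript] [: stride]
  ! selects the elements a(1), a(3), a(5), a(7), a(9).
  print *, a(1:10:2)

  ! Vector subscript: selects a(2), a(4), a(8).
  print *, a([2, 4, 8])

  ! One plain subscript, one triplet, one vector subscript:
  ! the section has rank 2 (one triplet plus one vector subscript).
  print *, size(shape(b(2, 1:6:2, [1, 8])))

  ! A unit-stride triplet is contiguous, a stride-2 triplet is not.
  print *, is_contiguous(a(1:5)), is_contiguous(a(1:10:2))
end program section_subscripts
```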
Originally by jsquyres on 2014-12-11 14:32:58 -0600
Attachment added: mpi31-report-ticket-441.pdf
(2582.9 KiB)
PDF generated for review after adding the ticket text to the MPI doc
Originally by jsquyres on 2014-12-11 14:33:41 -0600
PDF containing the changes has been attached.
The new text is on page 626, lines 37-40.
Originally by longb on 2014-12-11 15:07:50 -0600
The new paragraph at page 626 lines 37-40 is correctly transcribed from the Description of this ticket. The content looks right.
Originally by RolfRabenseifner on 2014-12-16 05:27:48 -0600
PDF Review: all correctly transcribed, but at the meeting, we detected that the comma in page 626 line 40
"collectives), may"
is wrong and should be removed.
Originally by jsquyres on 2014-12-16 07:05:12 -0600
Comma removed.
Originally by RolfRabenseifner on 2014-08-12 10:29:41 -0500
Description
The additional information is for consistency reasons. Level is ticket-0.
History
Detected while checking the cross-references from Pt-to-Pt, 1-sided, and I/O chapter.
Extended Scope
None. (No need to add these changes to the errata document.)
Proposed Solution
MPI-3.0 Section 17.1.12, page 627, lines 22-42, reads:
If MPI_SUBARRAYS_SUPPORTED equals .FALSE.:
Implicit in MPI is the idea of a contiguous chunk of memory accessible through a linear address space. MPI copies data to and from this memory. An MPI program specifies the location of data by providing memory addresses and offsets. In the C language, sequence association rules plus pointers provide all the necessary low-level structure.
In Fortran, array data is not necessarily stored contiguously. For example, the array section A(1:N:2) involves only the elements of A with indices 1, 3, 5, . . . . The same is true for a pointer array whose target is such a section. Most compilers ensure that an array that is a dummy argument is held in contiguous memory if it is declared with an explicit shape (e.g., B(N)) or is of assumed size (e.g., B(*)). If necessary, they do this by making a copy of the array into contiguous memory.(1)
Because MPI dummy buffer arguments are assumed-size arrays if MPI_SUBARRAYS_SUPPORTED equals .FALSE., this leads to a serious problem for a nonblocking call: the compiler copies the temporary array back on return but MPI continues to copy data to the memory that held it. For example, consider the following code fragment:
real a(100)
call MPI_IRECV(a(1:100:2), MPI_REAL, 50, ...)
but should read:
If MPI_SUBARRAYS_SUPPORTED equals .FALSE.:
In this case, the use of Fortran arrays with subscript triplets as actual choice buffer arguments in any nonblocking MPI operation (which also includes persistent request, and split collectives), may cause undefined behavior. They may, however, be used in blocking MPI operations.
Implicit in MPI is the idea of a contiguous chunk of memory accessible through a linear address space. MPI copies data to and from this memory. An MPI program specifies the location of data by providing memory addresses and offsets. In the C language, sequence association rules plus pointers provide all the necessary low-level structure.
In Fortran, array data is not necessarily stored contiguously. For example, the array section A(1:N:2) involves only the elements of A with indices 1, 3, 5, . . . . The same is true for a pointer array whose target is such a section. Most compilers ensure that an array that is a dummy argument is held in contiguous memory if it is declared with an explicit shape (e.g., B(N)) or is of assumed size (e.g., B(*)). If necessary, they do this by making a copy of the array into contiguous memory.(1)
Because MPI dummy buffer arguments are assumed-size arrays if MPI_SUBARRAYS_SUPPORTED equals .FALSE., this leads to a serious problem for a nonblocking call: the compiler copies the temporary array back on return but MPI continues to copy data to the memory that held it. For example, consider the following code fragment:
real a(100)
call MPI_IRECV(a(1:100:2), MPI_REAL, 50, ...)
Remark for the Fortran WG: "may cause" is used instead of "will cause", because in the case of vector subscripts that are simply contiguous, there should not be a conflict.
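For readers who hit the nonblocking case from the code fragment above, one common way to avoid the copy is to pass the whole array and describe the stride with a derived datatype. The following is only a sketch under assumptions that are not part of this ticket: it uses the `mpi` module, at least two processes, tag 0, and hypothetical names such as `strided_type`.

```fortran
! Sketch only: receive into a(1), a(3), ..., a(99) with a nonblocking
! call without passing a strided section. Program and variable names
! are illustrative and not taken from the ticket.
program irecv_strided_sketch
  use mpi
  implicit none
  real :: a(100)
  integer :: rank, nprocs, ierr, strided_type, request
  integer :: status(MPI_STATUS_SIZE)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  ! 50 blocks of 1 REAL each, block starts 2 elements apart.
  call MPI_TYPE_VECTOR(50, 1, 2, MPI_REAL, strided_type, ierr)
  call MPI_TYPE_COMMIT(strided_type, ierr)

  a = 0.0
  if (nprocs >= 2) then
    if (rank == 0) then
      ! The actual argument is the whole, contiguous array a, so the
      ! compiler has no reason to create a temporary copy.
      call MPI_IRECV(a, 1, strided_type, 1, 0, MPI_COMM_WORLD, &
                     request, ierr)
      call MPI_WAIT(request, status, ierr)
    else if (rank == 1) then
      a = 1.0
      ! 50 contiguous REALs match the type signature of strided_type.
      call MPI_SEND(a, 50, MPI_REAL, 0, 0, MPI_COMM_WORLD, ierr)
    end if
  end if

  call MPI_TYPE_FREE(strided_type, ierr)
  call MPI_FINALIZE(ierr)
end program irecv_strided_sketch
```

With this layout MPI itself writes to every second element of `a`, so the result does not depend on whether or when the compiler copies a temporary array back.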
Alternative Solutions
None.
Impact on Implementations
None required.
Impact on Applications / Users
None.
Entry for the Change Log
None.