Open jeffhammond opened 8 years ago
The Fortran Problems are resolved with MPI-3.0 through the use of
some-fortran-type, ASYNCHRONOUS :: shared-window-variable
As long as the accompanying compiler does not support Fortran TS 29113, the MPI constant MPI_ASYNC_PROTECTS_NONBLOCKING must be .FALSE. and the user has to issue additional calls to MPI_F_SYNC_REG for the shared-window-variable. This is correctly described in MPI-3.1 Example 11.21 on pages 468-469.
MPI_WIN_SYNC is especially needed for Fortran and has been taught to many Fortran users in the past.
Therefore, the precondition "This is particularly true with Fortran" of this ticket is completely wrong and the ticket should be withdrawn/closed.
This ticket is not exclusive to Fortran, so regardless of how you feel about its impact on Fortran programs, I am not going to close it.
See my comments on https://github.com/mpi-forum/mpi-issues/issues/41 for why the specific comments about Fortran are invalid.
This ticket is extremely specific to Fortran users. C users have C11 functionality for memory fences available; Fortran users do not. The clear effect of this ticket is to prevent Fortran users from using MPI shared-memory windows together with fast memory fences.
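For readers unfamiliar with the C11 functionality mentioned above, here is a minimal sketch of fence-based publication of plain data. It uses pthreads as a stand-in for two processes sharing a window; the names `buffer`, `flag`, `writer`, and `fenced_sum` are hypothetical, and nothing here is claimed to be how any MPI library implements MPI_WIN_SYNC.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Assumption: these globals stand in for memory in an MPI-3 shared window. */
static double buffer[4];   /* plain, non-atomic data */
static atomic_int flag;    /* publication flag */

static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 4; i++)
        buffer[i] = i + 1.0;
    /* Release fence: the stores to buffer become visible no later than
     * the relaxed store to flag that follows. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
    return NULL;
}

/* Starts the writer, waits for the flag, then sums the published data. */
double fenced_sum(void)
{
    pthread_t t;
    atomic_store(&flag, 0);
    int rc = pthread_create(&t, NULL, writer, NULL);
    assert(rc == 0);

    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0) {
        /* spin until the writer publishes */
    }
    /* Acquire fence: pairs with the writer's release fence, so the reads
     * of buffer below observe the writer's stores. */
    atomic_thread_fence(memory_order_acquire);

    double sum = 0.0;
    for (int i = 0; i < 4; i++)
        sum += buffer[i];

    pthread_join(t, NULL);
    return sum;
}
```

Calling `fenced_sum()` returns 10.0. Standard Fortran before the TS 29113 era offers no portable counterpart to `atomic_thread_fence`, which is the gap this comment is pointing at.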
Since MPI_WIN_SYNC cannot be implemented without relying upon implementation-specific behavior, ...
If I understood correctly, you expect some problems when implementing MPI_WIN_SYNC by using the C11 memory fence. Which problems do you see?
See http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf for why MPI_WIN_SYNC is at best a kludge for Fortran.
Yes, Jeff, I know this text, and this is exactly why MPI_F_SYNC_REG is only a workaround. It works with all compilers because they have learnt that pthreads are used and that they should be careful about optimizations that hurt pthreads. It is a workaround that works.
The TS 29113 approach with ASYNCHRONOUS buffer declaration is in principle the answer to that. It is significantly less intrusive than volatile because all "normal" optimizations (that do not hurt pthreads) continue to be allowed.
The C problems are less significant, because all C compilers must expect that all library routines may access all memory and that pthreads are used.
Best regards Rolf
The IBM compiler is allowed to generate thread-unsafe code when -qthreaded is not used (it can be implied by the use of compilers with the suffix _r).
The Intel compiler will properly implement Fortran 77 implicit SAVE semantics when OpenMP is not used, and these are thread-unsafe (as demonstrated when such code is called from an OpenMP parallel construct in C).
So here we have two examples where compilers are not automatically providing thread-safe code, so we should not assume this.
I get that ASYNCHRONOUS is the solution. My point is to try to make MPI-3 RMA work with Fortran codes written to standards shy of 2008+TS29113.
Problem
MPI-3 shared memory is semantically similar to threads in ways that make it difficult, if not impossible, to implement synchronization through library calls. This is because compilers cannot see out-of-thin-air stores and may rearrange the code in ways that make it incorrect while still complying with the base language.
This is particularly true with Fortran.
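A minimal sketch of the hazard, in C with pthreads standing in for two processes that share a window. The unsafe form appears only as a comment because it is a data race; the compiled code uses C11 acquire/release operations instead. All names (`payload`, `ready`, `producer`, `run_handoff`) are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Assumption: these globals stand in for memory in an MPI-3 shared window. */
static int payload;        /* ordinary data */
static atomic_int ready;   /* synchronization flag */

/* Hazard (not compiled, since it is a data race):
 *
 *     int ready_plain;
 *     while (!ready_plain) { }
 *
 * The compiler sees no store to ready_plain in this thread, so it may
 * load the value once and spin forever on the cached copy. A call into
 * an opaque library routine usually inhibits this in practice, but the
 * base language does not promise that. */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release store: the write to payload is ordered before the flag. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Runs one producer/consumer handoff and returns the value observed. */
int run_handoff(void)
{
    pthread_t t;
    payload = 0;
    atomic_store(&ready, 0);
    int rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);

    /* Acquire load: re-read on every iteration, and once it observes 1
     * the producer's write to payload is visible here. */
    while (!atomic_load_explicit(&ready, memory_order_acquire)) {
        /* spin */
    }
    int seen = payload;

    pthread_join(t, NULL);
    return seen;
}
```

Calling `run_handoff()` returns 42. The point of the sketch is that the ordering guarantee comes from the language's memory model rather than from a library call the compiler knows nothing about.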
Proposal
Since MPI_WIN_SYNC cannot be implemented without relying upon implementation-specific behavior, we will deprecate it and instead instruct users to rely upon the base language memory model and associated atomic operations and shared-memory synchronization features when using MPI-3 shared memory.
Changes to the Text
Describe the text changes here.
Impact on Implementations
Implementations do not have to try to implement a function that is arguably impossible to implement in a truly portable way.
Impact on Users
There are some users of MPI_WIN_SYNC right now, but their codes are obviously not entrenched, so moving to the better alternatives should not be too difficult.
References
Threads Cannot be Implemented as a Library (see also http://dx.doi.org/10.1145/1064978.1065042)