mpi-forum / mpi-issues

Tickets for the MPI Forum
http://www.mpi-forum.org/

deprecate MPI_WIN_SYNC #40

Open jeffhammond opened 8 years ago

jeffhammond commented 8 years ago

Problem

MPI-3 shared memory is semantically similar to threads in ways that make it difficult, if not impossible, to implement synchronization through library calls. This is because compilers cannot see stores made by other processes, which from their point of view are out-of-thin-air stores, and may rearrange the code in ways that make it incorrect but which are still compliant with the base language.

This is particularly true with Fortran.

Proposal

Since MPI_WIN_SYNC cannot be implemented without relying upon implementation-specific behavior, we will deprecate it and instead instruct users to rely upon the base language memory model and associated atomic operations and shared-memory synchronization features when using MPI-3 shared memory.
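To make "rely upon the base language memory model and associated atomic operations" concrete, here is a minimal sketch in C11 of a flag-and-payload handoff through shared memory. The type `shared_cell` and the functions `cell_init`, `cell_publish` and `cell_try_consume` are hypothetical names invented for this illustration; they are not part of MPI or of this proposal. The sketch assumes the cell lives in memory mapped by both processes (for example, memory obtained from MPI_WIN_ALLOCATE_SHARED) and that `atomic_int` is lock-free and address-free on the platform, which C11 recommends but does not guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cell placed in memory shared by producer and consumer. */
typedef struct {
    int payload;       /* ordinary data, written before the flag is set */
    atomic_int ready;  /* 0 = empty, 1 = payload is valid */
} shared_cell;

/* Called once, before either side uses the cell. */
void cell_init(shared_cell *c) {
    assert(c != NULL);
    c->payload = 0;
    atomic_init(&c->ready, 0);
}

/* Producer: write the payload, then publish it with a release store.
   The release store keeps the payload write from being ordered after it. */
void cell_publish(shared_cell *c, int value) {
    c->payload = value;
    atomic_store_explicit(&c->ready, 1, memory_order_release);
}

/* Consumer: an acquire load that observes ready == 1 also makes the
   payload write visible. Returns false if nothing was published yet. */
bool cell_try_consume(shared_cell *c, int *out) {
    if (atomic_load_explicit(&c->ready, memory_order_acquire) == 0) {
        return false;
    }
    *out = c->payload;
    return true;
}
```

Because the ordering is expressed with atomic operations the compiler understands, no opaque library call is needed to stop it from reordering the payload access around the flag access.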

Changes to the Text

Describe the text changes here.

Impact on Implementations

Implementations do not have to try to implement a function that is arguably impossible to implement in a truly portable way.

Impact on Users

There are some users of MPI_WIN_SYNC right now, but their codes are obviously not entrenched, so moving to the better alternatives should not be too difficult.

References

Threads Cannot be Implemented as a Library (see also http://dx.doi.org/10.1145/1064978.1065042)

RolfRabenseifner commented 8 years ago

The Fortran problems are resolved in MPI-3.0 through the use of

     some-fortran-type, ASYNCHRONOUS :: shared-window-variable

As long as the accompanying compiler does not support Fortran TS 29113, the MPI constant MPI_ASYNC_PROTECTS_NONBLOCKING must be .FALSE., and the user has to issue additional calls to MPI_F_SYNC_REG for the shared-window-variable. This is correctly described in MPI-3.1 Example 11.21 on pages 468-469.

MPI_WIN_SYNC is especially needed for Fortran and has been taught to many Fortran users in the past. Therefore, the premise of this ticket, "This is particularly true with Fortran", is completely wrong, and the ticket should be withdrawn/closed.

jeffhammond commented 8 years ago

This ticket is not exclusive to Fortran, so regardless of how you feel about its impact on Fortran programs, I am not going to close it.

See my comments on https://github.com/mpi-forum/mpi-issues/issues/41 for why the specific comments about Fortran are invalid.

RolfRabenseifner commented 8 years ago

This ticket is extremely specific to Fortran users. C users have C11 functionality for memory fences available; Fortran users do not. The clear effect of this ticket is to prevent Fortran users from using MPI shared-memory windows together with fast memory fences.

"Since MPI_WIN_SYNC cannot be implemented without relying upon implementation-specific behavior, ..."

If I understand correctly, you expect some problems when implementing MPI_WIN_SYNC by using the C11 memory fence. Which problems do you see?
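For readers following the question, the idea of building a synchronization call on the C11 memory fence can be sketched as follows. This is only an illustration under stated assumptions: `sketch_win_sync`, `fenced_cell`, `fenced_init`, `fenced_write` and `fenced_read` are hypothetical names, and nothing here is taken from a real MPI implementation. Note that in the C11 model a fence only synchronizes through atomic operations, so the flag is still accessed with (relaxed) atomics; a fence placed between plain, non-atomic accesses to the flag would not remove the data race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence-based synchronization call.
   Assumption for illustration only, not a real MPI_Win_sync. */
void sketch_win_sync(void) {
    atomic_thread_fence(memory_order_seq_cst);
}

/* Hypothetical cell in memory shared by writer and reader. */
typedef struct {
    int data;         /* ordinary, non-atomic payload */
    atomic_int flag;  /* 0 = not written yet, 1 = data is valid */
} fenced_cell;

void fenced_init(fenced_cell *c) {
    assert(c != NULL);
    c->data = 0;
    atomic_init(&c->flag, 0);
}

/* Writer: payload, then fence, then a relaxed store to the flag. */
void fenced_write(fenced_cell *c, int value) {
    c->data = value;
    sketch_win_sync();
    atomic_store_explicit(&c->flag, 1, memory_order_relaxed);
}

/* Reader: relaxed load of the flag, then fence, then the payload.
   Returns 0 if the flag has not been set yet, 1 otherwise. */
int fenced_read(fenced_cell *c, int *out) {
    if (atomic_load_explicit(&c->flag, memory_order_relaxed) == 0) {
        return 0;
    }
    sketch_win_sync();
    *out = c->data;
    return 1;
}
```

The fence orders the payload access relative to the flag access on each side, but the guarantee comes from the language memory model applied to the atomic flag, not from the function call boundary itself.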

jeffhammond commented 8 years ago

See http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf for why MPI_WIN_SYNC is at best a kludge for Fortran.

RolfRabenseifner commented 8 years ago

Yes, Jeff, I know this text, and this is exactly why MPI_F_SYNC_REG is only a workaround. It works with all compilers because they have learned that pthreads are used and that they should be careful about optimizations that hurt pthreads. It is a workaround that works.

The TS 29113 approach with ASYNCHRONOUS buffer declaration is in principle the answer to that. It is significantly less intrusive than volatile because all "normal" optimizations (that do not hurt pthreads) continue to be allowed.

The C problems are less significant, because all C compilers must expect that all library routines may access all memory and that pthreads are used.

Best regards Rolf

jeffhammond commented 8 years ago

The IBM compiler is allowed to generate thread-unsafe code when -qthreaded is not used (the option can be implied by the use of compilers with the suffix _r).

The Intel compiler will properly implement Fortran 77 implicit SAVE semantics when OpenMP is not used, and these are thread-unsafe (as demonstrated when such code is called from an OpenMP parallel construct in C).

So here we have two examples where compilers are not automatically providing thread-safe code, so we should not assume this.

I get that ASYNCHRONOUS is the solution. My point is to try to make MPI-3 RMA work with Fortran codes written to standards shy of 2008+TS29113.