Request-based operations in MPI-3 RMA (MPI_Rput, MPI_Rget, MPI_Raccumulate) can be used to implement ARMCI's explicit-handle nonblocking ops.
MPI-3 RMA now separates local and remote completion, just as ARMCI does. We should use MPI_Win_flush_local to get ARMCI's local-completion semantics.
ARMCI will benefit greatly from MPI_Win_lock_all and from flush_local/flush/flush_all-based completion.
MPI_Compare_and_swap may or may not be a good idea for ARMCI mutexes. (Jim will likely say that spinning across the network is bad, but if we know contention is going to be low...)
Reported by jhammond on 14 May 2013 15:51 UTC