MPI Accumulate operation restrictions

mpiforumbot commented 8 years ago

Originally by balaji on 2014-03-14 00:20:15 -0500

In MPI-3, the default value for the accumulate_ops info key is same_op_no_op. This means that concurrent accumulate operations to the same target location must all use the same operation or MPI_NO_OP; using different operations concurrently is erroneous.

Proposal:

Make the default value of accumulate_ops empty or none, meaning that concurrent accumulate operations to the same target with different ops are allowed. In this case, the MPI implementation might need to use mutexes to provide atomicity across all operations.
Provide the info values same_op, same_op_no_op, and same_op_no_op_replace, which let the user restrict the kinds of concurrent accumulate operations that can occur at the target. The MPI implementation can use these hints to replace mutexes with hardware atomics and so optimize accumulate operations.
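
The trade-off between the two strategies can be sketched in plain C11. This is an illustration only, not MPI code and not a description of how any MPI library is implemented: the names `acc_op`, `target_cell`, `accumulate_locked`, and `accumulate_sum_atomic` are hypothetical, and a spinlock built on `atomic_flag` stands in for the mutex.

```c
#include <stdatomic.h>

/* Hypothetical operation codes, loosely mirroring MPI_SUM, MPI_MAX, MPI_REPLACE. */
typedef enum { ACC_SUM, ACC_MAX, ACC_REPLACE } acc_op;

/* One target location guarded by a lock (a stand-in for a mutex). */
typedef struct {
    atomic_flag lock;
    long value;
} target_cell;

/* General path: any mix of operations is safe because every update
 * to the location takes the same lock. */
void accumulate_locked(target_cell *cell, acc_op op, long operand)
{
    while (atomic_flag_test_and_set_explicit(&cell->lock, memory_order_acquire)) {
        /* spin until the lock is released */
    }
    switch (op) {
    case ACC_SUM:     cell->value += operand; break;
    case ACC_MAX:     if (operand > cell->value) cell->value = operand; break;
    case ACC_REPLACE: cell->value = operand; break;
    }
    atomic_flag_clear_explicit(&cell->lock, memory_order_release);
}

/* Restricted path: if all concurrent updates are known to be sums,
 * a single atomic fetch-and-add suffices and no lock is needed. */
void accumulate_sum_atomic(atomic_long *cell, long operand)
{
    atomic_fetch_add(cell, operand);
}
```

The locked path is correct for any combination of operations but serializes every update. The atomic path is only valid when the implementation can assume that nothing else touches the location through a different mechanism, which is the kind of guarantee a hint such as same_op would provide.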

Backward Compatibility:

This proposal is backward compatible with both MPI-2 and MPI-3. From the user's perspective it provides more relaxed semantics than either, while still allowing the same efficiency when appropriate info key values are given.

Impact on Implementations:

Implementations will need to support the default case of multiple concurrent accumulates with different ops at the same target, while maintaining atomicity.

mpiforumbot commented 8 years ago

Originally by jhammond on 2014-03-14 09:30:35 -0500

Relaxing the same-operation constraint by default doesn't preclude a hardware implementation since one can emulate any atomic with compare-and-swap, albeit very inefficiently in the contended case. If one does not have compare-and-swap in hardware, it seems reasonable to assume that the other operations won't be available in hardware either and thus the hardware-only scenario is not relevant.
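
As a rough illustration of the compare-and-swap argument, the following C11 sketch applies an arbitrary binary operator atomically using nothing but a CAS loop. The names `binary_op`, `op_max`, `op_prod`, and `cas_accumulate` are hypothetical and chosen for this example; this is not taken from any MPI implementation.

```c
#include <stdatomic.h>

/* Hypothetical signature for a binary reduction operator on long. */
typedef long (*binary_op)(long current, long operand);

long op_max(long current, long operand)  { return operand > current ? operand : current; }
long op_prod(long current, long operand) { return current * operand; }

/* Apply an arbitrary operator atomically using only compare-and-swap.
 * Returns the value that was stored before the update. */
long cas_accumulate(atomic_long *target, binary_op op, long operand)
{
    long observed = atomic_load(target);
    long desired;
    do {
        desired = op(observed, operand);
        /* On failure, observed is refreshed with the current value
         * and the operator is recomputed on the next iteration. */
    } while (!atomic_compare_exchange_weak(target, &observed, desired));
    return observed;
}
```

Every failed exchange forces a retry, so the number of iterations grows with the number of writers competing for the same location. That is the inefficiency in the contended case noted above.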

mpiforumbot commented 8 years ago

Originally by rsthakur on 2014-03-14 11:03:41 -0500

As far as I know, MPI 2.2 allowed only same op. MPI 3.0 changed this by allowing same_op_no_op, which is the default unless the info key is changed to same_op (which gives MPI 2.2 behavior).

MPI 2.2, pg 365, ln 29-33, says

mpiforumbot commented 8 years ago

Originally by jhammond on 2014-06-25 11:00:21 -0500

FYI: This is related to #399

mpiforumbot commented 8 years ago

Originally by rsthakur on 2014-06-25 11:28:07 -0500

My comment above says there is no breakage from MPI-2 semantics, so the first paragraph of the ticket needs to be changed. The ticket is about relaxing further what was slightly relaxed in MPI-3.

mpiforumbot commented 8 years ago

Originally by gropp on 2014-12-10 13:00:07 -0600

The working group requests a compelling use case and a clear response to Rajeev's comment. Specifics are needed for both (1) the options provided and (2) the defaults.

mpiforumbot commented 8 years ago

Originally by jhammond on 2014-12-10 13:19:20 -0600

Replying to rsthakur:

> My comment above says there is no breakage from MPI-2 semantics, so the first paragraph of the ticket needs to be changed. The ticket is about relaxing further what was slightly relaxed in MPI-3.

I removed this comment. This feature can still be justified by usage needs.

mpiforumbot commented 8 years ago

Originally by gropp on 2015-04-04 08:35:16 -0500

This ticket is consistent with the Forum's current approach of favoring generality over performance, particularly in the defaults. I would add advice to users and implementers to that effect: users should specify accumulate_ops as tightly as possible to remain backward compatible in terms of performance, and implementers should pay attention to this info key if their hardware lets them optimize for the special case of certain operations.

mpiforumbot commented 8 years ago

Originally by jhammond on 2015-04-04 11:34:30 -0500

There was a ticket to allow the user to specify very explicitly what ops and types were to be used (#399), but I withdrew it in favor of this one.

Do you think it is worth revisiting the more explicit ticket or is the input too cumbersome?

mpiforumbot commented 8 years ago

Originally by gropp on 2015-04-04 12:25:32 -0500

I don't think there is a need for that one yet. The issue, as I understood it from the original discussions, is that in some cases, if some of the operations have to be done in software, then they may all need to be done in software in order to ensure the semantics. Having fine-grained control might allow the implementation to decide whether the hardware could handle them all, but without a clear example, I don't think it is worth adding at this time, especially since it could be added later.

mpiforumbot commented 8 years ago

Originally by rsthakur on 2015-06-03 13:48:15 -0500

From the June 2015 Forum meeting: Would like to see specific proposal text and discussion of potential performance issues.

MPI Accumulate operation restrictions #416