MPI and UCC define INPLACE semantics differently for some operations. For example, for REDUCE_SCATTER (V):
At rank i, UCC places the result of the reduction at that rank's offset = rcount * i * dt_size from the beginning of the buffer, while MPI places the result at offset = 0 (the very beginning of the buffer) on all ranks. The same applies to RSV.
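To make the difference concrete, here is a minimal sketch of the two offset rules for REDUCE_SCATTER. The helper names are hypothetical and exist only for illustration; they are not part of UCC or MPI, and the sketch assumes every rank receives the same rcount.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the result block at `rank`
 * under UCC-style INPLACE, i.e. rcount * rank * dt_size. */
static size_t rs_inplace_offset_ucc(size_t rank, size_t nranks,
                                    size_t rcount, size_t dt_size)
{
    assert(rank < nranks);
    return rcount * rank * dt_size;
}

/* Hypothetical helper: byte offset of the result block at `rank`
 * under MPI-style INPLACE. The result always starts at the
 * beginning of the buffer, regardless of rank. */
static size_t rs_inplace_offset_mpi(size_t rank, size_t nranks,
                                    size_t rcount, size_t dt_size)
{
    assert(rank < nranks);
    (void)rcount;
    (void)dt_size;
    return 0;
}
```

With 4 ranks, rcount = 4 and an 8-byte datatype, rank 2 would find its result 64 bytes into the buffer in UCC style and at byte 0 in MPI style.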
NOTE: this is not the case for Scatter and Scatterv, because MPI defines INPLACE in the UCC style for those ops. I.e., with INPLACE the root's own ("self") data is never moved.
So we want to add support for MPI-style INPLACE for the RS and RSV collectives. Proposal (discussed long ago): add the coll_args flag UCC_COLL_ARGS_FLAG_RS_INPLACE_MPI.
Once support is added, a check for the new API needs to be added at ompi/coll/ucc/
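A rough sketch of how the proposed flag might select between the two placements is shown below. This is only an illustration under stated assumptions: SKETCH_RS_INPLACE_MPI is a local stand-in for the proposed UCC_COLL_ARGS_FLAG_RS_INPLACE_MPI, its bit value is a placeholder, and the function is hypothetical rather than existing UCC code.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the proposed UCC_COLL_ARGS_FLAG_RS_INPLACE_MPI.
 * The bit position is a placeholder and is NOT taken from ucc.h. */
#define SKETCH_RS_INPLACE_MPI (UINT64_C(1) << 0)

/* Hypothetical helper: choose where the INPLACE REDUCE_SCATTER result
 * lands at `rank`, depending on whether the caller asked for MPI-style
 * placement through the flags mask. */
static size_t rs_inplace_result_offset(uint64_t flags, size_t rank,
                                       size_t rcount, size_t dt_size)
{
    if (flags & SKETCH_RS_INPLACE_MPI) {
        return 0; /* MPI style: result at the start of the buffer */
    }
    return rcount * rank * dt_size; /* UCC style: result at the rank's own block */
}
```

Keeping the default (flag not set) on the existing UCC placement would leave current UCC users unaffected, while a caller such as the Open MPI component could opt in once it detects the new API.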