hzhou commented 3 years ago

Problem

It is not clear from the current text whether a heterogeneous datatype has restrictions, such as whether its displacements must be a multiple of the element size, or only a multiple of the alignment, or whether they can be arbitrary. From an implementation point of view, we would certainly like some restrictions. For example, if we can assume that displacements are a multiple of the element size, we can calculate each element address directly and reach optimal performance. Otherwise, we are forced to add branches and, worse, make temporary copies.

For example, can we call MPI_TYPE_CREATE_HINDEXED_BLOCK(4, 1, {0, 20, 40, 60}, MPI_LONG_DOUBLE, &new_type)? Note that on a typical 32-bit system, sizeof(long double) = 12 and alignof(long double) = 4.

Without restrictions, it seems that even displacements of {1, 21, 41, 61} should be allowed.
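
To make the three candidate rules concrete, here is a minimal sketch in plain C (no MPI calls) that classifies a displacement list. The enum and the function `classify_displacements` are hypothetical names used only for illustration; the element size and alignment are passed in as parameters so the 32-bit values quoted above (12 and 4) can be tried on any machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a byte-displacement list; not part of MPI. */
enum disp_class {
    DISP_SIZE_MULTIPLE, /* every displacement is a multiple of the element size */
    DISP_ALIGNED,       /* every displacement is a multiple of the alignment,
                           but at least one is not a multiple of the size */
    DISP_ARBITRARY      /* at least one displacement is misaligned */
};

static enum disp_class classify_displacements(const ptrdiff_t *disps, size_t n,
                                              size_t elem_size, size_t elem_align)
{
    assert(elem_size > 0 && elem_align > 0);

    enum disp_class result = DISP_SIZE_MULTIPLE;
    for (size_t i = 0; i < n; i++) {
        /* A single misaligned displacement makes the whole list arbitrary. */
        if (disps[i] % (ptrdiff_t) elem_align != 0)
            return DISP_ARBITRARY;
        /* Aligned, but not addressable as index * elem_size. */
        if (disps[i] % (ptrdiff_t) elem_size != 0)
            result = DISP_ALIGNED;
    }
    return result;
}
```

With elem_size = 12 and elem_align = 4, the list {0, 20, 40, 60} is classified as DISP_ALIGNED and {1, 21, 41, 61} as DISP_ARBITRARY, which are the two cases the question is about.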

We had an internal debate and currently hold opposing views. We would like the Forum to clarify and give us a verdict.

Proposal

Clarify the text of MPI_TYPE_CREATE_HINDEXED, MPI_TYPE_CREATE_HINDEXED_BLOCK, MPI_TYPE_CREATE_HVECTOR, and MPI_TYPE_CREATE_STRUCT regarding any restrictions (or lack thereof) on the parameters. Add rationales where there is potential for controversy.

Changes to the Text

See Proposal

Impact on Implementations

Opportunity for better performance.

Impact on Users

For applications that use basic datatypes with sizeof(type) == alignof(type) and whose user buffers all observe correct alignment, there is no impact. Applications that exploit misaligned data would need to be fixed. If we require displacements to always be a multiple of the basic element size, some "legitimate" use cases may need workarounds.

References

None.

bosilca commented 3 years ago

Why would you need a specialized function for handling heterogeneous types if the displacements are always a multiple of the type size? From my understanding, the hetero support removes all restrictions on alignment, allowing you to define datatypes that behave like a packed struct in C.

hzhou commented 3 years ago

Why would you need a specialized function for handling heterogeneous types if the displacements are always a multiple of the type size? From my understanding, the hetero support removes all restrictions on alignment, allowing you to define datatypes that behave like a packed struct in C.

If displacements are always a multiple of the type size and always aligned, we may be able to bypass memcpy, for example, for potentially better performance.

bosilca commented 3 years ago

That's not exactly my question. I was wondering: if the displacement is always a multiple of the type size, why would you need the hetero MPI API? You could have created the datatype using the normal datatype API.

hzhou commented 3 years ago

That's not exactly my question. I was wondering: if the displacement is always a multiple of the type size, why would you need the hetero MPI API? You could have created the datatype using the normal datatype API.

We are posing the question from the implementation (MPICH) point of view. We would like the MPI standard to impose restrictions on users even for heterogeneous datatypes, so that we can implement code with better performance. I guess you are asking whether it is okay to let users assume/accept that a heterogeneous datatype will perform worse than its non-hetero counterpart, right?

EDIT: let me acknowledge that you have provided one interpretation that --

a heterogeneous datatype should not have any (alignment) restrictions on its parameters.

EDIT: for completeness, the two additional views are --

The basic elements in a (buf, count, datatype) triple must always observe the alignment imposed by the compiler; that is, we are allowed to access each element directly as its basic type.
The displacement of each basic element must be a multiple of the size of the basic element, so that we can index the element directly if we pre-process the datatype.
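
The performance trade-off behind these views can be sketched in plain C. This is only an illustration under stated assumptions, not MPICH code: the function name `gather_ints` is hypothetical, and it stands in for whatever an implementation does when packing elements of a derived datatype. When every displacement is a multiple of sizeof(int) and the base is aligned, the elements can be indexed directly; otherwise each element is copied bytewise with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: gather n ints found at byte displacements disps[]
 * relative to base into the contiguous array out[].
 * Returns 1 if the direct-indexing fast path was taken, 0 otherwise. */
static int gather_ints(const void *base, const ptrdiff_t *disps, size_t n, int *out)
{
    assert(base != NULL && disps != NULL && out != NULL);

    /* The fast path requires an aligned base and size-multiple displacements. */
    int fast = ((uintptr_t) base % _Alignof(int)) == 0;
    for (size_t i = 0; fast && i < n; i++)
        fast = (disps[i] % (ptrdiff_t) sizeof(int)) == 0;

    if (fast) {
        /* Direct typed access: displacement / size is an array index. */
        const int *typed = (const int *) base;
        for (size_t i = 0; i < n; i++)
            out[i] = typed[disps[i] / (ptrdiff_t) sizeof(int)];
    } else {
        /* Conservative path: bytewise copy is valid at any displacement. */
        const char *bytes = (const char *) base;
        for (size_t i = 0; i < n; i++)
            memcpy(&out[i], bytes + disps[i], sizeof(int));
    }
    return fast;
}
```

The branch on `fast` is the kind of check an implementation could do once when the datatype is committed rather than on every transfer, if the standard allowed it to reject or special-case the second kind of displacement list.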

pavanbalaji commented 3 years ago

I don't think @bosilca's question is fully answered in the comments above. Let me try to clarify.

I must admit I'm not sure what a heterogeneous datatype is. Is that another term for user-defined datatypes? I'll assume so for the rest of this discussion.

Let me simplify the question (we can get to the more complex question later). Is this a valid datatype:

MPI_TYPE_CREATE_HINDEXED_BLOCK(4, 1, {0, 17, 43, 97}, MPI_INT, &new_type)

The MPI standard does not seem to restrict this. But, allowing the user to create this datatype would restrict some of the optimizations that I can do inside the MPI implementation.

If the answer to this is "no, this is not allowed", then the more complex question can be asked. If the answer to this is "it is allowed", then the more complex question is irrelevant.

rsth commented 3 years ago

Heterogeneous refers to the "h" versions of the functions. MPI doesn't say anything other than the displacements are specified in bytes, so the above example should be valid.

pavanbalaji commented 3 years ago

MPI doesn't say anything other than the displacements are specified in bytes, so the above example should be valid.

That was what we were afraid of. Even though the user would likely never use such a type, supporting it means that we need to be conservative inside MPI, and thus lose performance.

wgropp commented 3 years ago

I don’t see that. The H types are for generality - for the case where byte displacements are needed. If such displacements are not required, then (for the most part) there are other datatype constructors that can be used, and optimized. In the cases where an H constructor needs to be used but there are still useful alignment properties, that might suggest a place for a new constructor.

Bill

William Gropp Director and Chief Scientist, NCSA Thomas M. Siebel Chair in Computer Science University of Illinois Urbana-Champaign


hzhou commented 3 years ago

Since it appears clear that there are no intended restrictions on H datatypes, I am closing this issue.

pavanbalaji commented 3 years ago

For completeness, I should point out that this issue is orthogonal to heterogeneous datatypes, and that framing seems to be misleading the conversation. The real question is which buffer the user wants us to access as which datatype. For example, is the code below correct?

char *buf = (char *) malloc(100);
int *buf2 = (int *) (void *) (buf + 3);
MPI_Send(buf2, 10, MPI_INT, ...);

Here, the user is asking us to treat buf2 as an array of integers, but it's not aligned to an integer boundary. I believe using a derived datatype with unaligned displacements would be equivalent to the above example.
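
As an illustration of what "being conservative" could look like for such a buffer, here is a hedged sketch in plain C. The helpers `is_aligned` and `aligned_int_view` are hypothetical and not part of any MPI implementation: they test whether a pointer satisfies the alignment of int and, if it does not, fall back to a temporary aligned copy made with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: does p satisfy the given alignment? */
static int is_aligned(const void *p, size_t align)
{
    assert(align > 0);
    return ((uintptr_t) p % align) == 0;
}

/* Hypothetical helper: return a pointer that may be read as int[count].
 * If p is already aligned for int, p itself is returned and *owned is 0.
 * Otherwise the bytes are copied into a malloc'ed (hence aligned) buffer,
 * *owned is set to 1, and the caller must free the result.
 * Returns NULL if the temporary copy cannot be allocated. */
static int *aligned_int_view(void *p, size_t count, int *owned)
{
    assert(p != NULL && owned != NULL);

    if (is_aligned(p, _Alignof(int))) {
        *owned = 0;
        return (int *) p;
    }

    int *copy = malloc(count * sizeof(int));
    if (copy != NULL)
        memcpy(copy, p, count * sizeof(int));
    *owned = (copy != NULL);
    return copy;
}
```

For a pointer such as buf + 3 from the example, the second branch is taken on platforms where alignof(int) is greater than 1, which is the temporary-copy cost raised at the start of this issue.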

hzhou commented 3 years ago

Alright, I am leaving it open (for now) for the question raised in https://github.com/mpi-forum/mpi-issues/issues/328#issuecomment-724336564.

mpi-forum / mpi-issues

Clarify restrictions on heterogeneous derived datatypes #328
