mpi-forum / mpi-forum-historic

Migration of old MPI Forum Trac Tickets to GitHub. New issues belong on mpi-forum/mpi-issues.
http://www.mpi-forum.org

MPI_Aint addressing arithmetic #349

Closed mpiforumbot closed 8 years ago

mpiforumbot commented 8 years ago

Originally by jdinan on 2012-12-03 09:23:39 -0600


This ticket is part of a series: #349, #402, #404

Note: This ticket is now stale. Please refer to the PDF formal proposal attached to #349.

Problem Statement

MPI_Aint values are used to represent addresses, which are unsigned quantities. However, MPI_Aint is a signed integer type. Thus, arithmetic on MPI_Aint values can overflow, resulting in incorrect address values. The situation is exacerbated by Fortran's lack of unsigned integer types, and non-zero values of MPI_BOTTOM in Fortran that vary across processes, as permitted by the standard.

Impact

MPI Dynamic Window Displacements:

Window displacements are of type MPI_Aint, which is a signed integer type. Dynamic windows use memory addresses as displacements; these are unsigned quantities that can overflow an MPI_Aint when used in displacement arithmetic. Because signed integer representations are not standardized across C and Fortran, we need to provide a "safe" mechanism for performing arithmetic on dynamic window displacements. For example, the following code is potentially incorrect and definitely not portable:

MPI_Aint disp = &array[0];

disp++; /* This can overflow */

MPI_Get(..., disp);
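
The overflow concern can be illustrated without MPI. The following is a minimal sketch, assuming a 64-bit signed integer as a stand-in for MPI_Aint; the names `model_aint`, `signed_add_overflows`, and `checked_add` are hypothetical and are not part of MPI or of this proposal. It only shows that signed addition near the top of the range has to be guarded, whereas an address is naturally an unsigned quantity.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 64-bit MPI_Aint; the real type is defined by the MPI library. */
typedef int64_t model_aint;

/* Returns 1 if base + disp would overflow the signed type, 0 otherwise.
 * The comparisons are arranged so that they cannot overflow themselves. */
static int signed_add_overflows(model_aint base, model_aint disp) {
    if (disp > 0) return base > INT64_MAX - disp;
    if (disp < 0) return base < INT64_MIN - disp;
    return 0;
}

/* Adds only after the check has passed, so the signed addition is defined. */
static model_aint checked_add(model_aint base, model_aint disp) {
    assert(!signed_add_overflows(base, disp));
    return base + disp;
}
```

In this model, incrementing a value equal to INT64_MAX is flagged as an overflow, even though the corresponding unsigned address plus one is a perfectly ordinary address.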

Datatypes:

The standard currently provides no safe mechanism for performing arithmetic on addresses stored as MPI_Aints when building h-indexed datatypes.

Possible Solutions

Solution 1:

MPI_Get_address can already be used to safely convert an address to an MPI_Aint. We could add a function to convert an Aint to an address:

int MPI_Aint_to_address(MPI_Aint aint_addr, void **addr);

The downside to this approach is that it is not safe for heterogeneous platforms, where the size of a pointer is not the same at all processes. It also will not address the problem for Fortran users.

Solution 2:

Add a new type that supports displacement arithmetic, called MPI_Disp, which is an unsigned integer type that can hold an MPI_Aint address. We would also add functions to provide portable conversion:

int MPI_Get_disp(const void *location, MPI_Disp *disp);
int MPI_Aint_to_disp(MPI_Aint aint, MPI_Disp *disp);
int MPI_Disp_to_aint(MPI_Disp disp, MPI_Aint *aint);

This solution will not work for Fortran, which does not support unsigned integers.

Solution 3:

Add an Aint arithmetic routine that is allowed to be implemented as a macro:

MPI_Aint MPI_Aint_add(MPI_Aint base, MPI_Aint disp)

This function produces a new MPI_Aint value that is equivalent to the sum of the base and disp arguments. The value of base may be relative to a non-zero value of MPI_BOTTOM that is unknown at the process performing the call to MPI_Aint_add. The addition is performed in a manner that results in the correct MPI_Aint representation of the address, as if the process that originally produced base had called:

MPI_Get_address((char *) base + disp, &out_addr)
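
As a rough illustration of the "as if" semantics above, here is a minimal sketch of how such an addition could behave on one kind of platform. It assumes a flat 64-bit address space, a zero MPI_BOTTOM, and a 64-bit stand-in for MPI_Aint; `model_aint` and `model_aint_add` are hypothetical names, and this is not taken from any MPI implementation. The addition is carried out on the unsigned bit pattern, so it never triggers signed overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a 64-bit MPI_Aint; the real type is defined by the MPI library. */
typedef int64_t model_aint;

/* Reinterpret an unsigned bit pattern as the signed stand-in type.
 * memcpy avoids the implementation-defined unsigned-to-signed conversion. */
static model_aint bits_to_aint(uint64_t bits) {
    model_aint a;
    assert(sizeof a == sizeof bits);
    memcpy(&a, &bits, sizeof a);
    return a;
}

/* Model of the proposed addition: treat base as an unsigned address,
 * apply the signed displacement modulo 2^64, and store the result back. */
static model_aint model_aint_add(model_aint base, model_aint disp) {
    uint64_t addr = (uint64_t)base;   /* signed-to-unsigned is well defined */
    uint64_t off  = (uint64_t)disp;   /* negative disp wraps modulo 2^64 */
    return bits_to_aint(addr + off);
}
```

Under these assumptions, `model_aint_add(1000, -8)` gives 992, and adding 1 to INT64_MAX yields the bit pattern of the next address rather than undefined behavior. Platforms with a segmented address space or a non-zero MPI_BOTTOM would need a different realization, which is why the proposal leaves the mechanism to the implementation.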

This solution will work for C and Fortran, and is the proposed solution.

Proposal Text

The following text proposes incorporating Solution 3 into the standard. See also #402, which applies this solution to clean up the linked list RMA example.

Sec. 4.1.5

MPI_Aint MPI_Aint_add(MPI_Aint base, MPI_Aint disp)

INTEGER(KIND=MPI_ADDRESS_KIND) MPI_Aint_add(base, disp)
   INTEGER(KIND=MPI_ADDRESS_KIND), INTENT(IN) :: base, disp

INTEGER(KIND=MPI_ADDRESS_KIND) MPI_AINT_ADD(BASE, DISP)
   INTEGER(KIND=MPI_ADDRESS_KIND) BASE, DISP

This function produces a new MPI_Aint value that is equivalent to the sum of the base and disp arguments, where base represents a base address returned by a call to MPI_GET_ADDRESS and disp represents a signed integer displacement. The resulting address is valid only at the process that generated base, and it must correspond to a location in the same object referenced by base, as described in Section 4.1.2. The addition is performed in a manner that results in the correct MPI_Aint representation of the output address, as if the process that originally produced base had called:

MPI_Get_address((char *) base + disp, &result)

Rationale: MPI_Aint values are signed integers, while addresses are unsigned quantities. Direct arithmetic on addresses in MPI_Aint variables can cause overflows, resulting in undefined behavior.

Sec. 2.6.4

Add MPI_AINT_ADD to the list of functions that can be implemented as macros.

Change Log Entries

Section 4.1.5 on page 103.

The MPI_AINT_ADD and MPI_AINT_DIFF routines were added to provide a portable mechanism for manipulating addresses stored in MPI address integer types.

Section 4.1.14 on pages 126 and 130, and Example 11.12 on pages 470 and 472.

Examples were updated to use the new MPI_AINT_ADD and MPI_AINT_DIFF routines.

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-03-12 15:39:29 -0500


Comment on MPI_Disp approach, from Hubert Ritzdorf:

Concerning Ticket 349:

  • In Fortran, the displacement for dynamic windows may be signed, since MPI_BOTTOM in Fortran is not (void *) NULL.
  • Fortran doesn't have unsigned integers, thus you cannot use a Fortran version of MPI_Disp for arithmetic (such as disp++).
  • MPI displacements in general may be signed. They might be relative to any address.

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-03-12 15:42:56 -0500


Note from discussion on MPI_Aint_index: We must point out that the first argument will be treated as an address (unsigned) and the second will be treated as a signed displacement.

WG consensus from 3/12/2013 is to pursue option #3 (MPI_Aint_index()) above.

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-07-17 09:48:16 -0500


Pointer from David Goodell:

Might need references between the new text and this existing section.

MPI 3.0, pages 115-116

4.1.12 Correct Use of Addresses

Successively declared variables in C or Fortran are not necessarily stored at contiguous locations. Thus, care must be exercised that displacements do not cross from one variable to another. Also, in machines with a segmented address space, addresses are not unique and address arithmetic has some peculiar properties. Thus, the use of addresses, that is, displacements relative to the start address MPI_BOTTOM, has to be restricted. Variables belong to the same sequential storage if they belong to the same array, to the same COMMON block in Fortran, or to the same structure in C. Valid addresses are defined recursively as follows:

  1. The function MPI_GET_ADDRESS returns a valid address, when passed as argument a variable of the calling program.
  2. The buf argument of a communication function evaluates to a valid address, when passed as argument a variable of the calling program.
  3. If v is a valid address, and i is an integer, then v+i is a valid address, provided v and v+i are in the same sequential storage.

A correct program uses only valid addresses to identify the locations of entries in communication buffers. Furthermore, if u and v are two valid addresses, then the (integer) difference u - v can be computed only if both u and v are in the same sequential storage. No other arithmetic operations can be meaningfully executed on addresses.

The rules above impose no constraints on the use of derived datatypes, as long as they are used to define a communication buffer that is wholly contained within the same sequential storage. However, the construction of a communication buffer that contains variables that are not within the same sequential storage must obey certain restrictions. Basically, a communication buffer with variables that are not within the same sequential storage can be used only by specifying in the communication call buf = MPI_BOTTOM, count = 1, and using a datatype argument where all displacements are valid (absolute) addresses.
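
The "same sequential storage" rule can be sketched in plain C. The example below is only an illustration, with a hypothetical `struct particle` and helper `pos_displacement`: both addresses lie inside one structure, so their byte difference is well defined and matches what `offsetof` reports. The same subtraction between two separately declared variables would not be covered by the rules above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure; all members share one sequential storage. */
struct particle {
    int    id;
    double pos[3];
    char   tag;
};

/* Byte displacement of pos[k] from the start of the enclosing structure.
 * Both pointers refer to the same object, so the difference is defined. */
static ptrdiff_t pos_displacement(const struct particle *p, int k) {
    assert(k >= 0 && k < 3);
    return (const char *)&p->pos[k] - (const char *)p;
}
```

A displacement obtained this way is the kind of relative byte offset that is typically passed to a routine such as MPI_TYPE_CREATE_STRUCT.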

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-07-17 10:02:58 -0500


Attachment added: ticket_349.pptx (60.3 KiB) June, 2013 presentation to the Forum

mpiforumbot commented 8 years ago

Originally by RolfRabenseifner on 2013-12-04 16:53:36 -0600


I comment only on the proposed solution 3:

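As I understand, you differentiate between an

  • "MPI_Aint address" and an
  • "MPI_Aint displacement",

and with the following properties:

  • "address" seems to be an unsigned value stored somehow in a signed integer;
  • "displacement" is a normal signed address-sized integer, and therefore
  • normal language operations + - * / apply for such displacements.

You expect, that

  • address2 := address1 + displacement cannot be done with the normal language "+" operator.

But if this is valid, then I would expect that also

  • displacement := address2 - address1
    cannot be done with the normal language "-" operator.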

This means, all the examples with relative byte displacements (e.g., as input for MPI_TYPE_CREATE_STRUCT) and all the existing user applications with such differences "addr2-addr1" are invalid software.

Is this analysis correct?

mpiforumbot commented 8 years ago

Originally by RolfRabenseifner on 2013-12-04 17:13:21 -0600


General question:

If a system uses internal unsigned addresses from 0 to 2**64-1 and MPI_BOTTOM is defined as 2**63, then the signed addresses returned from MPI_GET_ADDRESS will range between -2**63 and 2**63-1, which should be the range of an 8-byte signed integer.

All operations + - * with useful results should work on those MPI_Aint addresses and displacements.

All existing examples and correct applications codes (i.e. using MPI_GET_ADDRESS and not the & operator) are still valid.

The only remaining question seems to be, whether the Routine proposed in solution 1 is really needed in application use-cases?
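
The mapping described in this comment can be written out as a small model. This is a sketch under the comment's own assumptions (64-bit unsigned internal addresses, MPI_BOTTOM at 2**63); `MODEL_BOTTOM`, `signed_view`, and `model_disp` are hypothetical names and do not correspond to any MPI routine. It shows that, in such a system, every unsigned address has a signed representation and that differences across the 2**63 boundary come out as expected.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of MPI_BOTTOM in the unsigned address space. */
#define MODEL_BOTTOM ((uint64_t)1 << 63)

/* Signed address relative to MODEL_BOTTOM, computed without overflow. */
static int64_t signed_view(uint64_t addr) {
    if (addr >= MODEL_BOTTOM)
        return (int64_t)(addr - MODEL_BOTTOM);   /* 0 .. 2^63-1 */
    return (int64_t)addr + INT64_MIN;            /* -2^63 .. -1 */
}

/* Signed displacement from one address to another, valid only when the
 * two addresses are less than 2^63 bytes apart. */
static int64_t model_disp(uint64_t from, uint64_t to) {
    uint64_t gap = (to >= from) ? (to - from) : (from - to);
    assert(gap <= (uint64_t)INT64_MAX);
    return signed_view(to) - signed_view(from);
}
```

For instance, two addresses 16 bytes on either side of `MODEL_BOTTOM` map to -16 and 16, and their displacement is 32. The model depends on this particular choice of MPI_BOTTOM and on a flat address space; it says nothing about systems that lack those properties.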

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-12-05 10:28:33 -0600


Replying to RolfRabenseifner:

I comment only on the proposed solution 3:

As I understand, you differentiate between an

  • "MPI_Aint address" and an
  • "MPI_Aint displacement",

and with the following properties:

  • "address" seems to be an unsigned value stored somehow in a signed integer;
  • "displacement" is a normal signed address-sized integer, and therefore
  • normal language operations + - * / apply for such displacements.

You expect, that

  • address2 := address1 + displacement cannot be done with the normal language "+" operator.

But if this is valid, then I would expect that also

  • displacement := address2 - address1
    cannot be done with the normal language "-" operator.

This means, all the examples with relative byte displacements (e.g., as input for MPI_TYPE_CREATE_STRUCT) and all the existing user applications with such differences "addr2-addr1" are invalid software.

Is this analysis correct?

The arithmetic is unsafe/nonportable when done on an address that is stored in an MPI_Aint. Some of the datatypes examples are doing arithmetic on the pointer before converting it to an MPI_Aint, which is ok. Some are definitely unsafe -- for example, pg. 126 line 4. I started ticket #402 to update the examples to use MPI_Aint_add. I would be grateful for any help with finding and fixing the bugs in the examples!

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-12-05 10:34:49 -0600


Replying to RolfRabenseifner:

General question:

If a system uses internal unsigned addresses from 0 to 2**64-1 and MPI_BOTTOM is defined as 2**63, then the signed addresses returned from MPI_GET_ADDRESS will range between -2**63 and 2**63-1, which should be the range of an 8-byte signed integer.

All operations + - * with useful results should work on those MPI_Aint addresses and displacements.

All existing examples and correct applications codes (i.e. using MPI_GET_ADDRESS and not the & operator) are still valid.

The only remaining question seems to be, whether the Routine proposed in solution 1 is really needed in application use-cases?

The issue with this approach is that programming languages don't define a particular integer or address representation. Most systems use two's complement and a flat (non-segmented) address space, which allows the datatypes/RMA examples and existing applications to work fine on those systems. However, MPI needs to provide an interface that allows applications to be portable to environments that don't provide these semantics.

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-12-11 11:30:25 -0600


Feedback from 11/12 reading: removed "BIND(C)", replaced "out_addr" with "result_addr"

Squyres is going to look into moving from a Fortran subroutine to a Fortran function. This would allow the function to be used in an expression, which is generally desirable.

mpiforumbot commented 8 years ago

Originally by jsquyres on 2013-12-12 06:16:10 -0600


Jim --

Per conversation in the room yesterday, I asked the Fortran WG about the Fortran bindings for this proposal. Our consensus is that these bindings should be FUNCTIONs, not SUBROUTINEs.

Therefore, the bindings should be:

INTEGER(KIND=MPI_ADDRESS_KIND) MPI_Aint_add(base, disp)
   INTEGER(KIND=MPI_ADDRESS_KIND), INTENT(IN) :: base, disp

INTEGER(KIND=MPI_ADDRESS_KIND) MPI_AINT_ADD(BASE, DISP)
   INTEGER(KIND=MPI_ADDRESS_KIND) BASE, DISP

mpiforumbot commented 8 years ago

Originally by jdinan on 2013-12-12 10:08:25 -0600


Updated the Fortran binding from a subroutine to a function, based on input from the Fortran WG.

mpiforumbot commented 8 years ago

Originally by jdinan on 2014-02-05 15:33:13 -0600


Attachment added: diff-349-402-404.pdf (9.2 KiB) Latex diff against r1731, for tickets #349, #402, and #404

mpiforumbot commented 8 years ago

Originally by jdinan on 2014-02-10 12:37:52 -0600


Attachment added: review-349-402-404.pdf (2698.5 KiB) Formal proposal for tickets #349, #402, and #404

mpiforumbot commented 8 years ago

Originally by jsquyres on 2014-05-14 13:24:33 -0500


Jim: the wording on the original 349 text for MPI_AINT_ADD:

...The resulting address is valid only at the process that generated base...

Means that it is valid to do something like this:

  void *foo = malloc(...);
  MPI_Aint a, b;
  MPI_Get_address(foo, &a);
  MPI_Send(&a, 1, MPI_AINT, ...);
  /* Receiver computes MPI_Aint_add(a, some_aint_offset) and sends back the result */
  MPI_Recv(&a, 1, MPI_AINT, ...);

Specifically, the text says that the result is valid only in the process that generated the base value. It does not say that the computation has to be performed in the same process that generated the base value.

I'm pretty sure that this is not what you want.

The same is true for MPI_AINT_DIFF.

mpiforumbot commented 8 years ago

Originally by jdinan on 2014-05-19 09:50:58 -0500


Replying to jsquyres:

Jim: the wording on the original 349 text for MPI_AINT_ADD:

...The resulting address is valid only at the process that generated base...

Specifically, the text says that the result is valid only in the process that generated the base value. It does not say that the computation has to be performed in the same process that generated the base value.

I'm pretty sure that this is not what you want.

The same is true for MPI_AINT_DIFF.

This is actually what we want. We need this semantic for dynamic windows, where the origin process calculates the target displacement (which is an address relative to the target's MPI_BOTTOM) or creates a datatype using MPI_Aint values provided by the target, and uses it to access data in the target process through a one-sided operation.

mpiforumbot commented 8 years ago

Originally by jsquyres on 2014-05-19 10:08:48 -0500


Ok; thanks for the clarification.

mpiforumbot commented 8 years ago

Originally by jsquyres on 2014-06-05 10:34:18 -0500


Ticket passed 1st vote: 16 yes, 0 no, 3 abstain

mpiforumbot commented 8 years ago

Originally by bosilca on 2015-02-03 12:59:08 -0600


r1913 in the golden copy.

mpiforumbot commented 8 years ago

Originally by RolfRabenseifner on 2015-02-10 13:57:22 -0600


#349 + #402 + #404 + #421 also done in Chap.2 Terms and Annex B Changelog --> Waiting for pdf review

mpiforumbot commented 8 years ago

Originally by gropp on 2015-02-12 13:33:25 -0600


Committed to one sided chapter in r1963