mpiwg-large-count / large-count-issues

Issues related to large count features in MPI.
MIT License

New quad precision predefined data types #3

Open jeffhammond opened 8 years ago

jeffhammond commented 8 years ago

This was https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/318.

Description

GCC has added the __float128 data type (using libquadmath) and Intel defines _Quad, both of which implement IEEE-754 2008 binary128 "quad precision". These types complement the MPI_REAL16 and MPI_COMPLEX32 Fortran types (added to MPI-2.2 by ticket #64). These quad precision types can already be used (on homogeneous architectures) with point-to-point operations by users defining datatypes and custom MPI_Ops, but one-sided operations such as MPI_Accumulate require predefined types. This proposal would add the following optional quad precision types.

MPI datatype           C type
MPI_QUAD               __float128 / _Quad
MPI_C_QUAD_COMPLEX     __float128 _Complex / _Quad _Complex
MPI_QUAD_INT           struct {__float128 var; int loc;}

This is a correction to the standard, because the MPI standard should define predefined datatype names corresponding to the set of types commonly provided by the accompanying compilers (e.g., gcc).

Because other optional datatypes are not available for MPI_MAXLOC and MPI_MINLOC, the proposal of MPI_QUAD_INT is directly implied. The big difference is that in Fortran there is a standardized way, via {{{selected_real_kind}}}, to specify a 16-byte REAL. In C, this standardized method is missing, and users have to rely on optionally provided types such as {{{__float128}}} and {{{_Quad}}}. Therefore, Part 2 provides the needed datatypes MPI_QUAD_INT and MPI_FLOAT128_INT.

History

Extended Scope

None.

Proposed Solution, Part 1

'''MPI-2.2 Annex A.1.1, page 517, lines 25-29 read'''

||||= Optional datatypes (Fortran) =||= Fortran types =||
||C type: MPI_Datatype ||C++ type: MPI::Datatype || ||
||Fortran type: INTEGER || || ||
||MPI_DOUBLE_COMPLEX ||MPI::F_DOUBLE_COMPLEX ||DOUBLE COMPLEX ||
||MPI_INTEGER1 ||MPI::INTEGER1 ||INTEGER*1 ||
||... ||... ||... ||

'''but should read'''

||||= Optional datatypes (C/C++) =||= C/C++ types =||
||C type: MPI_Datatype ||C++ type: MPI::Datatype || ||
||Fortran type: INTEGER || || ||
||MPI_QUAD ||(use C datatype handle) ||_Quad ||
||MPI_C_QUAD_COMPLEX ||(use C datatype handle) ||_Quad _Complex ||
||MPI_FLOAT128 ||(use C datatype handle) ||__float128 ||
||MPI_C_FLOAT128_COMPLEX ||(use C datatype handle) ||__float128 _Complex ||

||||= Optional datatypes (Fortran) =||= Fortran types =||
||C type: MPI_Datatype ||C++ type: MPI::Datatype || ||
||Fortran type: INTEGER || || ||
||MPI_DOUBLE_COMPLEX ||MPI::F_DOUBLE_COMPLEX ||DOUBLE COMPLEX ||
||MPI_INTEGER1 ||MPI::INTEGER1 ||INTEGER*1 ||
||... ||... ||... ||

'''MPI-2.2 Section 3.2.2, page 27, lines 33-39 read'''

MPI requires support of these datatypes, which match the basic datatypes of Fortran and ISO C. Additional MPI datatypes should be provided if the host language has additional data types: MPI_DOUBLE_COMPLEX for double precision complex in Fortran declared to be of type DOUBLE COMPLEX; MPI_REAL2, MPI_REAL4 and MPI_REAL8 for Fortran reals, declared to be of type REAL*2, REAL*4 and REAL*8, respectively; MPI_INTEGER1, MPI_INTEGER2 and MPI_INTEGER4 for Fortran integers, declared to be of type INTEGER*1, INTEGER*2 and INTEGER*4, respectively; etc.

'''but should read'''

MPI requires support of these datatypes, which match the basic datatypes of Fortran and ISO C. Additional MPI datatypes should be provided if the host language has additional data types: MPI_DOUBLE_COMPLEX for double precision complex in Fortran declared to be of type DOUBLE COMPLEX; MPI_REAL2, MPI_REAL4 and MPI_REAL8 for Fortran reals, declared to be of type REAL*2, REAL*4 and REAL*8, respectively; MPI_INTEGER1, MPI_INTEGER2 and MPI_INTEGER4 for Fortran integers, declared to be of type INTEGER*1, INTEGER*2 and INTEGER*4, respectively; etc. A complete list of such MPI datatypes corresponding to optional datatypes in the hosting languages is provided in Annex A.1.1 in the tables starting on page 517.

'''MPI-2.2 Section 5.9.2, page 165, lines 39-45 read'''

||Floating point: ||MPI_FLOAT, MPI_DOUBLE, MPI_REAL, ||
|| ||MPI_DOUBLE_PRECISION ||
|| ||MPI_LONG_DOUBLE ||
|| ||and handles returned from ||
|| ||MPI_TYPE_CREATE_F90_REAL, ||
|| ||and if available: MPI_REAL2, ||
|| ||MPI_REAL4, MPI_REAL8, MPI_REAL16 ||

'''but should read'''

||Floating point: ||MPI_FLOAT, MPI_DOUBLE, MPI_REAL, ||
|| ||MPI_DOUBLE_PRECISION ||
|| ||MPI_LONG_DOUBLE ||
|| ||and handles returned from ||
|| ||MPI_TYPE_CREATE_F90_REAL, ||
|| ||and if available: MPI_REAL2, ||
|| ||MPI_REAL4, MPI_REAL8, MPI_REAL16, ||
|| ||MPI_QUAD, MPI_FLOAT128 ||

'''MPI-2.2 Section 5.9.2, page 165, lines 47 - page 166, line 7 read'''

||Complex: ||MPI_COMPLEX, ||
|| ||MPI_C_FLOAT_COMPLEX, ||
|| ||MPI_C_DOUBLE_COMPLEX, ||
|| ||MPI_C_LONG_DOUBLE_COMPLEX, ||
|| ||and handles returned from ||
|| ||MPI_TYPE_CREATE_F90_COMPLEX, ||
|| ||and if available: MPI_DOUBLE_COMPLEX, ||
|| ||MPI_COMPLEX4, MPI_COMPLEX8, ||
|| ||MPI_COMPLEX16, MPI_COMPLEX32 ||

'''but should read'''

||Complex: ||MPI_COMPLEX, ||
|| ||MPI_C_FLOAT_COMPLEX, ||
|| ||MPI_C_DOUBLE_COMPLEX, ||
|| ||MPI_C_LONG_DOUBLE_COMPLEX, ||
|| ||and handles returned from ||
|| ||MPI_TYPE_CREATE_F90_COMPLEX, ||
|| ||and if available: MPI_DOUBLE_COMPLEX, ||
|| ||MPI_COMPLEX4, MPI_COMPLEX8, ||
|| ||MPI_COMPLEX16, MPI_COMPLEX32, ||
|| ||MPI_C_QUAD_COMPLEX, ||
|| ||MPI_C_FLOAT128_COMPLEX ||

'''MPI-2.2 Section 13.5.2, page 433, in the right column, the following optional types should be added:'''

||MPI_QUAD ||16 ||
||MPI_FLOAT128 ||16 ||
||MPI_C_QUAD_COMPLEX ||2*16 ||
||MPI_C_FLOAT128_COMPLEX ||2*16 ||

Proposed Solution, Part 2

'''MPI-2.2 Section 3.2.2, page 27, lines 33-35 read and are not changed (cited here because the highlighted wording is reused for the new sentence in Section 5.9.4):'''

MPI requires support of these datatypes, which match the basic datatypes of Fortran and ISO C. '''Additional MPI datatypes should be provided if the host language has additional data types''': MPI_DOUBLE_COMPLEX for double precision complex in Fortran declared to be of type DOUBLE COMPLEX;

'''MPI-2.2 Section 5.9.4, page 168, lines 29-32 read'''

In order to use MPI_MINLOC and MPI_MAXLOC in a reduce operation, one must provide a datatype argument that represents a pair (value and index). MPI provides nine such predefined datatypes. The operations MPI_MAXLOC and MPI_MINLOC can be used with each of the following datatypes.

'''but should read'''

In order to use MPI_MINLOC and MPI_MAXLOC in a reduce operation, one must provide a datatype argument that represents a pair (value and index). For this, MPI provides nine such predefined datatypes. The operations MPI_MAXLOC and MPI_MINLOC can be used with each of the following datatypes.

'''MPI-2.2 Section 5.9.4, page 168, line 48 reads'''

||MPI_LONG_DOUBLE_INT ||{{{long double}}} and {{{int}}} ||

'''but should read (based on sentence page 27)'''

||MPI_LONG_DOUBLE_INT ||{{{long double}}} and {{{int}}} ||
||MPI_QUAD_INT (optional) ||{{{_Quad}}} and {{{int}}} ||
||MPI_FLOAT128_INT (optional) ||{{{__float128}}} and {{{int}}} ||

The optional predefined datatypes should be provided if the host language supports the corresponding data type.

'''MPI-2.2 Annex A.1.1, page 518, lines 1-10 read'''

||||= Datatypes for reduction functions (C and C++) =||
||C type: MPI_Datatype ||C++ type: MPI::Datatype ||
||Fortran type: INTEGER || ||
||MPI_FLOAT_INT ||MPI::FLOAT_INT ||
||... ||... ||
||MPI_LONG_DOUBLE_INT ||MPI::LONG_DOUBLE_INT ||

'''but should read'''

||||= Datatypes for reduction functions (C and C++) =||
||C type: MPI_Datatype ||C++ type: MPI::Datatype ||
||Fortran type: INTEGER || ||
||MPI_FLOAT_INT ||MPI::FLOAT_INT ||
||... ||... ||
||MPI_LONG_DOUBLE_INT ||MPI::LONG_DOUBLE_INT ||
||MPI_QUAD_INT (optional) ||(use C datatype handle) ||
||MPI_FLOAT128_INT (optional) ||(use C datatype handle) ||

Alternative Solutions

None.

Impact on Implementations

Part 1: The new optional C datatypes MPI_QUAD, MPI_FLOAT128, MPI_C_QUAD_COMPLEX, and MPI_C_FLOAT128_COMPLEX must be implemented if the C compiler provides the corresponding types. Part 2: The new optional datatypes MPI_QUAD_INT and MPI_FLOAT128_INT for MPI_MAXLOC and MPI_MINLOC must be implemented under the same condition.

Impact on Applications / Users

None.

Entry for the Change Log

Sections 3.2.2, 5.9.2, 5.9.4, 13.5.2 Table 13.2, and Annex A.1.1 on pages 27, 164, 167, 433, and 513.
New named optional predefined datatypes MPI_QUAD, MPI_FLOAT128, MPI_C_QUAD_COMPLEX, and MPI_C_FLOAT128_COMPLEX for the C types {{{_Quad}}}, {{{__float128}}}, {{{_Quad _Complex}}}, and {{{__float128 _Complex}}}, and MPI_QUAD_INT and MPI_FLOAT128_INT for the reduction operations MPI_MAXLOC and MPI_MINLOC.

jeffhammond commented 8 years ago

@RolfRabenseifner said:

This ticket is a needed correction to the standard to be consistent with the evolution of C.

Questions:

Because it is probably a needed correction in the chapters appLang (Rolf) and coll (Adam), I would take this ticket as owner and have added Adam to the CC list.

jeffhammond commented 8 years ago

@jedbrown said:

jeffhammond commented 8 years ago

@RolfRabenseifner said:

Related chapters are

This correction is needed based on the type enhancements in the C language since MPI-2.2.

jeffhammond commented 8 years ago

@jedbrown said:

There are good scientific reasons to use quad precision. Due to lack of language standardization (even in Fortran, which MPI supports), users are willing to jump through some hoops to use it. Although some applications use quad precision in production, a common scenario is that quad precision is used to perform a test that tells the user whether implementing a much more complicated algorithm will pay off. The full implementation, if deemed worth the effort (which may be very large), would be formulated in such a way that double precision is sufficient. By offering no way to use quad precision with one-sided, you either

If we agree that these are not acceptable outcomes, we have to provide some means for using quad precision with one-sided. Note that we would not be having this discussion if MPI-2 had not crippled datatypes for one-sided (and if MPI-3 had not further reinforced that it shall remain crippled). If you are going around breaking kneecaps, but still want to claim to be a responsible state, the least you can do is offer crutches.

jeffhammond commented 8 years ago

@RolfRabenseifner said:

The discussion (partially off-line) showed that MPI_REAL16_INT does not exist because Fortran has a standardized method to define 16-byte REALs. MPI_REAL16 exists only for legacy code reasons.

In C, such a method does not exist. Therefore the optional datatypes MPI_QUAD and MPI_FLOAT128 are optional precisely because of this gap in the C standard.

As a consequence, MPI_QUAD_INT and MPI_FLOAT128_INT were added in Part 2.

jeffhammond commented 8 years ago

@wgropp said:

I disagree that MPI should provide support for datatypes that are not part of the hosting language. The danger here is that, since there are already several ad hoc names for the types, and, as has been noted, there are some reasons to support these, it is possible, even likely, that C will add datatypes for 16 byte floating point. At that point, it would be nice if the MPI Datatype name was roughly the same as the C name. We don't know yet what that name will be.

I realize that it is sometimes inconvenient that standards are slow to change, but trying to anticipate a change is dangerous. Note that this doesn't give true portability, just portability to some systems that use some similar non-standard compilers.

And there are workarounds for users. A contiguous type can be defined with MPI_BYTE to represent a 16-byte floating point. User-defined reduction operations provide support for collective computation. Similar workarounds were used for years for the plethora of new named types in C before MPI 2.2.

jeffhammond commented 8 years ago

@jedbrown said:

Bill, there would be no justification for these non-standard predefined types if we were only concerned with collectives, because as you say, predefined types are nothing more than convenience. The problem is entirely for one-sided communication which the Forum allowed in without support for user-defined types and operations. Without these predefined types and without de-crippling one-sided (not viable in the short term), no user of quad precision can use one-sided.

There are a few places in PETSc where one-sided would be attractive, but it cannot be used because we support quad precision. I'm sure there are other projects in a similar position.

jeffhammond commented 8 years ago

@wgropp said:

Jed, I understand where you are coming from, but I still do not believe this is the correct solution. Since this relies on nonportable features in the host language, it doesn't belong in the standard. MPI implementations are free to add extensions (but not using the MPI_ prefix) to support non-standard datatypes, and that would permit experimentation with the features. This would in fact be consistent with what you see in the C compilers - the C standard doesn't have these types, but two compilers have implemented them, carefully using names with a leading underscore to make it clear that these are not standard C names. The MPI version of this would be MPIX__QUAD or MPIX___FLOAT128.

One of the greatest threats to a standard is premature standardization, and this is to me a clear case of that.

jeffhammond commented 8 years ago

@RolfRabenseifner said:

We support a long list of non-standardized datatypes in Fortran and nobody complained. Bill, you are absolutely right that we should not block the name MPI_QUAD for a non-standardized feature.

On the other hand, users of _Quad do not want to see n different names in n different MPI libraries. This was the reason of standardizing MPI_REAL8.

A good compromise between not using MPI_QUAD, Jed's needs, and the way we did it for the Fortran community is:

We use {{{MPI__QUAD}}}, {{{MPI___FLOAT128}}}, {{{MPI_C__QUAD_COMPLEX}}} and {{{MPI_C___FLOAT128_COMPLEX}}}.

Or there may still be a better name or naming scheme for these nonstandard C types.

If the C standardization adopts _Quad or __float128, we can keep the names; if they adopt Quad, then we add a new MPI_QUAD. I would say this is a good compromise and solution that addresses both Bill's concern and Jed's needs.

jeffhammond commented 8 years ago

@wgropp said:

True, the "real*8" types were non-standard, but the Fortran community had agreed on their name - most compilers supported these length types, and they all used the exact same name. This is clearly not the case yet for the quad types.

Another issue that I have with this is that it would be the only C floating point type with a defined length. float, double, and long double are only defined in relation to each other. If _Quad were used, it should follow the same sort of definition and avoid any specific length. If __float128 were used, we should include the others (and using the number of bits is a bit weird - why not the number of bytes, i.e., 4, 8, 10, 12, 16?). I'm not really suggesting this - just that if we do go this route, we should be consistent and not include just one exceptional type. One of the reasons that MPI was not too hard for users to learn is that there were a small number of concepts. Adding individual features adds to the complexity.

jedbrown commented 8 years ago

Note that ICC 13 and later support __float128 as equivalent to _Quad and conforming to the IEEE 754-2008 128-bit spec.

jeffhammond commented 8 years ago

@jedbrown The more interesting development here is that Fortran 2008 has a proper 128-bit floating-point type that is not supported by MPI without assuming the equivalence of REAL*16 and real(kind=REAL128).

Below is a Fortran 2008 program that cannot be generalized to use all MPI built-in reductions, because MPI_Type_create_f90_real does not work with the pair types.

subroutine mydotf128(n,x,y,z)
    use ISO_FORTRAN_ENV
    implicit none
    integer, intent(in) :: n
    integer :: i
    real(kind=REAL128), intent(in) :: x(n),y(n)
    real(kind=REAL128), intent(out) :: z
    real(kind=REAL128) :: r
    r = 0.0_REAL128  ! kind suffix needed: plain 0.0 is default (single) precision
    do i=1,n
        r = r + x(i) * y(i)
    enddo
    z = r
    return
end subroutine mydotf128