Open jeffhammond opened 2 years ago
This part of the ABI goes beyond a pure ABI spec and adds new functions for converting handles.
@jeffhammond We had a question about those handle conversion procedures.
The C part of the ABI defines handles as incomplete struct pointers, for example (struct MPI_ABI_Comm*), which makes them pointer-sized.
The conversion procedures require serialisation into int, which is not usually pointer-sized.
Shouldn't these procedures use MPI_Aint instead?
This type is well-defined in Fortran and in C, so it should serve all the purposes of the conversion and it doesn't require 64bit->32bit conversion in either direction.
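To make the size argument concrete, here is a minimal C sketch, not taken from the ABI proposal itself. The alias abi_comm_t, the type abi_aint_t (a stand-in for MPI_Aint, assumed here to be an address-sized signed integer) and the helper names are all hypothetical. On common 64-bit platforms int is 32 bits while the handle is 64 bits, so only the address-sized integer can hold the handle without loss.

```c
#include <assert.h>
#include <stdint.h>

/* Incomplete struct: only pointers to it can be formed, as in the ABI text. */
struct MPI_ABI_Comm;
typedef struct MPI_ABI_Comm *abi_comm_t;   /* hypothetical alias for the handle type */

/* Stand-in for MPI_Aint (assumption: an address-sized signed integer). */
typedef intptr_t abi_aint_t;

/* The handle is exactly pointer-sized, and the address-sized integer can hold it. */
_Static_assert(sizeof(abi_comm_t) == sizeof(void *), "handle is pointer-sized");
_Static_assert(sizeof(abi_aint_t) >= sizeof(abi_comm_t), "integer holds a handle");

/* Hypothetical conversion helpers: plain casts, no narrowing involved. */
static abi_aint_t comm_to_aint(abi_comm_t c) { return (abi_aint_t)c; }
static abi_comm_t comm_from_aint(abi_aint_t a) { return (abi_comm_t)a; }

/* Returns 1 when a handle built from `address` survives the round trip. */
static int roundtrip_ok(void *address) {
    abi_comm_t h = (abi_comm_t)address;
    return comm_from_aint(comm_to_aint(h)) == h;
}
```

A conversion through int could not offer the same guarantee wherever sizeof(int) is smaller than sizeof(void *).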
That does not work because of how handles are defined in Fortran: they are INTEGER, either raw or wrapped inside a derived type.
So, at some point, either inside these new functions or somewhere else, MPI will need to assign a 32-bit integer (possibly by hashing the 64-bit pointer) and recover the 64-bit pointer (possibly by looking up the 32-bit integer in a hash table)?
This is already needed (for implementations that have 64-bit handle types in C), so it is not a new requirement. Is that the line of argument?
Yes. This is a trivial modification of the API that requires no change to implementations, with the upside that the C ABI won't depend on the Fortran compiler configuration.
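The integer/pointer translation discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not how any particular MPI implementation does it: the names c_to_fortran, fortran_to_c and TABLE_CAPACITY are hypothetical, and a fixed-size array with a linear scan is used instead of the hash table mentioned above, purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_CAPACITY 64   /* hypothetical fixed size, for brevity */

static void *handle_table[TABLE_CAPACITY];   /* index -> pointer-sized C handle */
static int32_t handle_count = 0;

/* Return the 32-bit integer for a C handle, registering it on first use.
   Returns -1 if the table is full. */
static int32_t c_to_fortran(void *c_handle) {
    for (int32_t i = 0; i < handle_count; i++) {
        if (handle_table[i] == c_handle) return i;   /* already registered */
    }
    if (handle_count == TABLE_CAPACITY) return -1;
    handle_table[handle_count] = c_handle;
    return handle_count++;
}

/* Recover the C handle from the 32-bit integer; NULL for an unknown value. */
static void *fortran_to_c(int32_t f_handle) {
    if (f_handle < 0 || f_handle >= handle_count) return NULL;
    return handle_table[f_handle];
}
```

The integer side is always 32 bits wide regardless of the pointer width, which is the property the Fortran INTEGER handle needs.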
Problem
I created a standalone implementation of the MPI F08 module to learn some things about writing language interfaces to MPI.
I worked around a lot of problems like hard-coding MPI_Status to the implementation ABI, which is easy enough to do, although it would not be necessary with an ABI. I can mostly deal with compile-time constants, using my own values and doing on-the-fly conversions back to the underlying implementation's C values.
The place where I cannot deal with this is in user-defined reductions. When I have my own reduction, it takes a datatype argument based on the Fortran type definition in my module. When I pass MPI_INTEGER in Fortran, it uses my compile-time constant (-1003), which is converted to the C value (MPICH's in this case is 1275069467 in decimal), because that is what I have to pass to the C call to MPI_Allreduce. The latter is what my user-defined reduction sees, which is not useful since my Fortran has no idea what the MPICH op % MPI_VAL means.
Because MPICH uses integers, I could hack this and do the back-conversion secretly in my F->C layer, only for user-defined reductions with built-in datatypes, although this won't work if MPICH checks the handle for validity. It also doesn't work with Open-MPI.
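The forward conversion and the back-conversion hack described above can be sketched as a small lookup table in C. Only the two values quoted in the text are used. The names type_map, module_to_impl and impl_to_module are hypothetical, and passing unknown values through unchanged is an assumption made for this sketch. It only applies where the implementation's datatype handles are plain integers, as with MPICH.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in the text: the module's compile-time constant for
   MPI_INTEGER and the corresponding MPICH C value. */
#define MODULE_MPI_INTEGER (-1003)
#define IMPL_MPI_INTEGER   1275069467

struct type_pair { int module_value; int impl_value; };

static const struct type_pair type_map[] = {
    { MODULE_MPI_INTEGER, IMPL_MPI_INTEGER },
    /* other built-in datatypes would follow; their values are not given in the text */
};
static const size_t type_map_len = sizeof type_map / sizeof type_map[0];

/* Forward conversion, applied before calling into the C library.
   Assumption: values not in the table pass through unchanged. */
static int module_to_impl(int module_value) {
    for (size_t i = 0; i < type_map_len; i++) {
        if (type_map[i].module_value == module_value) return type_map[i].impl_value;
    }
    return module_value;
}

/* Back-conversion, applied to the datatype a user-defined reduction receives,
   so the callback sees the module's constant rather than the C value. */
static int impl_to_module(int impl_value) {
    for (size_t i = 0; i < type_map_len; i++) {
        if (type_map[i].impl_value == impl_value) return type_map[i].module_value;
    }
    return impl_value;
}
```

For the back-conversion to help, the F->C layer would have to call impl_to_module on the datatype before invoking the user's Fortran reduction.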
I can avoid this problem by, for example, reimplementing all the collectives from scratch in Fortran when user-defined reductions are involved, but that's unreasonable. Another workaround would be to generate a user-defined datatype on-the-fly when the user passes a built-in type with a user-defined reduction, which is probably what I'm going to do, but this should not be necessary.
Update: I can't do the on-the-fly type swap trick because then user code won't get the type they passed at the command-line, so if they have logic to check that (which is common), it's going to break.
Proposal
This is one of many reasons we should have an ABI. The MPI F08 module was designed to be implementable as a standalone component, but we have failed to make that possible. As a result, our ecosystem requires one to compile an entire MPI implementation from scratch just to get a Fortran module. That is an awful user experience, because many platforms support multiple Fortran compilers and building any MPI implementation from source is not quick (https://github.com/open-mpi/ompi/issues/2056).
Changes to the Text
Impact on Implementations
Impact on Users
References and Pull Requests