
MpiInt is wrong and should be removed #83

Open jeffhammond opened 12 months ago

jeffhammond commented 12 months ago

From https://github.com/Reference-ScaLAPACK/scalapack/pull/19#issuecomment-1731388442:

> Furthermore, the MPI C interface is newly assumed to expect the type MpiInt, which is a macro again, by default evaluates to int, but can be (in hypothetical and very unlikely circumstances) overridden by the user as well. However, I know of no MPI C library providing other than LP64 interface, so this freedom will be most likely never used.
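
For context, the definition being debated is, per the description above, essentially a one-line macro. This is a sketch of that shape, not the exact header from the PR:

```c
/* Sketch of the kind of definition described in the quote above:
   MpiInt defaults to int but can be overridden at build time.
   The actual header in the PR may differ. */
#ifndef MpiInt
#define MpiInt int
#endif
```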

Sorry, but this is just wrong. The MPI standard uses int and no other type is conforming. There is no scenario involving a valid implementation of MPI where MpiInt should ever be used.

The MPI standard has MPI_Fint specifically to ensure compatibility with the MPI Fortran library, and only this type should be used for this purpose.
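
For illustration, a minimal sketch of the handle-conversion pattern that MPI_Fint exists for; MPI_Comm_f2c and MPI_Comm_c2f are standard MPI calls (they also appear in the BLACS call list later in this thread):

```c
#include <mpi.h>

/* MPI_Fint exists only to carry Fortran MPI handles through C code.
   A C routine receiving a communicator from Fortran converts it: */
void from_fortran(MPI_Fint f_comm)
{
    MPI_Comm comm = MPI_Comm_f2c(f_comm);  /* Fortran handle -> C handle */
    int rank;                              /* counts/ranks stay plain int */
    MPI_Comm_rank(comm, &rank);
    MPI_Fint back = MPI_Comm_c2f(comm);    /* and back, if needed */
    (void)back;
}
```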

I don't have strong feelings about Int but MpiInt should be removed. Its existence is dangerous and unnecessary.

haidarazzam commented 6 months ago

@jeffhammond @albandil @langou I think there are a few main issues here.

  1. First, let's ignore MPI. The BLACS BI_XXX functions are ScaLAPACK functions and are called from within the Fortran or C routines, so if Fortran and C compile their Int as a 64-bit integer, then the arguments to the BI_XXX functions are 64-bit and should be Int, not MpiInt, to avoid errors and confusion. Moreover, in dgamx2d.c line 20, for example, we have len[0] = len[1] = N; where len is of type MpiInt while N is of type Int. I believe we should fix this so both are of type Int, and if MPI provides any 64-bit API, it should be handled inside the BI_XXX function.

  2. Related to the above, I am not sure how compiling ScaLAPACK for 64-bit works today, because Int is currently set by the user to a 64-bit integer while MpiInt is hardcoded to int. I guess the compiler is doing the implicit conversion, but this can be a source of many bugs, in particular if an int64 passed to MPI overflows. I suggest an immediate fix: cast every int64 to int, with an overflow check (see the sketch after this list).

  3. Now let's talk about the MPI issue. To my understanding, all standard MPI installations expose a 32-bit API, so we need to cast, and check for overflow, in every call to MPI. Casting in the C API is easy, but I am not sure about the Fortran calls to MPI; we may need to replace every Mpval by a new variable that gets cast. The other solution for 64 bits is for users to build their own MPI: Open MPI allows compiling it for 64-bit, but I don't know about MPICH. There is also https://github.com/jeffhammond/BigMPI. Anyway, I don't think users want to compile their own MPI, which can result in a bad configuration and is usually not preferred on supercomputers.

  4. @jeffhammond I am not sure how MPI_Fint could help here?
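
A minimal sketch of the immediate fix proposed in point 2, assuming an ILP64 build where Int is 64-bit; int_for_mpi is a hypothetical helper name, not an existing BLACS symbol:

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef int64_t Int;  /* stand-in for ScaLAPACK's Int in an ILP64 build */

/* Hypothetical helper: narrow an Int to the int the MPI C API expects,
   failing loudly on overflow instead of silently truncating. */
static int int_for_mpi(Int n)
{
    if (n < 0 || n > INT_MAX) {
        fprintf(stderr, "BLACS: count %lld does not fit in MPI's int\n",
                (long long)n);
        abort();  /* a real fix would go through the BLACS error path */
    }
    return (int)n;
}
```

At a call site, an expression like MPI_Bcast(buf, int_for_mpi(N), ...) would then replace the implicit truncating conversion.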

jeffhammond commented 6 months ago

  1. Yes, BLACS can do whatever it wants as long as the C code safely casts to int before calling the MPI C API.
  2. It is a good idea to have a macro that casts safely from 64 to 32 bits. NWChem was burned by this at one point when we tried to broadcast more than INT_MAX elements.
  3. BigMPI is just a prototype of the MPI-4 large-count support, which isn't implemented in Open MPI yet. For this, one could define MpiInt to MPI_Count, but then all the C API symbols would need the _c suffix (see the sketch after this list).
  4. MPI_Fint doesn't matter in this specific situation, but I wanted to mention it anyway, because it is used in the C-Fortran interoperability APIs.
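
Regarding point 3: a minimal sketch of what the MPI-4 large-count path looks like, assuming an implementation that provides the _c variants; bcast_any_size is an illustrative name:

```c
#include <mpi.h>

/* With MPI-4 large-count support, the count parameter is MPI_Count,
   but only the _c-suffixed C symbols accept it. */
void bcast_any_size(void *buf, MPI_Count n, MPI_Datatype dt,
                    int root, MPI_Comm comm)
{
#if MPI_VERSION >= 4
    MPI_Bcast_c(buf, n, dt, root, comm);      /* MPI_Count count */
#else
    MPI_Bcast(buf, (int)n, dt, root, comm);   /* pre-MPI-4: must fit in int */
#endif
}
```
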
haidarazzam commented 6 months ago

Suggestion: I would suggest replacing every MpiInt that was originally int with Int (similar to when int was replaced by Int), so that all integer computation and indices work with the same typedef (32- or 64-bit), and then replacing every call to MPI_XXX with a wrapper WRAPPER_MPI_XXX that takes Int input arguments and, internally, casts them, verifies they are below INT_MAX, and then makes the corresponding MPI call (a sketch follows the list below).

This way, we gain two things.

  1. First, we will be able to check the return status of MPI calls (which was never done in BLACS) and fail with BI_BlacsErr in case of error.
  2. Second, if any integer is above INT_MAX, we can fail as unsupported for now; later, if we find that ScaLAPACK needs communication above INT_MAX, we can figure out a solution, learning from @jeffhammond's work on BigMPI. @jeffhammond, I am happy to discuss the solution provided in BigMPI to learn how you worked around this, so maybe we can add similar functionality in the wrappers. A naive solution that might work would be to create our own datatype (which would work for point-to-point communication), but for collective communication we might have to split the operation (Reduce, Bcast) into chunks that fit in INT_MAX, though that could be complex for asynchronous communication. Anyway, I don't think ScaLAPACK needs communication above INT_MAX, since most of the communication works on NB-sized blocks.
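
A minimal sketch of what one such wrapper could look like, assuming an ILP64 build; WRAPPER_MPI_Bcast is an illustrative name, and the BI_BlacsErr prototype shown is approximate rather than the exact one in the sources:

```c
#include <limits.h>
#include <stdint.h>
#include <mpi.h>

typedef int64_t Int;  /* stand-in for ScaLAPACK's Int in an ILP64 build */

/* Approximate prototype for the existing BLACS error reporter;
   the real signature in the sources may differ. */
void BI_BlacsErr(Int ConTxt, Int line, char *file, char *form, ...);

/* Hypothetical wrapper: Int arguments in, overflow check, MPI call,
   return-status check -- the two benefits listed above. */
static void WRAPPER_MPI_Bcast(void *buf, Int count, MPI_Datatype dt,
                              Int root, MPI_Comm comm)
{
    if (count < 0 || count > INT_MAX)                 /* benefit 2 */
        BI_BlacsErr(-1, __LINE__, __FILE__,
                    "MPI count %lld overflows int", (long long)count);
    int ierr = MPI_Bcast(buf, (int)count, dt, (int)root, comm);
    if (ierr != MPI_SUCCESS)                          /* benefit 1 */
        BI_BlacsErr(-1, __LINE__, __FILE__,
                    "MPI_Bcast returned %d", ierr);
}
```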

I compiled a list of the MPI functions that are called from BLACS, and it seems about 41 functions need wrappers. Any suggestions are welcome.

MPI_Barrier MPI_Error_class MPI_Finalize MPI_Abort MPI_Init MPI_Initialized

MPI_Allreduce MPI_Bcast MPI_Irecv MPI_Isend MPI_Recv MPI_Reduce MPI_Rsend MPI_Send MPI_Sendrecv MPI_Testall MPI_Waitall

MPI_Comm_f2c MPI_Comm_c2f MPI_Comm_create MPI_Comm_dup MPI_Comm_free MPI_Comm_get_attr MPI_Comm_group MPI_Comm_rank MPI_Comm_size MPI_Comm_split MPI_Group_free MPI_Group_incl

MPI_Unpack MPI_Pack MPI_Pack_size MPI_Get_count MPI_Op_create MPI_Op_free MPI_Type_commit MPI_Type_create_struct MPI_Type_free MPI_Type_indexed MPI_Type_vector MPI_Type_match_size