mpi-forum / mpi-issues

Tickets for the MPI Forum
http://www.mpi-forum.org/

should the ABI support mpif.h? #834

Open jeffhammond opened 6 months ago

jeffhammond commented 6 months ago

Problem

mpif.h was deprecated in 4.1. If I understand correctly, we do not need to worry about deprecated features anymore. This means that the ABI standardization effort can ignore mpif.h.

The reason this matters is that buffer sentinels must be implemented using Fortran COMMON, which itself has been obsolescent since Fortran 90 (34 years ago).

If we permit ourselves to ignore mpif.h from the ABI effort, we can avoid using COMMON, because the sentinels will be in a module (used by both the MPI and MPI_F08 modules).

A side effect of this is that it motivates Fortran users to stop using mpif.h in order to get the ABI. The downside is that way too many people still use mpif.h and if they stubbornly refuse to change, then it hurts ABI adoption.

Proposal

I want the MPI Forum to express its opinions about this decision.

Changes to the Text

TBD

Impact on Implementations

Not having to worry about mpif.h for the ABI effort might save some effort, but it's not a big deal.

Impact on Users

Users of mpif.h will not be able to adopt the standard ABI.

References and Pull Requests

958

devreal commented 6 months ago

There is precedent in excluding mpif.h from new features in big count. Personally, I'm OK with not providing an official ABI for mpif.h with the hint that it is deprecated and will be removed in a future release.

jprotze commented 6 months ago

I'd welcome an explicit statement that the feature is applied or not applied to deprecated features.

Not specifying whether const-ness is or is not applied to deprecated functions was annoying for tools.

bkmgit commented 6 months ago

Many Fortran users would likely benefit from ABI support, as it will ease code deployment. Compiling for older Fortran versions has typically been more reliable, so people with programs they wish to run without modification may choose to stay with older implementations of the MPI standard and recompile as needed. Fortran developers are not heavily engaged with the MPI Forum, and there are options such as CAF which some developers may choose for new programs. It may be helpful to directly seek input from parallel Fortran code developers, for example from Quantum Espresso. Creating a list of example Fortran MPI codes might also help, as might an MPI Fortran users group list in addition to the MPI Fortran WG.

eschnett commented 6 months ago

Three large projects that currently use mpif.h (and do not support mpi.f90) are HDF5, SCALAPACK, and WRF.

PETSc and Quantum Espresso both support mpi.f90.

jeffhammond commented 6 months ago

> It may be helpful to directly seek input from parallel Fortran code developers

https://fortran-lang.discourse.group/t/should-the-mpi-abi-support-mpif-h/

jeffhammond commented 6 months ago

> Three large projects that currently use mpif.h (and do not support mpi.f90) are HDF5, SCALAPACK, and WRF.

HDF5 uses mpif.h only in a configure test and an example: https://github.com/search?q=repo%3AHDFGroup%2Fhdf5%20mpif.h&type=code.

ScaLAPACK only uses MPI Fortran support in test programs, not the library itself: https://github.com/search?q=repo%3AReference-ScaLAPACK%2Fscalapack%20mpif.h&type=code. Given that the test programs are not normally part of an installation, this is a minor problem.

WRF is so good at MPI Fortran, it deletes mpif.h as part of make clean 🤣 : https://github.com/wrf-model/WRF/blob/a8eb846859cb39d0acfd1d3297ea9992ce66424a/var/Makefile#L3.

In any case, I can fix ScaLAPACK and HDF5 easily enough. I don't know whether or not the WRF people can be helped.

> PETSc and Quantum Espresso both support mpi.f90.

raffenet commented 6 months ago

https://github.com/Nek5000/Nek5000 uses mpif.h.

eschnett commented 6 months ago

HDF5 is known for breaking their API in minor (and sometimes even patch) releases. Thus updating to a newer HDF5 version isn't always straightforward.

cniethammer commented 6 months ago

I am fine with not providing an official ABI for mpif.h with the hint that it is deprecated. I'd love to finally see the removal of mpif.h and happily support codes in transitioning to use mpi(_f08).

That said I just opened a PR for HDF5 ...

eschnett commented 6 months ago

The question isn't really whether there is an "ABI for mpif.h", it's more "will it be possible for implementors to provide a file mpif.h that works with the ABI". That is, the question isn't "should we define it in the standard", but rather "are we knowingly defining an ABI that is (almost) impossible to implement in mpif.h".

jeffhammond commented 6 months ago

The practical question boils down to using COMMON or not. If we put sentinels in COMMON, mpif.h is easy but we use obsolescent Fortran. If we put sentinels in a module, we can't do mpif.h at all with that. However, implementations could still support it by adding duplicate sentinels in COMMON and checking for both of them in the layer that calls C.

If we are going to delete mpif.h in MPI 5, it sucks to have to design the ABI around it.
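The duplicate-sentinel workaround described above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the names `abi_module_in_place`, `legacy_common_in_place`, `C_IN_PLACE_SENTINEL`, and `translate_fortran_buffer` are hypothetical, and the two `int` objects stand in for Fortran storage (a module variable and a COMMON block member) that a real implementation would reach through its own linkage conventions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two Fortran-side sentinel locations.
 * In a real implementation these would be Fortran storage:
 * one variable in a module, one member of a COMMON block for mpif.h. */
int abi_module_in_place;     /* sentinel that the module-based ABI would export */
int legacy_common_in_place;  /* duplicate sentinel kept in COMMON for mpif.h */

/* Hypothetical C-side sentinel: a distinguished address that no user
 * buffer can have. */
static char c_in_place_object;
#define C_IN_PLACE_SENTINEL ((void *)&c_in_place_object)

/* Map a buffer address received from Fortran to what the C layer expects.
 * Either Fortran sentinel address is translated to the single C sentinel;
 * every other address is passed through unchanged. */
void *translate_fortran_buffer(void *fbuf)
{
    if (fbuf == (void *)&abi_module_in_place ||
        fbuf == (void *)&legacy_common_in_place) {
        return C_IN_PLACE_SENTINEL;
    }
    return fbuf;
}
```

The cost in this sketch is one extra pointer comparison per sentinel in the Fortran-to-C layer. The second sentinel is private to the implementation, so it would not be portable in the way the ABI itself is.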

eschnett commented 6 months ago

All current MPI implementations use common blocks in their current implementation. That makes it easy for them to support an ABI that also uses common blocks; implementations "only" have to redefine the constants. Switching away from this adds a burden.

devreal commented 6 months ago

mpif.h will hopefully be removed eventually, ideally after all main users have moved on. Not providing an ABI for it does not break existing codes and is (another) motivation for people to transition away from it. It's easy to do (see @cniethammer's PR against HDF5) and well documented in the standard. Trying to provide an ABI for mpif.h is wasted time and potentially counter-productive.

jeffhammond commented 6 months ago

> All current MPI implementations use common blocks in their current implementation. That makes it easy for them to support an ABI that also uses common blocks; implementations "only" have to redefine the constants. Switching away from this adds a burden.

It takes 2 minutes to implement in a module.

COMMON has been obsolescent for 34 years.

hppritcha commented 6 months ago

I would also prefer not to pursue ABI standardization for mpif.h for similar reasons to those noted above.

eschnett commented 6 months ago

If an MPI ABI cannot support mpif.h, then the MPI standard should be updated in one of these ways:

(1) Change section 19.1.1 "Support for Fortran – Overview" to loosen the restriction that both of mpif.h and mpi.f90 need to be supported if either of them is (that seems a simple choice).

(2) Make an explicit statement (maybe a note to implementors) in the description of the MPI ABI that this ABI can only be implemented in a standard-conforming way in an mpi_f08.f90 (that would be a far-reaching choice).

wrwilliams commented 6 months ago

> (1) Change section 19.1.1 "Support for Fortran – Overview" to loosen the restriction that both of mpif.h and mpi.f90 need to be supported if either of them is (that seems a simple choice).

This to me seems like a correct next step for deprecation of mpif.h regardless of any ABI concerns, no?

jeffhammond commented 6 months ago

MPI.mod will support the ABI. I will not bend on this. Over half of our users are using that version of the API.

cblaas commented 6 months ago

I'm for ignoring mpif.h in the ABI standardization effort with the hint that it is deprecated. It does not break anything, old stuff will work as it has worked before. And it might give a mild incentive to upgrade at least to the mpi module.

devreal commented 6 months ago

Section 19.2 (Support for Large Count and Large Byte Displacement in MPI Language Bindings) has this text:

> In older Fortran bindings (mpif.h (deprecated) and use mpi), no new interfaces and no new specific procedures for larger types are provided beyond what existed in MPI-3.1; all MPI procedures have the same types as in the versions prior to MPI-4.0.

Something similar could be added to whatever section we'll have about the Fortran ABI:

> The old Fortran bindings mpif.h (deprecated) do not support any standard ABI. Users are encouraged to transition to USE mpi, as described in Section 19.1.4.

eschnett commented 6 months ago

I disagree with the statement "It does not break anything". Technically this is true. However, the main point of a standard ABI is to simplify interoperability between different MPI implementations without requiring people to change their codes. The switch could be as simple as switching from module load mpich to module load mpich-mpiabi before compiling. This would be a near-transparent change for all MPI users.

Yet, if people now find that they'll have to modify their codes to be compatible with the newest (yet-to-be-released) version of HDF5, or have to patch HDF5, or have to make (straightforward, but still) changes to their codes, then this will hinder acceptance of the ABI. Building and installing codes is a complex task, and upgrading dependency versions is non-trivial.

If you want to push people away from using mpif.h then put this into the standard – say clearly that no MPI 5.0 implementation may provide an mpif.h any more. Tying this change to using a common ABI is not necessary. All current MPI implementations currently use common blocks, and this is working fine. Both MPICH and OpenMPI have made public guarantees that their respective ABIs are stable. There is no reason that the MPI standard couldn't base a common ABI on these two proven implementations. The only necessary choice is the name of the common block.

What seems to happen here is that a standardization process (a common ABI) that has the chance to greatly benefit the community is co-opted for another goal, namely asking people to modernize their codes. The MPI standard chose a time scale for phasing out mpif.h, and introducing a common ABI isn't a good reason to accelerate this time frame.

devreal commented 6 months ago

> Technically this is true.

That is all I care about. Codes will run the same way they run today. Want an ABI? Change to use mpi.

> The switch could be as simple as switching from module load mpich to module load mpich-mpiabi before compiling. This would be a near-transparent change for all MPI users.

You will still get mpif.h in mpich-mpiabi but you have no guarantee that it's ABI compatible with ompi-mpiabi. That is the same behavior you have today with mpich and ompi. It doesn't get worse, you just don't get the improvements that the cool kids using use mpi get.

> If you want to push people away from using mpif.h then put this into the standard – say clearly that no MPI 5.0 implementation may provide an mpif.h any more.

Absolutely not. The day we stop shipping mpif.h we break codes. That is the reason it has been deprecated, not removed yet. Maybe people will find an hour of time in the next decade to change three lines of code. I'd hope so.

> The MPI standard chose a time scale for phasing out mpif.h, and introducing a common ABI isn't a good reason to accelerate this time frame.

There is no official timeline, and that is part of the problem here. HPC moves at glacial speeds so anything that helps us get rid of a feature whose use has been discouraged for a decade within the next two decades is welcome.

By not providing an ABI for mpif.h instead of removing mpif.h altogether we're playing nice with the lazy kids.

jeffhammond commented 6 months ago

The ABI can be implemented in mpif.h without using COMMON if we ignore sentinels. MPI_IN_PLACE is the only one I expect is used in Fortran codes.

As I've noted before, any implementation can support the ABI including sentinels in mpif.h if they create a second set of sentinels and check for them, which won't be portable in the way that the ABI is but allows vendors to support the truly obnoxious users if they want.

eschnett commented 6 months ago

The MPI standard doesn't allow multiple values for sentinels in Fortran. See e.g. section 19.3.6 "MPI Opaque Objects"; this describes how buffer addresses can be converted between C and Fortran. This section clearly assumes that the value of MPI_BOTTOM is the same in all Fortran APIs (mpif.h, mpi.f90, mpi_f08.f90). This assumption is also implicit across the standard.

jeffhammond commented 6 months ago

I'm aware of that, but don't really care, since I don't see a reasonable scenario where it matters.

eschnett commented 6 months ago

An MPI implementation needs to be able to provide the ABI without breaking the MPI standard. The suggestions regarding mpif.h made above are:

jeffhammond commented 6 months ago

the easiest change is to omit mpif.h from the ABI. i'm not aware of any serious objections to deleting it in MPI 5.0. we could always pull that in and do it in 4.2.

jprotze commented 6 months ago

@hzhou posted another option on the mpich mailing list: make mpif.h a wrapper for use mpi.

Would this be a feasible way to provide backwards compatibility for legacy codes, assuming compilers would accept use mpi wherever the code includes mpif.h?

eschnett commented 6 months ago

Unfortunately it is not possible to make mpif.h a wrapper for use mpi. Fortran syntax requires that use statements come before implicit none, and variable declarations (and thus include "mpif.h") need to come after implicit none. There is thus no way to have a use statement in mpif.h, which makes implementing mpif.h so much more difficult.

hzhou commented 6 months ago

@eschnett Thanks for pointing that out.