cea-hpc / wi4mpi

Wrapper interface for MPI
BSD 3-Clause "New" or "Revised" License
80 stars 15 forks source link

Support `MPI_Aint_add` and `MPI_Aint_diff` #44

Open spoutn1k opened 1 year ago

spoutn1k commented 1 year ago

Hey again,

Sorry for the multiple PRs, but I felt this one is a feature integration rather than the bug fix of my other one, thus deserving a separate PR.

This PR is meant to fix #18 by introducing support of the MPI 3.1 functions MPI_Aint_add and MPI_Aint_diff.

The encountered issue was that OpenMPI's implementation of those functions is a macro, making the symbols absent from the library. This means that when running with preload using --from MPICH/Intel --to OpenMPI, the application crashes with an undefined symbol.

This is resolved by overriding the symbols and preventing the ASM router from doing its work, and rather implementing the OpenMPI macro directly.

Addition to the header

I added the definition found in the MPI3.1 standard of the two functions to interface/interface_utils/include/mpi.h. I have no experience with fortran so I decided not to touch things on the fortran side.

Symbol contents

I implemented the overrides by making sure they matched the output of OpenMPI's mpicc. For the following code:

#include <mpi.h>

MPI_Aint get_diff(MPI_Aint lhs, MPI_Aint rhs) {
  return MPI_Aint_diff(lhs, rhs);
}

MPI_Aint get_sum(MPI_Aint lhs, MPI_Aint rhs) { return MPI_Aint_add(lhs, rhs); }

We get the following assembly, compiled using mpicc -c -g -O0:

$ objdump --disassemble=get_diff aint.o 

aint.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <get_diff>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   8:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
   c:   48 8b 55 f8             mov    -0x8(%rbp),%rdx
  10:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  14:   48 29 c2                sub    %rax,%rdx
  17:   48 89 d0                mov    %rdx,%rax
  1a:   5d                      pop    %rbp
  1b:   c3                      ret

$ objdump --disassemble=get_sum aint.o 

aint.o:     file format elf64-x86-64

Disassembly of section .text:

000000000000001c <get_sum>:
  1c:   55                      push   %rbp
  1d:   48 89 e5                mov    %rsp,%rbp
  20:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
  24:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
  28:   48 8b 55 f0             mov    -0x10(%rbp),%rdx
  2c:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  30:   48 01 d0                add    %rdx,%rax
  33:   5d                      pop    %rbp
  34:   c3                      ret

So simple additions and subtractions. This is reflected in the overrides: https://github.com/spoutn1k/wi4mpi/blob/96e458a0733c14ca80156db7ed0e39aa37fe2da1/src/common/override.c#L340-L357

When this applies

The override is enabled only when OpenMPI is used as target, not when any other combination of library is used. The override directives #define MPI_AINT_XXX_OVERRIDE are defined in the following files:

What's missing

This PR is my first real code contribution to the project, and I am unsure about certain things:

spoutn1k commented 1 year ago

Hmm for some reason github added my merge of master as individual commits ... Does anyone know how to fix that ?

adrien-cotte commented 1 year ago

Awesome work, as always. But, why not putting #define MPI_AINT_XXX_OVERRIDE into override.h ?

spoutn1k commented 1 year ago

Thank you :)

The idea is to only override the symbols when necessary. Putting the define in override.h would override the symbols project-wide, while this solution only overrides symbols when the target library is missing them (OpenMPI here).

That way, if another library decides to change the implementation of the symbol, it will be respected.

ducrotv commented 1 year ago

This PR is a good fix for this issue. After review we decided to fix this another way: we will add these functions to the generated ones. Thanks again for this work, it will serve as a base to our implementation.