MPI+ with HIPFort - GitHub issues

fluidnumerics-joe commented 3 years ago

Hey all, I wanted to reach out to discuss ideas on how to handle MPI+HIPFort. Ideally, I'd like to be able to pass GPU data (c_ptr) over MPI so that GPU-direct communications can be used.

However, it's looking like I'd have to duplicate the MPI code I have written in Fortran in C++ and provide another ISO_C_BINDING wrapper to call it from Fortran. While I've gotten to be ok with mixed language programming for writing GPU kernels, I'd prefer not to have to duplicate code to be able to use GPU direct communications.

Alternatively, I can copy the device data back to the host, handle the MPI exchanges on the host, and copy the results back to the GPU; for performance and scalability, this is less than ideal.

I'm interested in discussing with you all any alternative options, or if this kind of wrapper for MPI would be a good candidate for HIPFort.

domcharrier commented 3 years ago

Hi @fluidnumerics-joe,

Did you try c_f_pointer(mycptr, myfptr) yet, and then passing myfptr to the MPI routine? (For an array pointer, c_f_pointer also needs the shape as a third argument.) That worked in one project where we were using ROCm-aware OpenMPI (+UCX), if I remember correctly. So it would be great if you could try it out and add a test/example to the existing collection ;) I would assume myfptr to be a rank-1 array, but scalars might work as well (never tried).

fluidnumerics-joe commented 3 years ago

@domcharrier, I haven't tried that yet; I'll give it a shot. I recall some discussion a while back about HIPFort allowing Fortran pointers to be used to handle device variables, rather than c pointers. Am I misremembering, or is this something we're looking into?

domcharrier commented 3 years ago

@fluidnumerics-joe

I recall some discussion a while back about HIPFort allowing Fortran pointers to be used to handle device variables, rather than c pointers.

That has been implemented for a while now.

see e.g. the README.md section about it,
or the overloaded hipMalloc interfaces here: https://rocmsoftwareplatform.github.io/hipfort/interfacehipfort__hipmalloc_1_1hipmalloc.html
or the overloaded hipblasSaxpy interfaces here (all math libs have these fptr interfaces): https://rocmsoftwareplatform.github.io/hipfort/interfacehipfort__hipblas_1_1hipblassaxpy.html

You can also directly look at the examples in here: https://github.com/ROCmSoftwarePlatform/hipfort/blob/master/test/f2008/

The cool thing is that you can also slice these array pointers, so you don't need to compute any byte offsets when you want to address a subarray.

fluidnumerics-joe commented 3 years ago

@domcharrier - Got it. This definitely cleans up allocation and memcpy calls (no more explicitly stated sizes!)

I take it that if I go this route, with Fortran pointers, when I call HIP kernels from the Fortran side, the Fortran pointers that are passed to the kernels will need to be wrapped with C_LOC(). The vecadd example still uses c_ptr, but I see you're able to overload the accelerated libraries to accept Fortran pointers.

domcharrier commented 3 years ago

I take it that if I go this route, with Fortran pointers, when I call HIP kernels from the Fortran side, the Fortran pointers that are passed to the kernels will need to be wrapped with C_LOC().

I don't think that is actually necessary. You could also not specify the interface on the Fortran side at all. Instead, you could specify a C function with _ suffix, e.g.: mycfunction_ instead of mycfunction to mimic the Fortran function symbol naming scheme.

With HIP and the accelerated math libraries this does not work, as we already have established C function names without the _ suffix.

fluidnumerics-joe commented 3 years ago

That's pretty clever. I'll give that a shot. If it works out, I can update the vecadd example, if this is the team's recommended strategy for working with HIP kernels in fortran

fluidnumerics-joe commented 3 years ago

@domcharrier Without defining the interface block, how is the Fortran side made aware of the routines defined in .cpp files?

domcharrier commented 3 years ago

@fluidnumerics-joe

Without defining the interface block, how is the Fortran side made aware of the routines defined in .cpp files?

By default, Fortran compilers like gfortran simply assume during compilation that called subroutines and functions exist, and derive the signature (including the return value) from the call site. Only at the link stage does it become apparent whether a symbol is missing or present.

There are some switches to change this default behavior of gfortran:

-Wimplicit-interface        Warn about calls with implicit interface.
-Wimplicit-procedure        Warn about called procedures not explicitly declared.

That's pretty clever. I'll give that a shot. If it works out, I can update the vecadd example, if this is the team's recommended strategy for working with HIP kernels in fortran

It is shorter, but I wouldn't call it more clever: the Fortran compiler makes assumptions about what the interface of the called function looks like, and you, the C programmer, must meet these assumptions to prevent compilation and runtime issues. It is not just the <name>_ that you need to tailor to the expectations of the Fortran compiler. You further need to make sure that all arguments of your C function are pointer types, even if you pass, for example, a scalar on the Fortran side. With the explicit interface ... end interface declaration in Fortran, you override the Fortran compiler's assumptions about what the interface of a (C) procedure looks like.
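To make the two requirements concrete, here is a minimal sketch of what the C++ side could look like. The routine name vecadd_ and its argument list are hypothetical, and a plain host loop stands in for the actual HIP kernel launch. It assumes gfortran's default naming scheme for external procedures (lowercase name plus a trailing underscore) and that the Fortran default integer matches C int.

```cpp
#include <cassert>

// Hypothetical wrapper meant to be called from Fortran without an
// interface block. extern "C" disables C++ name mangling, so the symbol
// is exactly `vecadd_`, which is what gfortran emits by default for
// `call vecadd(...)`.
// Fortran passes every argument by reference, so every parameter,
// including the scalar length n, is a pointer.
extern "C" void vecadd_(const double* a, const double* b, double* c,
                        const int* n) {
    assert(a && b && c && n);
    // Stand-in for the kernel launch: element-wise addition on the host.
    for (int i = 0; i < *n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

On the Fortran side this would be invoked as call vecadd(a, b, c, n). Nothing checks that the types at the call site match the C++ signature, which is exactly the risk described above.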

GCC explains all this here: https://gcc.gnu.org/onlinedocs/gfortran/Interoperability-with-C.html

Was quite an interesting resource for our latest Fortran project, which tries to generate HIP C++ kernels out of CUDA Fortran and OpenACC Fortran: https://github.com/ROCmSoftwarePlatform/gpufort

domcharrier commented 2 years ago

@fluidnumerics-joe

Regarding the interfaces on the Fortran side, check this out:

https://github.com/ROCmSoftwarePlatform/hipfort/commit/24fbd74e0f40c35519de9130163c7ecdd4e8aefa

Seems that C++ "&" can help to get rid of a lot of awkward pointer syntax on the C++ side.
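As an illustration of the idea (not taken from the linked commit), a by-reference argument coming from Fortran can be declared as a C++ reference instead of a pointer. The routine name scale_ and its arguments are hypothetical. The sketch relies on references being passed as plain addresses, which mainstream ABIs do but the C++ standard does not guarantee.

```cpp
#include <cassert>

// Hypothetical routine callable from Fortran as `call scale(x, alpha, n)`.
// The scalars alpha and n are declared as references, so the body needs
// no explicit dereferencing; the array x stays a pointer.
extern "C" void scale_(double* x, const double& alpha, const int& n) {
    assert(x);
    // Stand-in for a kernel launch: scale each element on the host.
    for (int i = 0; i < n; ++i) {
        x[i] *= alpha;
    }
}
```

Compared with a pointer-only signature, the body reads n and alpha directly rather than *n and *alpha.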

ROCm / hipfort

MPI+ with HIPFort #71