eth-cscs / DLA-Future

DLA-Future
https://eth-cscs.github.io/DLA-Future/master/
BSD 3-Clause "New" or "Revised" License
64 stars 14 forks source link

Add spack variant for MPI GPU support #1102

Closed msimberg closed 7 months ago

msimberg commented 8 months ago

I've initially set it to force contiguous buffers always (it's a cmake_dependent_option so it's off if DLAF_WITH_MPI_GPU_SUPPORT is off).

Suggestions for naming etc. welcome.

msimberg commented 8 months ago

cscs-ci run

msimberg commented 8 months ago

cscs-ci run

msimberg commented 8 months ago

I can just think of GPU-aware MPI, so something like mpi-gpu-aware or gpu-aware-mpi. But not sure it is a good idea to "separate" the variant name from the actual config option in CMake.

Yeah, I mainly wanted to keep them consistent, but "GPU-aware MPI" is also a good option. Note that we haven't released a version with the fixed CMake variables nor pushed the option upstream to spack, so there's still room to rename it. "GPU-aware MPI" was not an option in https://github.com/eth-cscs/DLA-Future/pull/1088, but I think it probably should've been an option.

msimberg commented 8 months ago

I can just think of GPU-aware MPI, so something like mpi-gpu-aware or gpu-aware-mpi. But not sure it is a good idea to "separate" the variant name from the actual config option in CMake.

I like gpu-aware-mpi as suggested by @albestro (also in CMake), sounds a bit clearer. But I don't really mind.

In that case I suppose the alternative would be:

I like this as well and I'll happily change it. @rasolca, @aurianer, thoughts? I think in #1088 I was quite biased by the MPICH environment variable MPICH_GPU_SUPPORT_ENABLED.

It's not worth to expose DLAF_WITH_MPI_GPU_SUPPORT_FORCE_CONTIGUOUS to users?

I wouldn't at this point but happy to be overruled. My reasoning is that with the benchmarks so far forcing it to be contiguous is never slower and often a lot faster so I consider turning it off pretty much a debugging switch and nothing more. Note that the CMake variable also defaults it to on. If we have enough results to support that it never makes sense to disable it we could just remove the CMake option.

RMeli commented 8 months ago

My reasoning is that with the benchmarks so far forcing it to be contiguous is never slower and often a lot faster so I consider turning it off pretty much a debugging switch and nothing more.

Yes, I totally get this. My reasoning is that I often compile DLA-Future as a dependency in Spack, and having these options exposed is often useful. But I'm likely the only (?) person to benefit from this (i.e. an actual CP2K user should not care about a "debugging switch"), so I'm happy to leave it out if that's less confusing/error prone. (I can always add a custom Spack package on my end if needed... And make @albestro cringe ;) ).

If we have enough results to support that it never makes sense to disable it we could just remove the CMake option.

I would consider this carefully. Isn't it possible that in the future the situation will be reversed?

msimberg commented 8 months ago

My reasoning is that with the benchmarks so far forcing it to be contiguous is never slower and often a lot faster so I consider turning it off pretty much a debugging switch and nothing more.

Yes, I totally get this. My reasoning is that I often compile DLA-Future as a dependency in Spack, and having these options exposed is often useful. But I'm likely the only (?) person to benefit from this (i.e. an actual CP2K user should not care about a "debugging switch"), so I'm happy to leave it out if that's less confusing/error prone. (I can always add a custom Spack package on my end if needed... And make @albestro cringe ;) ).

Yeah, I get this too... I suppose this is a bit similar to the hdf5 variant. I suppose I'm mostly trying to avoid the situation where we add a variant that we then want to change/remove/rename in the upstream spack package. I was at some point considering if we could instead make this a runtime option instead and in that case we wouldn't need the variant. That said, this is not a hill I want to die on, so if it's useful it's useful and we add it.

msimberg commented 7 months ago

cscs-ci run

msimberg commented 7 months ago

cscs-ci run

msimberg commented 7 months ago

cscs-ci run