Oddity: configure and dlopen checks for CUDA support

@Akshay-Venkatesh Question for you about the CUDA support in Open MPI.

Per https://github.com/conda-forge/openmpi-feedstock/issues/42, we've been asked why there are configure-time checks for CUDA when we also dlopen("libcuda.so.1", ...) at runtime.

We do the dlopen() at runtime because we didn't want to create a link-time dependency on libcuda, e.g., if only some nodes in a cluster have GPUs (and the CUDA libraries are installed only on those nodes).

But since we've taken that philosophy, why do we have configure-time tests for things like GDR? For example:

https://github.com/open-mpi/ompi/blob/548ed56befd5ecc843d8b3938bf272360003efee/opal/mca/common/cuda/common_cuda.c#L102-L111

Since we don't know / can't guarantee that the libcuda that configure checked is the same one that was dlopen()'ed, shouldn't we check for the things that configure is checking at run time, after we successfully dlopen("libcuda.so.1", ...)? I.e., shouldn't we dlsym() to see if the successfully-opened libcuda.so.1 has the functionality that Open MPI is looking for?
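
To make the suggestion concrete, here is a minimal sketch of what such a run-time probe could look like. It is not the existing Open MPI code: `lib_has_symbols` is a hypothetical helper name, and the symbols a real check would need depend on which features configure currently tests for.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Open MPI): dlopen() the named library
 * and report whether every requested symbol can be resolved with dlsym().
 * Returns false if the library cannot be opened or any symbol is missing.
 */
static bool lib_has_symbols(const char *libname,
                            const char *const *symbols, size_t count)
{
    assert(NULL != libname);

    void *handle = dlopen(libname, RTLD_LAZY | RTLD_LOCAL);
    if (NULL == handle) {
        return false; /* library not installed on this node */
    }

    bool ok = true;
    for (size_t i = 0; i < count; ++i) {
        /* For function symbols, a NULL result means "not exported". */
        if (NULL == dlsym(handle, symbols[i])) {
            ok = false;
            break;
        }
    }

    dlclose(handle);
    return ok;
}
```

A caller could then decide at run time whether to enable a feature, for example with `const char *const syms[] = { "cuPointerSetAttribute", "cuPointerGetAttributes" };` followed by `lib_has_symbols("libcuda.so.1", syms, 2)`. Those two driver API names are only illustrative choices here. On older glibc versions this needs to be linked with `-ldl`.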

FYI @leofang @dalcinl @jakirkham
