LLNL / Aluminum

High-performance, GPU-aware communication library
https://aluminum.readthedocs.io/en/latest/

No reference to 'hwloc_linux_parse_cpumap_file' #203

hanfluid closed 1 year ago

hanfluid commented 1 year ago

It looks like Aluminum is only compatible with hwloc < 2.x. With newer versions, this error occurs during the build.

ndryden commented 1 year ago

Can you provide more information about your configuration and build environment?

Aluminum builds without issue using hwloc >= 2.0.0 on our systems. Further, Aluminum does not use hwloc_linux_parse_cpumap_file anywhere in the code, so you may have an issue with how one of your dependencies (likely MPI) is configured.

hanfluid commented 1 year ago

module list

Currently Loaded Modulefiles:
1) gcc/11.2.0
2) cray-hdf5-parallel/1.12.2.1
3) craype/2.7.17
4) craype-x86-rome
5) libfabric/1.11.0.4.125
6) craype-network-ofi
7) cray-dsmml/0.2.2
8) perftools-base/22.09.0
9) cray-mpich/8.1.19
10) cray-libsci/22.08.1.1
11) cray-pals/1.2.4
12) bct-env/0.1
13) mpscp/1.3a
14) PrgEnv-gnu/8.3.3
15) nvidia/22.3

Here is the cmake command:

cmake -DHWLOC_DIR=/path/to/hwloc-2.9.1 -DCMAKE_INSTALL_PREFIX=/path/to/aluminum -DALUMINUM_ENABLE_CUDA=YES -DALUMINUM_ENABLE_MPI_CUDA=YES al_info ../

I then got the cpumap_file error while running make:

[ 40%] Building CUDA object src/CMakeFiles/Al.dir/cuda/helper_kernels.cu.o
[ 43%] Linking CXX shared library libAl.so
[ 43%] Built target Al
[ 46%] Building CXX object util/CMakeFiles/al_info.dir/al_info.cpp.o
[ 50%] Linking CXX executable al_info
/usr/bin/ld: ../src/libAl.so.1.3.1: undefined reference to `hwloc_linux_parse_cpumap_file'

ndryden commented 1 year ago

It looks like you're manually specifying the hwloc library — is there a reason to do this? CMake usually picks it up automatically, and mixing hwloc library versions can cause issues with other libraries that use it (e.g., MPI).

My suspicion is that this is a problem with a dependent library like MPI (or even hwloc) and multiple versions of hwloc are being linked in. (Unfortunately, MPI libraries in particular tend to be bad about this.)
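One way to test the "multiple hwloc versions" suspicion is to list which hwloc shared objects a built library resolves at load time. The following is a minimal sketch, not part of Aluminum: it assumes a Linux system with `ldd` available, and the helper names (`find_hwloc_libs`, `hwloc_libs_for`) as well as every path and soname in the sample text are hypothetical, chosen only for illustration.

```python
import re
import subprocess

# Matches hwloc entries in `ldd` output, e.g.
#   "libhwloc.so.15 => /some/path/libhwloc.so.15 (0x...)"
_HWLOC_LINE = re.compile(r"^\s*(libhwloc\.so[\w.]*)\s*=>\s*(\S+)")


def find_hwloc_libs(ldd_output):
    """Return sorted (soname, path) pairs for every hwloc entry in ldd output."""
    found = set()
    for line in ldd_output.splitlines():
        match = _HWLOC_LINE.match(line)
        if match:
            found.add((match.group(1), match.group(2)))
    return sorted(found)


def hwloc_libs_for(binary_path):
    """Run `ldd` on a binary or shared library and report its hwloc entries."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return find_hwloc_libs(result.stdout)


# Hypothetical ldd output: all paths and sonames below are made up.
SAMPLE = (
    "\tlibexample_mpi.so.1 => /example/mpi/lib/libexample_mpi.so.1 (0x00007f0000000000)\n"
    "\tlibhwloc.so.5 => /example/system/lib64/libhwloc.so.5 (0x00007f0000100000)\n"
    "\tlibhwloc.so.15 => /example/hwloc-2.9.1/lib/libhwloc.so.15 (0x00007f0000200000)\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000300000)\n"
)

if __name__ == "__main__":
    libs = find_hwloc_libs(SAMPLE)
    for soname, path in libs:
        print(f"{soname} -> {path}")
    if len(libs) > 1:
        print("More than one hwloc library is being resolved.")
```

On a real build, calling `hwloc_libs_for` on the built `libAl.so` (and on the MPI library it links against) would show whether more than one hwloc copy is being picked up. More than one entry is a hint that versions are being mixed, not proof of the cause.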

ndryden commented 1 year ago

Closing. Please feel free to reopen if you have more details.