ROCm / rocHPL

High Performance Linpack for Next-Generation AMD HPC Accelerators

rocblas_initialize not declared in HPL_pdmatgen.cpp #2

Open yulingao opened 1 year ago

yulingao commented 1 year ago

When I compiled, I encountered this error:

rocHPL-main/src/matgen/HPL_pdmatgen.cpp:122:3: error: ‘rocblas_initialize’ was not declared in this scope
  122 |   rocblas_initialize();
      |   ^~~~~~

noelchalmers commented 1 year ago

Any details about your ROCm install and system configuration? rocblas_initialize is part of the rocBLAS headers and has been since ROCm 3.6 IIRC.
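A quick way to check whether an installed rocBLAS declares `rocblas_initialize` is to search its headers. This is only a minimal sketch: the helper name `header_declares` is hypothetical, and the include directory in the example is an assumption about the ROCm layout, so adjust it to your install.

```shell
# Hypothetical helper: succeed if any .h file under directory $1
# mentions the symbol $2 as a whole word.
header_declares() {
    grep -rqw --include='*.h' -- "$2" "$1" 2>/dev/null
}

# Example (the path is an assumption; adjust to your install):
# header_declares "${ROCM_PATH}/rocblas/include" rocblas_initialize \
#     && echo "rocblas_initialize is declared"
```

If the check fails for the include directory that CMake reports, the build is probably picking up a rocBLAS that predates the function.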

yulingao commented 1 year ago

Hi Noel Chalmers, yes, as you said, I solved this error by switching to the rocBLAS shipped with ROCm 3.9.1.

But another error occurred.

[ 98%] Building CXX object CMakeFiles/rochpl.dir/src/timer/HPL_ptimer_walltime.cpp.o
[100%] Linking CXX executable bin/rochpl
/opt/root/usr/bin/ld: cannot find -lclang_rt.builtins-x86_64
collect2: error: ld returned 1 exit status
make[2]: [CMakeFiles/rochpl.dir/build.make:1458: bin/rochpl-6.0.0.0] Error 1
make[1]: [CMakeFiles/Makefile2:76: CMakeFiles/rochpl.dir/all] Error 2
make: *** [Makefile:152: all] Error 2

I can compile HIP programs normally with hipcc, so I'm not sure what's wrong. Here is the CMake configure output:

-- The CXX compiler identification is GNU 9.3.0
-- Check for working CXX compiler: /public/software/gcc-9.3.0/bin/c++
-- Check for working CXX compiler: /public/software/gcc-9.3.0/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "1.8.3.1")
-- Found BLAS: /public/home/rocHPL-main/tpl/blis/lib/libblis.so
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Found MPI_CXX: /public/software/mpi/mpich/gcc/lib/libmpicxx.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- roctracer: /public/software/rocm/rocm-3.9.1/lib/libroctracer64.so
-- roctx: /public/software/rocm/rocm-3.9.1/lib/libroctx64.so
-- Found HIP: /public/software/rocm/rocm-3.9.1/hip (found version "3.9.20412-6d111f85")
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Found Threads: TRUE
-- rocBLAS version: 2.30.0.2773-91e553c8
-- rocBLAS include dirs: /public/software/rocm/rocm-3.9.1/rocblas/include
-- rocBLAS libraries: /public/software/rocm/rocm-3.9.1/rocblas/lib/librocblas.so.0.1.30901
-- Configuring done
-- Generating done

noelchalmers commented 1 year ago

Looks like the linker is looking for libclang_rt.builtins-x86_64. I'm not sure if this is needed, or if it's just missing in your system. Possibly this is related to this discussion: https://github.com/ROCm-Developer-Tools/HIP/pull/2219

Are these issues present if you use the latest ROCm v5.3? ROCm v3.9 is around 2 years old at this point.

yulingao commented 1 year ago

I found the library at ${ROCM_PATH}/llvm/lib/clang/12.0.0/lib/linux/libclang_rt.builtins-x86_64.a.

Therefore, I solved this error by adding that directory to the linker search path:

target_link_options(rochpl PRIVATE "-L${ROCM_PATH}/llvm/lib/clang/12.0.0/lib/linux")
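The `12.0.0` component in that path depends on the clang version bundled with the ROCm release, so hard-coding it may break on other installs. As a hedged sketch, the directory can be located by searching for the archive instead. The helper name `find_clang_rt_dir` is hypothetical, and the search root in the example assumes the same `${ROCM_PATH}/llvm/lib/clang` layout as above.

```shell
# Hypothetical helper: print the directory under $1 that contains the
# clang builtins archive (first match only); fail if none is found.
find_clang_rt_dir() {
    lib=$(find "$1" -type f -name 'libclang_rt.builtins-x86_64.a' 2>/dev/null | head -n 1)
    [ -n "$lib" ] && dirname "$lib"
}

# Example (the search root is an assumption; adjust to your install):
# find_clang_rt_dir "${ROCM_PATH}/llvm/lib/clang"
```

The printed directory is what would go after `-L` in the `target_link_options` line above.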

yulingao commented 1 year ago

BTW, does rocHPL support only Open MPI? Does it also support MPICH 3? I encountered this error at the end of the install:

CMakeFiles/rochpl.dir/src/HPL_pdinfo.cpp.o: In function `HPL_pdinfo(int, char**, HPL_S_test*, int*, int*, int*, int*, HPL_T_ORDER*, int*, int*, int*, int*, int*, int*, HPL_T_FACT*, int*, int*, int*, int*, int*, HPL_T_FACT*, int*, HPL_T_TOP*, int*, int*, HPL_T_SWAP*, int*, int*, int*, int*, int*, double*)':
HPL_pdinfo.cpp:(.text+0x6c): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0x7b): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0x4d5): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0x67f): undefined reference to `ompi_mpi_info_null'
HPL_pdinfo.cpp:(.text+0x686): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0x8e2): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0x8e7): undefined reference to `ompi_mpi_op_max'
HPL_pdinfo.cpp:(.text+0x955): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0x969): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0xa5e): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0xd1e): undefined reference to `ompi_mpi_comm_world'
HPL_pdinfo.cpp:(.text+0x28cb): undefined reference to `ompi_mpi_comm_world'
CMakeFiles/rochpl.dir/src/HPL_pdinfo.cpp.o:HPL_pdinfo.cpp:(.text+0x28fe): more undefined references to `ompi_mpi_comm_world' follow

The MPI I specified is MPICH 3. I tried to find where rocHPL uses ompi, but found nothing.
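One possible explanation, offered as an inference rather than something confirmed in this thread: the `ompi_`-prefixed names are the symbols that Open MPI's `mpi.h` expands `MPI_COMM_WORLD`, `MPI_INFO_NULL` and `MPI_MAX` to, so they can end up in an object file when an Open MPI header is found first on the include path while the MPICH libraries are linked. A minimal sketch for checking which implementation a given `mpi.h` belongs to follows. The helper name `mpi_flavor` is hypothetical, and the header path in the example is assumed from the library path in the CMake output.

```shell
# Hypothetical helper: guess which MPI implementation an mpi.h belongs to
# from the version macros it defines. MPICH derivatives also define
# MPICH_VERSION, so "mpich" here means "MPICH or MPICH-based".
mpi_flavor() {
    if grep -q 'OMPI_MAJOR_VERSION' "$1" 2>/dev/null; then
        echo "openmpi"
    elif grep -q 'MPICH_VERSION' "$1" 2>/dev/null; then
        echo "mpich"
    else
        echo "unknown"
    fi
}

# Example (the header path is an assumption; adjust to your install):
# mpi_flavor /public/software/mpi/mpich/gcc/include/mpi.h
```

Running it on every `mpi.h` reachable from the compiler's include directories would show whether an Open MPI header is shadowing the MPICH one.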

noelchalmers commented 1 year ago

We can support any MPI, you just need to pass the location to the MPI install via the --with-mpi= flag to the installer. CMake will find the MPI libraries in that location and link them.

That said, the MPI used must be "GPU-aware", meaning built with ROCm support. This allows us to pass pointers allocated with hipMalloc directly to MPI calls, and allows MPI to use fast GPU<->GPU or GPU<->NIC links under the hood. If your MPI doesn't support this, you'll likely encounter faults at runtime.