lamikr / rocm_sdk_builder

build failed: MIOpen: Linking error with undefined reference in libamdhip64.so #35

Closed Stefan-Olt closed 3 weeks ago

Stefan-Olt commented 4 weeks ago

I'm trying to build for an RX 6700 on Linux Mint (based on Ubuntu 22.04 LTS). Unfortunately, I get a linking error with MIOpenDriver:

[100%] Linking CXX executable ../bin/MIOpenDriver
cd /home/stefan/source/rocm_sdk_builder/builddir/034_miopen/driver && /home/stefan/.local/lib/python3.10/site-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/MIOpenDriver.dir/link.txt --verbose=1
/opt/rocm_sdk_611/bin/clang++ -O3 -DNDEBUG -s -L/opt/rocm_sdk_611/lib64 -L/opt/rocm_sdk_611/lib -L/opt/rocm_sdk_611/hsa/lib -L/opt/rocm_sdk_611/rocblas/lib -L/opt/rocm_sdk_611/hcc/lib -pthread CMakeFiles/MIOpenDriver.dir/main.cpp.o CMakeFiles/MIOpenDriver.dir/InputFlags.cpp.o -o ../bin/MIOpenDriver  -Wl,-rpath,/home/stefan/source/rocm_sdk_builder/builddir/034_miopen/lib:/opt/rocm/lib: ../lib/libMIOpen.so.1.0.60101 --hip-link --offload-arch=gfx1031 /opt/rocm_sdk_611/lib64/libamd_comgr.so.2.7.60101 /opt/rocm_sdk_611/lib64/librocblas.so.4.1.60101 /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d /opt/rocm_sdk_611/lib/clang/17/lib/linux/libclang_rt.builtins-x86_64.a /opt/rocm_sdk_611/lib/libboost_filesystem.a /usr/lib/x86_64-linux-gnu/librt.a /opt/rocm/lib/libroctx64.so 
/usr/bin/ld: /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d: undefined reference to `hsa_amd_vmem_address_reserve@ROCR_1'
/usr/bin/ld: /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d: undefined reference to `hsa_amd_vmem_address_free@ROCR_1'
/usr/bin/ld: /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d: undefined reference to `hsa_amd_vmem_handle_create@ROCR_1'
/usr/bin/ld: /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d: undefined reference to `hsa_amd_vmem_unmap@ROCR_1'
/usr/bin/ld: /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d: undefined reference to `hsa_amd_vmem_set_access@ROCR_1'
/usr/bin/ld: /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d: undefined reference to `hsa_amd_vmem_map@ROCR_1'
/usr/bin/ld: /opt/rocm_sdk_611/lib64/libamdhip64.so.6.1.40092-6d684796d: undefined reference to `hsa_amd_vmem_get_access@ROCR_1'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [driver/CMakeFiles/MIOpenDriver.dir/build.make:121: bin/MIOpenDriver] Error 1
make[2]: Leaving directory '/home/stefan/source/rocm_sdk_builder/builddir/034_miopen'
make[1]: *** [CMakeFiles/Makefile2:12331: driver/CMakeFiles/MIOpenDriver.dir/all] Error 2
make[1]: Leaving directory '/home/stefan/source/rocm_sdk_builder/builddir/034_miopen'
make: *** [Makefile:166: all] Error 2
build failed: MIOpen
  error in build cmd: make VERBOSE=1 -j12

build failed

Any idea what could cause this? Is a library missing from the link command, or was libamdhip64.so not built correctly?

lamikr commented 3 weeks ago

It looks like libhsa-runtime64.so, which provides those functions, is for some reason not being linked on Mint (Ubuntu 22.04 base). The ROCR runtime is built by binfo/007_01_rocr-runtime.binfo, and on my system it installs these files:

/opt/rocm_sdk_611/lib64/libhsa-runtime64.so.1.13.0
/opt/rocm_sdk_611/lib64/libhsa-runtime64.so.1
/opt/rocm_sdk_611/lib64/libhsa-runtime64.so
/opt/rocm_sdk_611/share/doc/hsa-runtime64/LICENSE.md
/opt/rocm_sdk_611/include/hsa
/opt/rocm_sdk_611/include/hsa/hsa_ven_amd_loader.h
/opt/rocm_sdk_611/include/hsa/amd_hsa_common.h
/opt/rocm_sdk_611/include/hsa/hsa_api_trace.h
/opt/rocm_sdk_611/include/hsa/amd_hsa_kernel_code.h
/opt/rocm_sdk_611/include/hsa/amd_hsa_queue.h
/opt/rocm_sdk_611/include/hsa/hsa_ext_image.h
/opt/rocm_sdk_611/include/hsa/amd_hsa_elf.h
/opt/rocm_sdk_611/include/hsa/amd_hsa_signal.h
/opt/rocm_sdk_611/include/hsa/hsa_ext_finalize.h
/opt/rocm_sdk_611/include/hsa/hsa_ext_amd.h
/opt/rocm_sdk_611/include/hsa/hsa_ven_amd_aqlprofile.h
/opt/rocm_sdk_611/include/hsa/hsa_amd_tool.h
/opt/rocm_sdk_611/include/hsa/Brig.h
/opt/rocm_sdk_611/include/hsa/hsa.h
/opt/rocm_sdk_611/lib64/cmake/hsa-runtime64/hsa-runtime64Targets.cmake
/opt/rocm_sdk_611/lib64/cmake/hsa-runtime64/hsa-runtime64Targets-release.cmake
/opt/rocm_sdk_611/lib64/cmake/hsa-runtime64/hsa-runtime64-config.cmake
/opt/rocm_sdk_611/lib64/cmake/hsa-runtime64/hsa-runtime64-config-version.cmake
lamikr commented 3 weeks ago

Could you try adding the following to src_projects/MIOpen/src/CMakeLists.txt, starting at line 873:

find_library(LIBHSARUNTIME hsa-runtime64)
if(LIBHSARUNTIME)
    MESSAGE(STATUS "hsa-runtime64: " ${LIBHSARUNTIME})
    target_internal_library(MIOpen ${LIBHSARUNTIME})
else()
    target_link_libraries(MIOpen PRIVATE hsa-runtime64)
endif()

Then delete the builddir/034_miopen directory so that the configure step runs again before the build, and rebuild with

./babs.sh -ba

Stefan-Olt commented 3 weeks ago

Thanks for your fast response. Unfortunately that didn't help; there was no difference. I looked at which functions hsa-runtime64 exports and found, for example, the entry hsa_amd_vmem_map@@ROCR_1, while the link step looks for hsa_amd_vmem_map@ROCR_1. It seems a build error introduced a second @ into the version of the runtime functions: it is always @@ROCR_1, whereas all other version information uses a single @. Any idea what caused that and how to fix it?

lamikr commented 3 weeks ago

Hmm, I have seen a case where linking some component failed because of an ld warning about multiple VERSION tags. I switched that project's linker to LLVM's lld and that fixed it. I wonder whether this is the same issue.

lamikr commented 3 weeks ago

If you are not able to find the issue, I will try your OS version at some point.

Stefan-Olt commented 3 weeks ago

How did you change the linker? I compared with the AMD build: its hsa-runtime64 library also has two @, so it's probably a linking issue with MIOpen.
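
A note on the two forms: in ELF symbol versioning, name@@VERSION marks the default version of a symbol that a library defines, while an undefined reference to it is written name@VERSION, so the doubled @ in hsa-runtime64 is expected. A minimal shell sketch for labelling symbol names accordingly; the helper name classify_symver is made up for illustration:

```shell
#!/bin/sh
# classify_symver (hypothetical helper): read one symbol name per line on
# stdin and label it by the ELF symbol-versioning form it uses.
classify_symver() {
    while IFS= read -r sym; do
        case "$sym" in
            *@@*) printf 'default-version %s\n' "$sym" ;;  # default version of a definition
            *@*)  printf 'single-at %s\n' "$sym" ;;        # a reference, or a non-default definition
            *)    printf 'unversioned %s\n' "$sym" ;;
        esac
    done
}

# Example with the two names quoted in this thread.
printf '%s\n' 'hsa_amd_vmem_map@@ROCR_1' 'hsa_amd_vmem_map@ROCR_1' | classify_symver
```

On a real system the names could be taken from the Name column of readelf -W --dyn-syms run on libhsa-runtime64.so and libamdhip64.so.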

Thanks for offering to try my OS. Linux Mint is based on Ubuntu LTS (currently still 22.04), but a new version based on 24.04 will be out in a few weeks. If it works on Ubuntu 24.04, it should work on the next Linux Mint as well, so I don't think it's worth much effort to get the old version working (unless there is a specific use case for Ubuntu 22.04).

Stefan-Olt commented 3 weeks ago

I was able to fix this by using the mold linker (I compiled and installed the latest version from source). I added this to BINFO_APP_CMAKE_CFG in 034_miopen.binfo:

-DCMAKE_EXE_LINKER_FLAGS='-fuse-ld=mold' -DCMAKE_SHARED_LINKER_FLAGS='-fuse-ld=mold'

This will of course only work when mold is installed.
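
Because the flags only work when the chosen linker is installed, the binfo change could be guarded. A minimal sketch, assuming binfo files are evaluated as shell scripts and BINFO_APP_CMAKE_CFG is a plain string; first_in_path and LINKER_NAME are hypothetical names, not part of rocm_sdk_builder:

```shell
#!/bin/sh
# first_in_path (hypothetical helper): print the first argument that names an
# executable found in PATH; print nothing and return 1 if none is found.
first_in_path() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Prefer mold, then lld; otherwise leave the default linker untouched.
case "$(first_in_path mold ld.lld)" in
    mold)   LINKER_NAME=mold ;;
    ld.lld) LINKER_NAME=lld ;;
    *)      LINKER_NAME= ;;
esac

if [ -n "${LINKER_NAME}" ]; then
    BINFO_APP_CMAKE_CFG="${BINFO_APP_CMAKE_CFG:-} -DCMAKE_EXE_LINKER_FLAGS=-fuse-ld=${LINKER_NAME} -DCMAKE_SHARED_LINKER_FLAGS=-fuse-ld=${LINKER_NAME}"
fi
```

The PATH check is only a heuristic: clang can also pick up an lld installed next to it even when that directory is not on PATH.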

lamikr commented 3 weeks ago

Thank you for the feedback and great that you found the workaround!

I think I needed to change the linker to lld in "016_03_llvm_project_openmp.binfo". It has this line:

BINFO_APP_CMAKE_CFG="${BINFO_APP_CMAKE_CFG} -DCMAKE_SHARED_LINKER_FLAGS=-fuse-ld=lld"

Stefan-Olt commented 3 weeks ago

I had the same issue with rocWMMA, and the same fix worked.

lamikr commented 3 weeks ago

Great. Did you test with '-fuse-ld=lld' or with '-fuse-ld=mold'?

Stefan-Olt commented 3 weeks ago

Tested both, and both work.

lamikr commented 3 weeks ago

Thanks for confirming. After a fresh install of Mint 21, I first ran apt update and apt upgrade, and then installed the same dependencies as for Ubuntu in install_deps.sh.

The first build break came in package 010_01_rocPRIM, which failed to find cmath. This seems to be a known problem with the default libstdc++-11-dev, and the solution was to run "sudo apt install libstdc++-12-dev" to get a newer version.

https://stackoverflow.com/questions/22752000/clang-cmath-file-not-found
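
One common cause of this error is that clang selects a GCC installation whose libstdc++ headers are not installed; clang reports its choice in verbose output as a "Selected GCC installation" line. A minimal sketch for checking this; selected_gcc_version is a hypothetical helper, and the /usr/include/c++/<version> header location is an assumption based on the usual Ubuntu layout:

```shell
#!/bin/sh
# selected_gcc_version (hypothetical helper): read "clang++ -v" output on stdin
# and print the last path component of the selected GCC installation, e.g. 12.
selected_gcc_version() {
    sed -n 's|^Selected GCC installation: .*/||p'
}

# Only run the check when a clang++ is available on PATH.
if command -v clang++ >/dev/null 2>&1; then
    ver="$(clang++ -v 2>&1 | selected_gcc_version)"
    # Assumed header location for libstdc++-<ver>-dev on Ubuntu-like systems.
    if [ -n "${ver}" ] && [ ! -d "/usr/include/c++/${ver}" ]; then
        echo "libstdc++ headers for GCC ${ver} not found; try: sudo apt install libstdc++-${ver}-dev"
    fi
fi
```

To check the SDK's compiler rather than the system one, the same pipeline could be run with /opt/rocm_sdk_611/bin/clang++ in place of clang++.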

Stefan-Olt commented 3 weeks ago

Yes, I think I installed that in the past, so I did not come across that problem.

lamikr commented 3 weeks ago

Another, similar issue has now come up. By default only libgfortran-11-dev and gfortran-11 were installed, and the libgfortran library could not be found. "sudo apt install libgfortran-12-dev gfortran-12" solved that.

So for Ubuntu and Mint there seems to be a need for a check: if the Mint version is 21 or the Ubuntu version is 22.04, then also run sudo apt install libgfortran-12-dev gfortran-12 libstdc++-12-dev in addition to the other dependencies.
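
Such a check could look roughly like the sketch below. It assumes the distribution is identified by the ID and VERSION_ID fields of /etc/os-release; needs_gcc12_dev_packages is a hypothetical helper and is not taken from install_deps.sh:

```shell
#!/bin/sh
# needs_gcc12_dev_packages (hypothetical helper): return 0 when the os-release
# file describes Ubuntu 22.04 or Linux Mint 21.x, non-zero otherwise.
needs_gcc12_dev_packages() {
    os_release_file="${1:-/etc/os-release}"
    [ -r "${os_release_file}" ] || return 2
    (
        # Source in a subshell so ID and VERSION_ID do not leak into the caller.
        . "${os_release_file}"
        case "${ID:-}:${VERSION_ID:-}" in
            ubuntu:22.04|linuxmint:21|linuxmint:21.*) exit 0 ;;
            *) exit 1 ;;
        esac
    )
}

if needs_gcc12_dev_packages; then
    # Print the command instead of running it, so the sketch has no side effects.
    echo "sudo apt install libgfortran-12-dev gfortran-12 libstdc++-12-dev"
fi
```

The function takes the file path as an optional argument so it can be tried against a sample file without changing the system.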

lamikr commented 3 weeks ago

This should be fixed by pull request https://github.com/lamikr/rocm_sdk_builder/pull/59