llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.16k stars 12.03k forks source link

[AMDGPU] clang misreports HIP installation path as `/usr/local` when it is installed in `/usr`. #78344

Closed AngryLoki closed 9 months ago

AngryLoki commented 10 months ago

clang misreports HIP installation path as /usr/local when it is installed in /usr (in Gentoo).

Context: CMake calls clang -v -print-targets to get HIP/ROCm root: CMakeDetermineHIPCompiler.cmake#L151

$ clang -v -print-targets -print-rocm-search-dirs 
clang version 17.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/17/bin
Configuration file: /etc/clang/x86_64-pc-linux-gnu-clang.cfg
System configuration file directory: /etc/clang
Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/13
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
ROCm installation search path: /usr/lib/llvm/17
ROCm installation search path: /usr/lib/llvm/17
ROCm installation search path: /usr/lib/llvm/17/bin/../../../../lib/clang/17
ROCm installation search path: /opt/rocm
ROCm installation search path: /usr/local
ROCm installation search path: /usr
Found HIP installation: /usr/local, version 5.7.31921
...

And fails with:

CMake Error at /usr/share/cmake/Modules/CMakeDetermineHIPCompiler.cmake:217 (message):
  The ROCm root directory:

   /usr/local

  does not contain the HIP runtime CMake package, expected at one of:

   /usr/local/lib/cmake/hip-lang/hip-lang-config.cmake
   /usr/local/lib64/cmake/hip-lang/hip-lang-config.cmake

Call Stack (most recent call first):
  Applications/bitonic_sort/CMakeLists.txt:37 (enable_language)

Meanwhile, /usr/local is empty (docs)

$ du /usr/local
0       /usr/local/bin
0       /usr/local/lib
0       /usr/local/sbin
0       /usr/local/lib64
0       /usr/local

In Gentoo HIP/ROCm is installed into /usr. Version file is installed into /usr/share/hip/version. HIP runtime is detected by https://github.com/llvm/llvm-project/blob/bbb3f9b1321176a012dac7d50da54b869a85e9bc/clang/lib/Driver/ToolChains/AMDGPU.cpp#L478-L492 It checks ROCm installation search paths, at some point it checks /usr/local (before /usr), then according to the code finds /usr/local/../share/hip/version and completes the search.

I examined clang code and I don't see any way to manipulate clang to report the correct path. There is ROCm installation search path: /usr, but it is: 1) checked after /usr/local and 2) checked by trying to find a file /usr/../share/hip/version, which does not exist.

I'm aware that it is possible to patch CMakeDetermineHIPCompiler.cmake or set env vars HIP_PATH, ROCM_PATH, etc, but I want to see correct path detection out-of-the-box without patches or extra environment variables (maybe with CMake option like -DCLANG_ROCM_DIR=/usr for Gentoo Clang ebuild maintainers).

Could you revisit ROCm detection code and provide any solution? Thanks!

llvmbot commented 10 months ago

@llvm/issue-subscribers-clang-driver

Author: None (AngryLoki)

clang misreports HIP installation path as `/usr/local` when it is installed in `/usr` (in Gentoo). Context: CMake calls `clang -v -print-targets` to get get HIP/ROCm root: [CMakeDetermineHIPCompiler.cmake#L151](https://github.com/Kitware/CMake/blob/v3.28.1/Modules/CMakeDetermineHIPCompiler.cmake#L151) ``` $ clang -v -print-targets -print-rocm-search-dirs clang version 17.0.6 Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/lib/llvm/17/bin Configuration file: /etc/clang/x86_64-pc-linux-gnu-clang.cfg System configuration file directory: /etc/clang Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/13 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Selected multilib: .;@m64 ROCm installation search path: /usr/lib/llvm/17 ROCm installation search path: /usr/lib/llvm/17 ROCm installation search path: /usr/lib/llvm/17/bin/../../../../lib/clang/17 ROCm installation search path: /opt/rocm ROCm installation search path: /usr/local ROCm installation search path: /usr Found HIP installation: /usr/local, version 5.7.31921 ... ``` Meanwhile, `/usr/local` is empty ([docs](https://devmanual.gentoo.org/general-concepts/filesystem/index.html)) ```sh $ du /usr/local 0 /usr/local/bin 0 /usr/local/lib 0 /usr/local/sbin 0 /usr/local/lib64 0 /usr/local ``` In Gentoo HIP/ROCm is installed into `/usr`. Version file is installed into `/usr/share/hip/version`. HIP runtime is detected by https://github.com/llvm/llvm-project/blob/bbb3f9b1321176a012dac7d50da54b869a85e9bc/clang/lib/Driver/ToolChains/AMDGPU.cpp#L478-L492 It checks ROCm installation search paths, at some point it checks `/usr/local` (before `/usr`), then according to the code finds `/usr/local/../share/hip/version` and completes the search. I examined clang code and I don't see any way to manipulate clang to report the correct path. There is `ROCm installation search path: /usr`, but it is: 1) checked after `/usr/local` and 2) checked by trying to find a file `/usr/../share/hip/version`, which does not exist. I'm aware that it is possible to patch CMakeDetermineHIPCompiler.cmake or set env vars HIP_PATH, ROCM_PATH, etc, but I want to see correct path detection out-of-the-box without patches or extra environment variables (maybe with CMake option like `-DCLANG_ROCM_DIR=/usr` for Gentoo Clang maintainer). Could you revisit ROCm detection code and provide any solution? Thanks!
AngryLoki commented 10 months ago

Tagging @yxsamliu, @cgmb and @Artem-B as related AMDGPU.cpp editors.

yxsamliu commented 10 months ago

I think we could fix clang not to check .. for /usr/local since that should be /usr

AngryLoki commented 10 months ago

@yxsamliu , I agree with that.

Regarding /usr/local, I'm not even sure which distro installs hip libraries there. But if some distro decides to copy CUDA layout, it will be /usr/local/hip-X.Y/bin, /usr/local/hip-X.Y/lib64, /usr/local/hip-X.Y/share/cmake, and so on.

cgmb commented 10 months ago

I wonder if this might be why Fedora packagers have been complaining about having to set environment variables to ensure HIP is detected (cc. @trixirt). HIP on Debian is still on clang-15 and are using their own patches for finding HIP in /usr, so they haven't encountered this problem yet.

No distros install to /usr/local, because /usr/local is for packages that are not managed by the system. However, the proper behaviour is pretty well-defined. It should be treated the same as /usr, but be a higher priority in the search.

I think the only time when searching ../share/hip/version is relevant is when the installation prefix is /opt/rocm.*/llvm. Could we maybe just guard that the addition of that suffix so it's only included in the search list for that specific candidate install path?

yxsamliu commented 9 months ago

Limiting ../share/hip/version to /opt/rocm/hip* may cause regressions for situations like my_rocm_build/my_hip_build. It would be safer to exclude ../share/hip/version from /usr/local.