ROCm / ROCm-CompilerSupport

The compiler support repository provides various Lightning Compiler related services.

Strange -resource-dir and -internal-isystem when executing compile_hip_test_in_process #48

Closed littlewu2508 closed 5 months ago

littlewu2508 commented 1 year ago

I was packaging this repo as dev-libs/rocm-comgr for Gentoo, built against vanilla clang, and ran into a series of problems.

On Gentoo, the clang include directory is located at /usr/lib/clang/CLANG_VERSION/include, while the llvm install prefix is /usr/lib/llvm/LLVM_VERSION, so assuming that the clang include directory is under "lib/clang" in the llvm prefix does not work. Because llvm/clang may bump its patch version (e.g. 14.0.5->14.0.6), this path may change, so hard-coding it is not a good idea either.
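One way to avoid hard-coding the versioned path is to ask clang for its resource directory at configure time. The sketch below assumes a clang binary is reachable under the name passed in; `query_resource_dir` and `builtin_include_dir` are hypothetical helper names, while `-print-resource-dir` is an existing clang driver option.

```python
import posixpath
import subprocess


def query_resource_dir(clang: str = "clang") -> str:
    """Ask the given clang binary for its resource directory.

    Assumption: `clang` names an installed clang executable.
    """
    result = subprocess.run(
        [clang, "-print-resource-dir"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def builtin_include_dir(resource_dir: str) -> str:
    """Builtin headers such as stddef.h live in <resource-dir>/include."""
    return posixpath.join(resource_dir, "include")
```

With the layout described above, `builtin_include_dir(query_resource_dir("/usr/lib/llvm/14/bin/clang"))` would be expected to yield the /usr/lib/clang/CLANG_VERSION/include directory without the packager spelling out the patch version.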

Believing that clang could handle its -internal-isystem correctly on its own, I removed the 4 lines of explicit -isystem. However, compile_hip_test_in_process then fails.

After comparing with hipcc, I found that the problem may originate from an incorrect -resource-dir and -internal-isystem produced by the clang driver.

Running AMD_COMGR_SAVE_TEMPS=1 AMD_COMGR_REDIRECT_LOGS=stdout AMD_COMGR_EMIT_VERBOSE_LOGS=1 ./compile_hip_test_in_process, it gave:

amd_comgr_do_action:
          ActionKind: AMD_COMGR_ACTION_COMPILE_SOURCE_TO_BC
             IsaName: amdgcn-amd-amdhsa--gfx906
             Options: ""
                Path:
            Language: AMD_COMGR_LANGUAGE_HIP
COMGR::executeInProcessDriver argv: clang "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu" "-emit-llvm-bc" "-emit-llvm-uselists" "-clear-ast-before-backend" "-main
-file-name" "source2.hip" "-mrelocation-model" "pic" "-pic-level" "1" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-mconstructor-aliases" "-aux-target-cpu" "x
86-64" "-fcuda-is-device" "-mllvm" "-amdgpu-internalize-symbols" "-fcuda-allow-variadic-functions" "-fvisibility" "hidden" "-fapply-global-visibility-to-externs" "-target-cpu" "gfx906" "-mllvm
" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-resource-dir" "../../../../lib/clang/14.0.5" "-internal-isystem" "../../../../lib/clang/14.0.5" "-idirafter" "/usr/lib/llvm/
14/include" "-I" "/tmp/comgr-f6164d/include" "-internal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11" "-internal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/
g++-v11/x86_64-pc-linux-gnu" "-internal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11/backward" "-internal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11
" "-internal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11/x86_64-pc-linux-gnu" "-internal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11/backward" "-int
ernal-isystem" "../../../../lib/clang/14.0.5/include" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/../../../../x86_64-pc-linux-gnu/incl
ude" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-internal-isystem" "../../../../lib/clang/14.0.5/include" "-internal-isystem" "/usr/local/include" "-int
ernal-isystem" "/usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/../../../../x86_64-pc-linux-gnu/include" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-std=c++11"
"-fdeprecated-macro" "-fno-autolink" "-fdebug-compilation-dir=/tmp/portage/dev-libs/rocm-comgr-5.1.3/work/ROCm-CompilerSupport-rocm-5.1.3/lib/comgr_build/test" "-ferror-limit" "19" "-fhip-new-
launch-api" "-fgnuc-version=4.2.1" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-cuid=b4d7cb49db9450ba" "-fcuda-allow-variadic-functions" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1
" "-o" "/tmp/comgr-f6164d/output/source2.hip.bc" "-x" "hip" "/tmp/comgr-f6164d/input/source2.hip"
In file included from /tmp/comgr-f6164d/input/source2.hip:38:
In file included from /usr/include/hip/hip_runtime.h:49:
/usr/include/stdio.h:33:10: fatal error: 'stddef.h' file not found
#include <stddef.h>
         ^~~~~~~~~~
1 error generated when compiling for gfx906.
        ReturnStatus: AMD_COMGR_STATUS_ERROR

FAILED: amd_comgr_do_action
 REASON: ERROR

As you can see, the compiler arguments contain an incorrect "-resource-dir" "../../../../lib/clang/14.0.5" "-internal-isystem" "../../../../lib/clang/14.0.5". These paths are clearly relative to the directory of the clang executable, /usr/lib/llvm/14/bin. If I prepend this prefix, so that the arguments become "-resource-dir" "/usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5" "-internal-isystem" "/usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5", then everything works.
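The effect of the missing prefix can be illustrated with a purely lexical path computation. This is a minimal sketch: `resolve_resource_dir` is a hypothetical helper, the working directory is assumed to be the one shown in -fdebug-compilation-dir in the log above, and symlinks are ignored.

```python
import posixpath


def resolve_resource_dir(resource_dir: str, base_dir: str) -> str:
    """Join a possibly relative resource dir onto base_dir and normalize it lexically."""
    return posixpath.normpath(posixpath.join(base_dir, resource_dir))


RELATIVE = "../../../../lib/clang/14.0.5"

# Resolved against the directory of the clang executable.
print(resolve_resource_dir(RELATIVE, "/usr/lib/llvm/14/bin"))
# -> /usr/lib/clang/14.0.5

# Resolved against the test's working directory (assumed from the log).
TEST_CWD = (
    "/tmp/portage/dev-libs/rocm-comgr-5.1.3/work/"
    "ROCm-CompilerSupport-rocm-5.1.3/lib/comgr_build/test"
)
print(resolve_resource_dir(RELATIVE, TEST_CWD))
# -> /tmp/portage/dev-libs/rocm-comgr-5.1.3/work/lib/clang/14.0.5
```

Only the first result matches the Gentoo location /usr/lib/clang/CLANG_VERSION; the second points into the build tree, where no include/stddef.h would be expected, which is consistent with the 'stddef.h' file not found error.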

Calling hipcc (or clang directly) from the command line does not have this problem. For example, with HIPCC_VERBOSE=1 hipcc --offload-arch=gfx1031 main.cpp, I got

/usr/lib/llvm/14/bin/clang++  -std=c++11 -isystem "/usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5/include/.." --offload-arch=gfx1031 -O3 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false --rocm-path="/usr" --hip-device-lib-path="/usr/lib/amdgcn/bitcode" -fhip-new-launch-api  -L"/usr/lib64" -O3 -lgcc_s -lgcc -lpthread -lm -lrt  -Wl,--enable-new-dtags -lamdhip64  -x hip main.cpp -x hip

You can see there is no -isystem "/usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5/include". Adding --verbose, I got

"/usr/lib/llvm/14/bin/clang-14" -cc1 -triple amdgcn-amd-amdhsa -aux-triple x86_64-pc-linux-gnu -emit-obj --mrelax-relocations -disable-free -clear-ast-before-backend -main-file-name main.cpp
-mrelocation-model pic -pic-level 1 -fhalf-no-semantic-interposition -mframe-pointer=none -fno-rounding-math -mconstructor-aliases -aux-target-cpu x86-64 -fcuda-is-device -mllvm -amdgpu-intern
alize-symbols -fcuda-allow-variadic-functions -fvisibility hidden -fapply-global-visibility-to-externs -mlink-builtin-bitcode /usr/lib/amdgcn/bitcode/hip.bc -mlink-builtin-bitcode /usr/lib/amd
gcn/bitcode/ocml.bc -mlink-builtin-bitcode /usr/lib/amdgcn/bitcode/ockl.bc -mlink-builtin-bitcode /usr/lib/amdgcn/bitcode/oclc_daz_opt_off.bc -mlink-builtin-bitcode /usr/lib/amdgcn/bitcode/ocl
c_unsafe_math_off.bc -mlink-builtin-bitcode /usr/lib/amdgcn/bitcode/oclc_finite_only_off.bc -mlink-builtin-bitcode /usr/lib/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc -mlink-builtin-bitc
ode /usr/lib/amdgcn/bitcode/oclc_wavefrontsize64_off.bc -mlink-builtin-bitcode /usr/lib/amdgcn/bitcode/oclc_isa_version_1031.bc -target-cpu gfx1031 -mllvm -treat-scalable-fixed-error-as-warnin
g -debugger-tuning=gdb -v -resource-dir /usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5 -internal-isystem /usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5 -idirafter /usr/include -internal-
isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11 -internal-isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11/x86_64-pc-linux-gnu -internal-isystem /usr/lib/gcc/x86
_64-pc-linux-gnu/11.3.0/include/g++-v11/backward -internal-isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11 -internal-isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++
-v11/x86_64-pc-linux-gnu -internal-isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/include/g++-v11/backward -internal-isystem /usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5/include -interna
l-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/../../../../x86_64-pc-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr
/include -internal-isystem /usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-pc-linux-gnu/11.3.0/../../../../
x86_64-pc-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=c++11 -fdeprecated-macro -fno-autolink -fdebug-compilation-dir=/data/wuyy/test_hi
p -ferror-limit 19 -fhip-new-launch-api -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -mllvm -amdgpu-early-inline-all=true -mllvm -amdg
pu-function-calls=false -cuid=fbe6e7b380fa0642 -fcuda-allow-variadic-functions -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/main-6f0332/main-gfx1031.o -x hip main.cpp

which shows clang got the correct -resource-dir and -internal-isystem.
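The comparison between the two -cc1 command lines can be automated by listing every path-valued argument that is not absolute. The following is a rough sketch; `relative_path_args` and the `PATH_FLAGS` set are hypothetical names, and the set only covers the flags that appear in the logs above.

```python
import posixpath
import shlex
from typing import List, Tuple

# Flags whose following argument is a directory (only those seen in the logs).
PATH_FLAGS = {
    "-resource-dir",
    "-internal-isystem",
    "-internal-externc-isystem",
    "-idirafter",
    "-isystem",
}


def relative_path_args(cc1_command: str) -> List[Tuple[str, str]]:
    """Return (flag, path) pairs whose path argument is not absolute."""
    argv = shlex.split(cc1_command)  # also strips the quotes used in the Comgr log
    found = []
    for flag, value in zip(argv, argv[1:]):
        if flag in PATH_FLAGS and not posixpath.isabs(value):
            found.append((flag, value))
    return found


# Shortened excerpts of the two command lines shown above.
COMGR_EXCERPT = (
    'clang "-cc1" "-resource-dir" "../../../../lib/clang/14.0.5" '
    '"-internal-isystem" "../../../../lib/clang/14.0.5" '
    '"-idirafter" "/usr/lib/llvm/14/include" '
    '"-internal-externc-isystem" "/usr/include"'
)
HIPCC_EXCERPT = (
    "/usr/lib/llvm/14/bin/clang-14 -cc1 "
    "-resource-dir /usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5 "
    "-internal-isystem /usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5 "
    "-idirafter /usr/include"
)

print(relative_path_args(COMGR_EXCERPT))
print(relative_path_args(HIPCC_EXCERPT))
```

For these excerpts, the Comgr command line reports the relative -resource-dir and -internal-isystem entries, while the hipcc command line reports none.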

To track down the problem, I built clang and rocm-comgr with debug symbols and -O0. Using gdb ./compile_hip_test_in_process, I found that the incomplete -resource-dir is read at CmdArgs.push_back(D.ResourceDir.c_str());. I did the same with gdb --args /usr/lib/llvm/14/bin/clang++ ......, and there the resource dir is correct.
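A possible explanation, stated here as an unverified assumption rather than a reading of the clang source, is that the driver derives its resource dir by joining a build-time relative path onto the directory part of the executable path it was given. The two logs differ in exactly that input: the in-process driver is invoked as plain `clang`, while hipcc runs /usr/lib/llvm/14/bin/clang++. The model below is hypothetical (`derive_resource_dir` and `CONFIGURED_RESOURCE_DIR` are illustrative names), but it reproduces both observed values.

```python
import posixpath

# Assumption: a relative resource dir fixed when clang was built,
# taken from the -resource-dir value in the Comgr log.
CONFIGURED_RESOURCE_DIR = "../../../../lib/clang/14.0.5"


def derive_resource_dir(
    executable_path: str, configured: str = CONFIGURED_RESOURCE_DIR
) -> str:
    """Hypothetical model: join the configured dir onto the executable's directory."""
    exe_dir = posixpath.dirname(executable_path)  # "" when there is no directory part
    return posixpath.join(exe_dir, configured)


# Executable name without a directory, as in the Comgr argv.
print(derive_resource_dir("clang"))
# -> ../../../../lib/clang/14.0.5

# Full path, as used by hipcc.
print(derive_resource_dir("/usr/lib/llvm/14/bin/clang++"))
# -> /usr/lib/llvm/14/bin/../../../../lib/clang/14.0.5
```

If this model holds, giving the in-process driver the full path of the installed clang executable would be one direction to investigate.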

So in conclusion, the clang driver called from ROCm-CompilerSupport seems to produce an incomplete -resource-dir and -internal-isystem, while clang itself does not have the issue. I would like to know why this happens and how to resolve it; I think fixing this is the best approach for packaging.

The ROCm-CompilerSupport version I'm packaging is ROCm-5.1.3.