ROCm / clr

[Issue]: amd_math_functions.h is missing many math functions like `max`? #82

Open littlewu2508 opened 1 month ago

littlewu2508 commented 1 month ago

Problem Description

When running the test suite for comgr, 4 tests failed:

         32 - comgr_compile_source_to_executable (Failed)
         34 - comgr_compile_hip_test_in_process (Failed)
         36 - comgr_compile_hip_to_relocatable (Failed)
         37 - comgr_mangled_names_hip_test (Failed)

Take compile_hip_to_relocatable as an example:

AMD_COMGR_SAVE_TEMPS=1 AMD_COMGR_REDIRECT_LOGS=stdout AMD_COMGR_EMIT_VERBOSE_LOGS=1 /fast/portage/dev-libs/rocm-comgr-6.1.1/work/llvm-project-rocm-6.1.1/amd/comgr_build/test/compile_hip_to_relocatable
amd_comgr_do_action:
          ActionKind: AMD_COMGR_ACTION_COMPILE_SOURCE_TO_RELOCATABLE
             IsaName: amdgcn-amd-amdhsa--gfx906
             Options: "-mllvm" "-amdgpu-early-inline-all" "-fno-slp-vectorize"
                Path:
            Language: AMD_COMGR_LANGUAGE_HIP
 Comgr Branch-Commit: not-available-not-available
         LLVM Commit:
    Compilation Args:  "--offload-arch=gfx906" "-c" "-fhip-emit-relocatable" "-mllvm" "-amdgpu-internalize-symbols" "-I" "/tmp/comgr-dde422/include" "-x" "hip" "-std=c++11" "-target" "x86_64-unknown-linux-gnu" "--cuda-device-only" "--rocm-path=/tmp/comgr-dde422/rocm" "-mllvm" "-amdgpu-early-inline-all" "-fno-slp-vectorize" "-save-temps=/tmp/comgr-dde422/output" "/tmp/comgr-dde422/input/source1.hip" "-o" "/tmp/comgr-dde422/output/source1.hip.o"
     Driver Job Args: clang "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu" "-E" "-save-temps=/tmp/comgr-dde422/output" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "source1.hip" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-mconstructor-aliases" "-aux-target-cpu" "x86-64" "-fcuda-is-device" "-mllvm" "-amdgpu-internalize-symbols" "-fcuda-allow-variadic-functions" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/hip.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_isa_version_906.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx906" "-debugger-tuning=gdb" "-resource-dir" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-internal-isystem" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-idirafter" "/tmp/comgr-dde422/rocm/include" "-include" "/opt/gentoo/usr/include/gentoo/fortify.h" "-include" "/opt/gentoo/usr/include/gentoo/maybe-stddefs.h" "-I" "/tmp/comgr-dde422/include" "-isysroot" "/opt/gentoo" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13" "-internal-isystem" 
"/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/x86_64-pc-linux-gnu" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/backward" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/x86_64-pc-linux-gnu" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/backward" "-internal-isystem" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include" "-internal-isystem" "/opt/gentoo/usr/local/include" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/include" "-internal-externc-isystem" "/opt/gentoo/include" "-internal-externc-isystem" "/opt/gentoo/usr/include" "-internal-isystem" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include" "-internal-isystem" "/opt/gentoo/usr/local/include" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/include" "-internal-externc-isystem" "/opt/gentoo/include" "-internal-externc-isystem" "/opt/gentoo/usr/include" "-std=c++11" "-fdeprecated-macro" "-fno-autolink" "-fdebug-compilation-dir=/fast/gentoo/repos/gentoo/dev-libs/rocm-comgr" "-ferror-limit" "19" "-fhip-new-launch-api" "-fgnuc-version=4.2.1" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-mllvm" "-amdgpu-internalize-symbols" "-mllvm" "-amdgpu-early-inline-all" "-cuid=6bf4e73c86cb8f57" "-fcuda-allow-variadic-functions" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "source1-hip-amdgcn-amd-amdhsa-gfx906.hipi" "-x" "hip" "/tmp/comgr-dde422/input/source1.hip"
     Driver Job Args: clang "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu" "-emit-llvm-bc" "-emit-llvm-uselists" "-save-temps=/tmp/comgr-dde422/output" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "source1.hip" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-mconstructor-aliases" "-aux-target-cpu" "x86-64" "-fcuda-is-device" "-mllvm" "-amdgpu-internalize-symbols" "-fcuda-allow-variadic-functions" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/hip.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_isa_version_906.bc" "-mlink-builtin-bitcode" "/tmp/comgr-dde422/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx906" "-debugger-tuning=gdb" "-resource-dir" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-std=c++11" "-fdeprecated-macro" "-fno-autolink" "-fdebug-compilation-dir=/fast/gentoo/repos/gentoo/dev-libs/rocm-comgr" "-ferror-limit" "19" "-fhip-new-launch-api" "-fgnuc-version=4.2.1" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-mllvm" "-amdgpu-internalize-symbols" "-mllvm" "-amdgpu-early-inline-all" "-disable-llvm-passes" "-cuid=6bf4e73c86cb8f57" "-fcuda-allow-variadic-functions" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "source1-hip-amdgcn-amd-amdhsa-gfx906.bc" "-x" "hip-cpp-output" "source1-hip-amdgcn-amd-amdhsa-gfx906.hipi"
In file included from /tmp/comgr-dde422/input/source1.hip:38:
In file included from /opt/gentoo/usr/include/hip/hip_runtime.h:62:
In file included from /opt/gentoo/usr/include/hip/amd_detail/amd_hip_runtime.h:389:
/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include/__clang_cuda_complex_builtins.h:194:30: error: use of undeclared identifier 'max'; did you mean 'fmax'?
  194 |   double __logbw = std::logb(max(std::abs(__c), std::abs(__d)));
      |                              ^
/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include/__clang_cuda_math_forward_declares.h:73:81: note: 'fmax' declared here
   73 | static __inline__ __attribute__((always_inline)) __attribute__((device)) double fmax(double, double);
      |                                                                                 ^
In file included from /tmp/comgr-dde422/input/source1.hip:38:
In file included from /opt/gentoo/usr/include/hip/hip_runtime.h:62:
In file included from /opt/gentoo/usr/include/hip/amd_detail/amd_hip_runtime.h:389:
/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include/__clang_cuda_complex_builtins.h:227:29: error: use of undeclared identifier 'max'; did you mean 'fmax'?
  227 |   float __logbw = std::logb(max(std::abs(__c), std::abs(__d)));
      |                             ^
/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include/__clang_cuda_math_forward_declares.h:74:80: note: 'fmax' declared here
   74 | static __inline__ __attribute__((always_inline)) __attribute__((device)) float fmax(float, float);
      |                                                                                ^
2 errors generated when compiling for gfx906.
        ReturnStatus: AMD_COMGR_STATUS_ERROR

FAILED: amd_comgr_do_action
 REASON: ERROR

A similar issue is reported in https://github.com/iree-org/iree/issues/16899

It seems that the missing max function in amd_math_functions.h causes this. https://github.com/ROCm/clr/commit/d7d0f1131882ea1f42b7c42235b66d88cd9305a1 removed many of the math functions from amd_math_functions.h. According to the commit message, those functions now live in the hiprtc headers, but I cannot find them there.
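To illustrate what the failing line needs, here is a minimal host-only C++ sketch. It is not the code removed by that commit: the two `max` overloads and the `scale_exponent` helper are hypothetical stand-ins, and a real device header would presumably declare them with `__device__`. It only shows that the unqualified `max(std::abs(__c), std::abs(__d))` call compiles once some `max` overload for the floating-point types is visible before the point of use.

```cpp
#include <cmath>

// Hypothetical stand-ins for the unqualified max overloads that the
// complex-builtins header expects to be visible at its point of use.
// A real device header would mark these __device__; this host sketch does not.
static inline double max(double a, double b) { return std::fmax(a, b); }
static inline float max(float a, float b) { return std::fmax(a, b); }

// Mirrors the shape of the failing expression:
//   std::logb(max(std::abs(__c), std::abs(__d)))
// Without the overloads above, the unqualified call to max is an
// undeclared identifier and the function does not compile.
double scale_exponent(double c, double d) {
  return std::logb(max(std::abs(c), std::abs(d)));
}

float scale_exponent(float c, float d) {
  return std::logb(max(std::abs(c), std::abs(d)));
}
```

Removing the two `max` definitions from this sketch reproduces the same class of error as in the log above ("use of undeclared identifier 'max'").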

After I added those functions back to amd_math_functions.h, 3 of the tests passed. compile_source_to_executable still failed with no clear reason, even though the compilation appears to succeed:

AMD_COMGR_SAVE_TEMPS=1 AMD_COMGR_REDIRECT_LOGS=stdout AMD_COMGR_EMIT_VERBOSE_LOGS=1 /fast/portage/dev-libs/rocm-comgr-6.1.1/work/llvm-project-rocm-6.1.1/amd/comgr_build/test/compile_source_to_executable
amd_comgr_do_action:
          ActionKind: AMD_COMGR_ACTION_COMPILE_SOURCE_TO_EXECUTABLE
             IsaName: amdgcn-amd-amdhsa--gfx900
             Options:
                Path:
            Language: AMD_COMGR_LANGUAGE_OPENCL_1_2
 Comgr Branch-Commit: not-available-not-available
         LLVM Commit:
    Compilation Args:  "-target" "amdgcn-amd-amdhsa" "-mcpu=gfx900" "-I" "/tmp/comgr-7234d0/include" "-x" "cl" "-std=cl1.2" "-cl-no-stdinc" "--rocm-path=/tmp/comgr-7234d0/rocm" "/tmp/comgr-7234d0/input/source1.cl" "-o" "/tmp/comgr-7234d0/output/source1.cl.so"
     Driver Job Args: clang "-cc1" "-triple" "amdgcn-amd-amdhsa" "-emit-obj" "-mrelax-all" "-dumpdir" "/tmp/comgr-7234d0/output/source1.cl.so-" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "source1.cl" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/opencl.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/oclc_isa_version_900.bc" "-mlink-builtin-bitcode" "/tmp/comgr-7234d0/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx900" "-debugger-tuning=gdb" "-resource-dir" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-c-isystem" "/opt/gentoo/usr/lib/llvm/17/include/gpu-none-llvm" "-I" "/tmp/comgr-7234d0/include" "-isysroot" "/opt/gentoo" "-std=cl1.2" "-fdebug-compilation-dir=/home/wuyy" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fno-threadsafe-statics" "-fcolor-diagnostics" "-faddrsig" "-o" "/tmp/source1-f14cf8.o" "-x" "cl" "/tmp/comgr-7234d0/input/source1.cl"
     Driver Job Args: lld "/tmp/source1-f14cf8.o" "-plugin-opt=mcpu=gfx900" "--no-undefined" "-shared" "-o" "/tmp/comgr-7234d0/output/source1.cl.so"
        ReturnStatus: AMD_COMGR_STATUS_SUCCESS

amd_comgr_do_action:
          ActionKind: AMD_COMGR_ACTION_COMPILE_SOURCE_TO_EXECUTABLE
             IsaName: amdgcn-amd-amdhsa--gfx900
             Options:
                Path:
            Language: AMD_COMGR_LANGUAGE_HIP
 Comgr Branch-Commit: not-available-not-available
         LLVM Commit:
    Compilation Args:  "--offload-arch=gfx900" "-I" "/tmp/comgr-197228/include" "-x" "hip" "-std=c++11" "-target" "x86_64-unknown-linux-gnu" "--cuda-device-only" "--rocm-path=/tmp/comgr-197228/rocm" "-save-temps=/tmp/comgr-197228/output" "/tmp/comgr-197228/input/source1.hip" "-o" "/tmp/comgr-197228/output/source1.hip.so"
     Driver Job Args: clang "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu" "-E" "-dumpdir" "/tmp/comgr-197228/output/source1.hip.so-" "-save-temps=/tmp/comgr-197228/output" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "source1.hip" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-mconstructor-aliases" "-aux-target-cpu" "x86-64" "-fcuda-is-device" "-mllvm" "-amdgpu-internalize-symbols" "-fcuda-allow-variadic-functions" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/hip.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_isa_version_900.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx900" "-debugger-tuning=gdb" "-resource-dir" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-internal-isystem" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-idirafter" "/tmp/comgr-197228/rocm/include" "-include" "/opt/gentoo/usr/include/gentoo/fortify.h" "-include" "/opt/gentoo/usr/include/gentoo/maybe-stddefs.h" "-I" "/tmp/comgr-197228/include" "-isysroot" "/opt/gentoo" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13" 
"-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/x86_64-pc-linux-gnu" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/backward" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/x86_64-pc-linux-gnu" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/backward" "-internal-isystem" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include" "-internal-isystem" "/opt/gentoo/usr/local/include" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/include" "-internal-externc-isystem" "/opt/gentoo/include" "-internal-externc-isystem" "/opt/gentoo/usr/include" "-internal-isystem" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17/include" "-internal-isystem" "/opt/gentoo/usr/local/include" "-internal-isystem" "/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/include" "-internal-externc-isystem" "/opt/gentoo/include" "-internal-externc-isystem" "/opt/gentoo/usr/include" "-std=c++11" "-fdeprecated-macro" "-fno-autolink" "-fdebug-compilation-dir=/home/wuyy" "-ferror-limit" "19" "-fhip-new-launch-api" "-fgnuc-version=4.2.1" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-cuid=eb894c2ff96f62a8" "-fcuda-allow-variadic-functions" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "source1-hip-amdgcn-amd-amdhsa-gfx900.hipi" "-x" "hip" "/tmp/comgr-197228/input/source1.hip"
     Driver Job Args: clang "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu" "-emit-llvm-bc" "-emit-llvm-uselists" "-dumpdir" "/tmp/comgr-197228/output/source1.hip.so-" "-save-temps=/tmp/comgr-197228/output" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "source1.hip" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-mconstructor-aliases" "-aux-target-cpu" "x86-64" "-fcuda-is-device" "-mllvm" "-amdgpu-internalize-symbols" "-fcuda-allow-variadic-functions" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/hip.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_isa_version_900.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx900" "-debugger-tuning=gdb" "-resource-dir" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-std=c++11" "-fdeprecated-macro" "-fno-autolink" "-fdebug-compilation-dir=/home/wuyy" "-ferror-limit" "19" "-fhip-new-launch-api" "-fgnuc-version=4.2.1" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-disable-llvm-passes" "-cuid=eb894c2ff96f62a8" "-fcuda-allow-variadic-functions" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" 
"source1-hip-amdgcn-amd-amdhsa-gfx900.bc" "-x" "hip-cpp-output" "source1-hip-amdgcn-amd-amdhsa-gfx900.hipi"
     Driver Job Args: clang "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu" "-S" "-dumpdir" "/tmp/comgr-197228/output/source1.hip.so-" "-save-temps=/tmp/comgr-197228/output" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "source1.hip" "-mrelocation-model" "pic" "-pic-level" "2" "-fhalf-no-semantic-interposition" "-mframe-pointer=all" "-fno-rounding-math" "-mconstructor-aliases" "-aux-target-cpu" "x86-64" "-fcuda-is-device" "-mllvm" "-amdgpu-internalize-symbols" "-fcuda-allow-variadic-functions" "-fvisibility=hidden" "-fapply-global-visibility-to-externs" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/hip.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/ocml.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/ockl.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_finite_only_off.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_isa_version_900.bc" "-mlink-builtin-bitcode" "/tmp/comgr-197228/rocm/amdgcn/bitcode/oclc_abi_version_400.bc" "-target-cpu" "gfx900" "-debugger-tuning=gdb" "-resource-dir" "/opt/gentoo/usr/lib/llvm/17/bin/../../../../lib/clang/17" "-std=c++11" "-fno-autolink" "-fdebug-compilation-dir=/home/wuyy" "-ferror-limit" "19" "-fhip-new-launch-api" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-cuid=eb894c2ff96f62a8" "-fcuda-allow-variadic-functions" "-faddrsig" "-o" "source1-hip-amdgcn-amd-amdhsa-gfx900.s" "-x" "ir" "source1-hip-amdgcn-amd-amdhsa-gfx900.bc"
     Driver Job Args: clang "-cc1as" "-triple" "amdgcn-amd-amdhsa" "-filetype" "obj" "-main-file-name" "source1.hip" "-target-cpu" "gfx900" "-I" "/tmp/comgr-197228/include" "-fdebug-compilation-dir=/home/wuyy" "-dwarf-version=5" "-mrelocation-model" "pic" "-mrelax-all" "-o" "source1-hip-amdgcn-amd-amdhsa-gfx900.o" "source1-hip-amdgcn-amd-amdhsa-gfx900.s"
        ReturnStatus: AMD_COMGR_STATUS_ERROR

FAILED: amd_comgr_do_action
 REASON: ERROR

Operating System

Gentoo Prefix on upstream Linux kernel 6.6.13

CPU

AMD Ryzen 7 7700

GPU

AMD Radeon RX 7900 XT

ROCm Version

ROCm 6.1.0

ROCm Component

clr

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

AngryLoki commented 1 month ago

Hi, I encountered this error too and gathered some extra information:

You can launch the test with AMD_COMGR_SAVE_TEMPS=1, which keeps the temporary directory:

HSAKMT_DEBUG_LEVEL=7 AMD_LOG_LEVEL=1 AMD_COMGR_EMIT_VERBOSE_LOGS=1 AMD_COMGR_REDIRECT_LOGS=stdout AMD_COMGR_SAVE_TEMPS=1 /var/tmp/portage/dev-libs/rocm-comgr-6.1.0/work/llvm-project-rocm-6.1.0/amd/comgr_build/test/compile_hip_test_in_process

It outputs a command which, after removing the quotes, looks like this:

clang --offload-arch=gfx906 -I /tmp/comgr-c9b0f7/include -x hip -std=c++11 -target x86_64-unknown-linux-gnu --cuda-device-only -isystem /usr/lib/clang/18 -c -emit-llvm --rocm-path=/tmp/comgr-c9b0f7/rocm  -save-temps=/tmp/comgr-c9b0f7/output /tmp/comgr-c9b0f7/input/source2.hip -o /tmp/comgr-c9b0f7/output/source2.hip.bc

Which fails with:

/usr/lib/llvm/18/bin/../../../../lib/clang/18/include/__clang_cuda_complex_builtins.h:227:29: error: use of undeclared identifier 'max'; did you mean 'fmax'?

Now a small trick: replace clang with HIPCC_VERBOSE=1 hipcc and it works. Why? By stripping the resulting hipcc command down flag by flag, you can see that the important flag is --hip-version=6.1.1.

Environment:

$ hipcc --version
HIP version: 6.1.40092-
clang version 18.1.5
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/18/bin
Configuration file: /etc/clang/x86_64-pc-linux-gnu-clang++.cfg

$ clang --version
clang version 18.1.5
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/18/bin
Configuration file: /etc/clang/x86_64-pc-linux-gnu-clang.cfg

$ strace -e trace=open,openat,access,stat,statx,lstat -f clang -v 2>&1 | grep hipVersion
openat(AT_FDCWD, "/usr/lib/llvm/18/bin/.hipVersion", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/llvm/18/bin/.hipVersion", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/llvm/18/bin/../../../../lib/clang/18/bin/.hipVersion", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

So a possible reason (or part of it) is that clang cannot find .hipVersion. There is also a known issue, https://github.com/llvm/llvm-project/issues/78344, regarding path autodetection:

$ clang -v -print-targets -print-rocm-search-dirs 2>&1 | grep -e "HIP"
Found HIP installation: /usr/local, version 6.1.40092
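The probe locations from the strace output above can be re-checked without strace. The following is only a sketch: `dirs_with_hip_version` is a hypothetical helper, not part of clang or ROCm, and it only tests whether a `.hipVersion` file exists in each given directory. It says nothing about how clang parses that file or which other directories it may search.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns those candidate directories that actually
// contain a regular file named ".hipVersion".
std::vector<std::string> dirs_with_hip_version(
    const std::vector<std::string>& dirs) {
  std::vector<std::string> found;
  for (const auto& dir : dirs) {
    std::error_code ec;  // non-throwing overload: a missing path is not an error
    if (fs::is_regular_file(fs::path(dir) / ".hipVersion", ec)) {
      found.push_back(dir);
    }
  }
  return found;
}
```

Called with the two directories from the strace output (`/usr/lib/llvm/18/bin` and `/usr/lib/llvm/18/bin/../../../../lib/clang/18/bin`), an empty result would match the ENOENT lines shown there.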

Adding --hip-version=6.1.1 is equivalent to adding -include __clang_hip_runtime_wrapper.h, which contains #include <__clang_hip_math.h> (among many other auto-included headers), which in turn contains the max template.

AngryLoki commented 1 month ago

compile_source_to_executable still failed with no clear reason

That's a separate issue. If you add logArgv(LogS, "???", Argv); before https://github.com/ROCm/llvm-project/blob/rocm-6.1.1/amd/comgr/src/comgr-compiler.cpp#L757 you will see that it fails when it tries to run a job in executeInProcessDriver with the arguments of clang-offload-bundler. I don't know why! But it can be worked around by disabling in-process compilation, as shown in https://github.com/littlewu2508/gentoo/pull/3/files#diff-2c3851549d30124c76649584d8dddfa3ef07e522aafaf97ab4c49947509ea134

So disabling in-process compilation means that hipcc is called, which fixes both issues. But the two issues are separate.