intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

[SYCL][CUDA] Build failed with fatal error "error in backend: Undefined external symbol "exp"" #7386

Open Soujanyajanga opened 1 year ago

Soujanyajanga commented 1 year ago

Description of the bug: SYCL compilation fails with the error below for the QUDA "feature-sycl" branch (QUDA-SYCL).

fatal error: error in backend: Undefined external symbol "exp"
llvm-foreach: clang-15: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 0c7a1e18978754451f5c2c95129721297e2c2411)
Target: x86_64-unknown-linux-gnu
Thread model: posix

To Reproduce

  1. git clone QUDA
  2. cd quda
  3. git checkout feature/sycl
  4. Append "-fsycl -fsycl-targets=nvptx64-nvidia-cuda" to CMAKE_CXX_FLAGS_xxx and CMAKE_C_FLAGS_xxx
  5. mkdir build && cd build
  6. cmake ../ -DQUDA_TARGET_TYPE=SYCL -DCMAKE_CXX_COMPILER=clang++
  7. The build fails with the fatal error shown above

Environment:

Additional context Compilation is successful with "DPC++" compiler

bader commented 1 year ago

clang version 15.0.0 (https://github.com/intel/llvm.git 0c7a1e1)

This version is 5 months old. Could you check if the issue still exists in the recent version, please?

Additional context Compilation is successful with "DPC++" compiler

Could you clarify what "DPC++ compiler" means here? Do you mean dpcpp driver from Intel's oneAPI toolchain?

Soujanyajanga commented 1 year ago

Could you clarify what "DPC++ compiler" means here? Do you mean dpcpp driver from Intel's oneAPI toolchain?

Yes, it's "dpcpp" from Intel's oneAPI toolchain.

zjin-lcf commented 1 year ago

@Soujanyajanga Is it sycl::exp() ?

Soujanyajanga commented 1 year ago

@Soujanyajanga Is it sycl::exp() ?

Yes. Changing the call in the community Git code to sycl::exp() fixes the compilation issue.

Soujanyajanga commented 1 year ago

clang version 15.0.0 (https://github.com/intel/llvm.git 0c7a1e1)

This version is 5 months old. Could you check if the issue still exists in the recent version, please?

With the latest clang version, the build fails with "uses too much shared data (0x18000 bytes, 0xc000 max)", as pasted below:

ptxas error : Entry function '_ZTSZZN4quda6launchINS_9Kernel3DSINS_11mobius_eofa12eofa_dslash5ENS2_10Dslash5ArgIdLi3ELb1ELb1ELb1ELNS_11Dslash5TypeE10EEELb0EEES6_EENSt9enable_ifIXclsr6deviceE14use_kernel_argIT0_EEE11qudaError_tE4typeERKNS_12qudaStream_tERN4sycl3_V18nd_rangeILi3EEERKS9_ENKUlRNSH_7handlerEE_clESO_EUlNSH_7nditemILi3EEEE' uses too much shared data (0x18000 bytes, 0xc000 max)
llvm-foreach: clang-16: error: ptxas command failed with exit code 255 (use -v to see invocation)
clang version 16.0.0
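For reference, the two hexadecimal sizes in the ptxas message above can be decoded with a small C++ sketch. Only the two byte counts come from the log; the constant and helper names are illustrative and are not taken from QUDA or the compiler.

```cpp
#include <cstdint>

// Byte counts copied from the ptxas message in this thread.
constexpr std::uint64_t kRequestedBytes = 0x18000; // shared data the kernel requests
constexpr std::uint64_t kLimitBytes     = 0xc000;  // maximum reported by ptxas

// Convert a byte count to whole KiB (1 KiB = 1024 bytes).
// Hypothetical helper, for illustration only.
constexpr std::uint64_t to_kib(std::uint64_t bytes) { return bytes / 1024; }

// Integer ratio of the request to the limit.
// Hypothetical helper, for illustration only.
constexpr std::uint64_t overshoot_factor(std::uint64_t requested,
                                         std::uint64_t limit) {
  return requested / limit;
}

// Compile-time checks of the arithmetic.
static_assert(to_kib(kRequestedBytes) == 96, "0x18000 bytes is 96 KiB");
static_assert(to_kib(kLimitBytes) == 48, "0xc000 bytes is 48 KiB");
static_assert(overshoot_factor(kRequestedBytes, kLimitBytes) == 2,
              "the request is twice the reported limit");
```

In other words, the kernel asks for 96 KiB of shared data while ptxas reports a 48 KiB maximum, so the request is twice the limit.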

zjin-lcf commented 1 year ago

Is the same amount of shared local memory allocated in your CUDA and SYCL programs?

zjin-lcf commented 1 year ago

Did you see the following error?

[  1%] Building CXX object lib/CMakeFiles/quda_cpp.dir/eig_iram.cpp.o
[  1%] Building CXX object lib/CMakeFiles/quda_cpp.dir/eig_trlm.cpp.o
[  1%] Building CXX object lib/CMakeFiles/quda_cpp.dir/dirac_coarse.cpp.o
[  2%] Building CXX object lib/CMakeFiles/quda_cpp.dir/eig_block_trlm.cpp.o
clang-15: error: clang-15: unknown argument: '-fhonor-nan-compares'
error: unknown argument: '-fhonor-nan-compares'
clang-15: error: unknown argument: '-fhonor-nan-compares'
clang-15: error: unknown argument: '-fhonor-nan-compares'

Soujanyajanga commented 1 year ago

Did you see the following error?

Yes, with the latest version of the code, the above issue is observed.

Soujanyajanga commented 1 year ago

This error comes from the file "quda/lib/targets/sycl/target_sycl.cmake". The workaround is to comment out line 106, because clang does not support this flag:

102 if("x${CMAKE_CXX_COMPILER_ID}" STREQUAL "xClang" OR
103 "x${CMAKE_CXX_COMPILER_ID}" STREQUAL "xIntelLLVM")
104 #target_compile_options(quda INTERFACE -fhonor-nan-compares)
105 #target_compile_options(quda PRIVATE -fhonor-nan-compares)
106 target_compile_options(quda PUBLIC -fhonor-nan-compares)   <-- clang does not support this flag
107 target_compile_options(quda PUBLIC -Wno-tautological-constant-compare)

zjin-lcf commented 1 year ago

Thanks for highlighting the line.