KhronosGroup / SPIRV-LLVM-Translator

A tool and a library for bi-directional translation between SPIR-V and LLVM IR

OpenCL (CLC++2021) compilation to SPIR-V fails #2193

Open davidrohr opened 11 months ago

davidrohr commented 11 months ago

I opened this issue as an LLVM issue first, but LLVM experts indicated the problem is with the SPIRV-LLVM-Translator: https://github.com/llvm/llvm-project/issues/68305 Thus I am opening it also here:

Compilation with clang 17.0.2 (on Gentoo Linux) fails with the error message below. I was using clang 15 before, which didn't fail. I have now also tried clang 16, which fails with the same error. I didn't try other versions.

I am attaching a tarball with my .cl file, and with the 2 files in /tmp that clang asked me to attach to the bug report: bugreport.tar.gz

Command to reproduce:

clang-17 -O0 --target=spirv64 -ferror-limit=1000 -Dcl_clang_storage_class_specifiers -Wno-invalid-constexpr -Wno-unused-command-line-argument -cl-std=CLC++2021 -Xclang -fdenormal-fp-math-f32=ieee -cl-mad-enable -cl-no-signed-zeros -c foo.cl -o foo.spirv

Error message:

llvm-spirv: /usr/lib/llvm/17/include/llvm/ADT/SmallVector.h:294: T& llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::operator[](size_type) [with T = llvm::Type*; <template-parameter-1-2> = void; reference = llvm::Type*&; size_type = long unsigned int]: Assertion `idx < size()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /usr/lib/llvm/17/bin/llvm-spirv /tmp/foo-21cdb2.bc -o /home/qon/foo.spirv
 #0 0x00007f161a086fae llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm/17/lib64/libLLVM-17.so+0xc86fae)
 #1 0x00007f161a084c44 llvm::sys::RunSignalHandlers() (/usr/lib/llvm/17/lib64/libLLVM-17.so+0xc84c44)
 #2 0x00007f161a084db6 (/usr/lib/llvm/17/lib64/libLLVM-17.so+0xc84db6)
 #3 0x00007f1618e641b0 (/lib64/libc.so.6+0x391b0)
 #4 0x00007f1618eb208c (/lib64/libc.so.6+0x8708c)
 #5 0x00007f1618e64112 gsignal (/lib64/libc.so.6+0x39112)
 #6 0x00007f1618e4d4f2 abort (/lib64/libc.so.6+0x224f2)
 #7 0x00007f1618e4d415 (/lib64/libc.so.6+0x22415)
 #8 0x00007f1618e5cd32 (/lib64/libc.so.6+0x31d32)
 #9 0x00007f16213da2bf SPIRV::BuiltinCallMutator::doConversion() (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1da2bf)
#10 0x00007f16213a03ef (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a03ef)
#11 0x00007f16213a90fb SPIRV::OCLToSPIRVBase::transBuiltin(llvm::CallInst*, OCLUtil::OCLBuiltinTransInfo&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a90fb)
#12 0x00007f16213aa280 SPIRV::OCLToSPIRVBase::visitCallBuiltinSimple(llvm::CallInst*, llvm::StringRef, llvm::StringRef) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1aa280)
#13 0x00007f16213b189c SPIRV::OCLToSPIRVBase::visitCallInst(llvm::CallInst&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1b189c)
#14 0x00007f16213a0841 SPIRV::OCLToSPIRVBase::runOCLToSPIRV(llvm::Module&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a0841)
#15 0x00007f16213a0b6e SPIRV::OCLToSPIRVPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x1a0b6e)
#16 0x00007f16214b021d (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x2b021d)
#17 0x00007f16214f4f27 (/usr/lib/llvm/17/lib64/libLLVMSPIRVLib.so.17+0x2f4f27)
#18 0x000055b379753efd (/usr/lib/llvm/17/bin/llvm-spirv+0x13efd)
#19 0x000055b37974d0c2 (/usr/lib/llvm/17/bin/llvm-spirv+0xd0c2)
#20 0x00007f1618e4eb8a (/lib64/libc.so.6+0x23b8a)
#21 0x00007f1618e4ec45 __libc_start_main (/lib64/libc.so.6+0x23c45)
#22 0x000055b37974d471 (/usr/lib/llvm/17/bin/llvm-spirv+0xd471)
clang-17: error: unable to execute command: Aborted
clang-17: error: llvm-spirv command failed due to signal (use -v to see invocation)
clang version 17.0.2
Target: spirv64
Thread model: posix
InstalledDir: /usr/lib/llvm/17/bin
clang-17: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-17: note: diagnostic msg: /tmp/foo-f70c1d.cl
clang-17: note: diagnostic msg: /tmp/foo-f70c1d.sh
clang-17: note: diagnostic msg: 

********************
davidrohr commented 6 months ago

For reference, I tried the same with clang 18.1 and SPIRV-LLVM-Translator 18.1, and I am still getting the same error.

MrSidims commented 6 months ago

@davidrohr thanks for the report and apologies for the late response. I have a feeling that this issue is related to a (quite funny) bug in the handling of "convert" functions that was recently fixed in https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/2443 . May I ask you to check whether your example works on the main branch? If so, @vmaksimo please backport your patch(es) to the release branches.

davidrohr commented 6 months ago

Dear @MrSidims : I have just tried with the llvm_release_180 branch (commit 1745c78f037645111593d0de527abd34d6885d32) with #2443 cherry-picked, and it still fails.

The testcase is attached. The error message is:

 #0 0x00007f6865fd17fe llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xdd17fe)
 #1 0x00007f6865fcef34 llvm::sys::RunSignalHandlers() (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xdcef34)
 #2 0x00007f6865fcf326 (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xdcf326)
 #3 0x00007f6864c641b0 (/lib64/libc.so.6+0x391b0)
 #4 0x00007f6864cb208c (/lib64/libc.so.6+0x8708c)
 #5 0x00007f6864c64112 gsignal (/lib64/libc.so.6+0x39112)
 #6 0x00007f6864c4d4f2 abort (/lib64/libc.so.6+0x224f2)
 #7 0x00007f6864c4d415 (/lib64/libc.so.6+0x22415)
 #8 0x00007f6864c5cd32 (/lib64/libc.so.6+0x31d32)
 #9 0x00007f686d3f039f SPIRV::BuiltinCallMutator::doConversion() (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1f039f)
#10 0x00007f686d3a31ef (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a31ef)
#11 0x00007f686d3a487d SPIRV::OCLToSPIRVBase::transBuiltin(llvm::CallInst*, OCLUtil::OCLBuiltinTransInfo&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a487d)
#12 0x00007f686d3a537e SPIRV::OCLToSPIRVBase::visitCallBuiltinSimple(llvm::CallInst*, llvm::StringRef, llvm::StringRef) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a537e)
#13 0x00007f686d3c8d0a SPIRV::OCLToSPIRVBase::visitCallInst(llvm::CallInst&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1c8d0a)
#14 0x00007f686d3a3e78 SPIRV::OCLToSPIRVBase::runOCLToSPIRV(llvm::Module&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a3e78)
#15 0x00007f686d3a40ee SPIRV::OCLToSPIRVPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x1a40ee)
#16 0x00007f686d4d2a0d (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x2d2a0d)
#17 0x00007f686617495f llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm/18/lib64/libLLVM.so.18.1+0xf7495f)
#18 0x00007f686d51415a (/usr/lib/llvm/18/lib64/libLLVMSPIRVLib.so.18.1+0x31415a)
#19 0x0000558f4efae602 (/usr/lib/llvm/18/bin/llvm-spirv+0x13602)
#20 0x0000558f4efa8c2c (/usr/lib/llvm/18/bin/llvm-spirv+0xdc2c)
#21 0x00007f6864c4eb8a (/lib64/libc.so.6+0x23b8a)
#22 0x00007f6864c4ec45 __libc_start_main (/lib64/libc.so.6+0x23c45)
#23 0x0000558f4efa93b1 (/usr/lib/llvm/18/bin/llvm-spirv+0xe3b1)
clang: error: unable to execute command: Aborted
clang: error: llvm-spirv command failed due to signal (use -v to see invocation)

testcase.tar.gz

vmaksimo commented 6 months ago

@davidrohr could you please also try this fix together with the one mentioned above? https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/2464 If it doesn't help, I would ask you to attach the *.bc input file for the translator so that we can take a closer look at the bug. Thanks!

davidrohr commented 6 months ago

@vmaksimo : I just tried with #2464 in addition. Still fails in the same way as before.

I am attaching the clang testcase with the sources again: testcase.tar.gz

And here is the temporary .bc file I obtain from clang using -emit-llvm --target=spir64-unknown-unknown instead of --target=spirv64: bctestcase.tar.gz

davidrohr commented 5 months ago

@vmaksimo : Any progress on this? Sorry for bugging you, but I am wondering what the timescale for a fix might be.

vmaksimo commented 5 months ago

Hi @davidrohr! I was able to reproduce the issue and found out that the problem is in the translation of a printf call. Unfortunately, there is no further progress yet. Any chance you could avoid printf calls in your app as a hotfix? The timescale is ~3-4 weeks unless someone else takes a look earlier (I'll be on vacation for the next ~2 weeks).
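
For anyone hitting the same crash, the hotfix of keeping printf out of the device build can be sketched roughly as below. This is only an illustration under assumptions: `clamp_index` is a hypothetical helper that is not taken from the attached testcase, and the guard relies on the `__OPENCL_C_VERSION__` / `__OPENCL_CPP_VERSION__` macros that the OpenCL C and C++ for OpenCL language specifications predefine. Whether this matches how the real code base separates host and GPU code is not known from this thread.

```c
/* Sketch of a printf guard for code shared between host and OpenCL device
 * builds. All names here are hypothetical and only illustrate the idea. */
#if defined(__OPENCL_C_VERSION__) || defined(__OPENCL_CPP_VERSION__)
/* Device build: no <stdio.h>, and no printf call reaches llvm-spirv. */
#define HAVE_DEBUG_PRINTF 0
#else
/* Host build: regular C library printf is available. */
#include <stdio.h>
#define HAVE_DEBUG_PRINTF 1
#endif

/* Clamp idx into [0, size - 1]; report out-of-range values on the host only. */
int clamp_index(int idx, int size)
{
    if (idx < 0 || idx >= size) {
#if HAVE_DEBUG_PRINTF
        printf("index %d out of range [0, %d)\n", idx, size);
#endif
        return idx < 0 ? 0 : size - 1;
    }
    return idx;
}
```

Compiled as OpenCL source, the function body contains no printf call at all, so the translator never sees one; compiled as plain C on the host, the diagnostic is kept.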

davidrohr commented 5 months ago

Thx a lot. That was very helpful. Indeed that was a bug on our side. The printf should not have been there in the GPU version. (Although I understand that it should be supported in principle, so would be good to fix it :)).

I fixed it on our side, but now I am running into a different problem, which I have reported here https://github.com/KhronosGroup/SPIRV-LLVM-Translator/issues/2531 :(.

davidrohr commented 5 months ago

For reference, after the fix mentioned in #2531 and with the printf avoided, the code now compiles to SPIR-V. I am leaving this open until the printf problem is fixed as well.