KhronosGroup / SPIRV-LLVM-Translator

A tool and a library for bi-directional translation between SPIR-V and LLVM IR

InvalidInstruction: Can't translate llvm instruction with LLVM 18 (working with 17) #2531

Closed davidrohr closed 4 months ago

davidrohr commented 4 months ago

The attached testcase fails in the spirv translation step with the following error:

InvalidInstruction: Can't translate llvm instruction:
 Global variable cannot have Function storage class. Consider setting a proper address space.
 Original LLVM value:
@constinit = private constant [5 x float] [float 0x3FC8B7A880000000, float 0x4011115E40000000, float 0x3F7567A360000000, float 0x40030CCC40000000, float 0x3FEF60B2C0000000], align 4

Steps to reproduce:

clang-18 -O0 -emit-llvm --target=spir64-unknown-unknown -ferror-limit=1000 -Dcl_clang_storage_class_specifiers -Wno-invalid-constexpr -Wno-unused-command-line-argument -cl-std=CLC++2021 -Xclang -fdenormal-fp-math-f32=ieee -cl-mad-enable -cl-no-signed-zeros -c testcase.cl -o testcase.bc
llvm-spirv testcase.bc -o testcase.spirv

This is using llvm/clang 18.1, and the latest 18-branch hash of the translator 259f72c06ce9dff3867f842aaeb1e414c97066a5. The same file compiles fine with llvm/clang 17 and the corresponding version of the translator.

I am not sure if this is a bug in the translator or in our code. But unfortunately, from the error message I cannot understand which variable is reported to have the wrong address space.

MrSidims commented 4 months ago

> This is using llvm/clang 18.1

To clarify, are you compiling OpenCL code or something else? For plain C/C++, clang would not emit an address space for a global variable (GV). UPD: ah, I missed testcase.cl in the reproducer. It's a bit odd that the OpenCL compiler hasn't emitted a Constant or Global address space in your case.

The discussion in https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/2267 would be helpful here, but basically we have to require an address space for GVs; otherwise a GV without one would be treated as living in the OpenCL private AS, which doesn't follow the OpenCL/SPIR/SPIR-V specifications. The error was added in https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/2254
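To make the rule above concrete, here is a minimal sketch of why a module-scope variable without an address space trips the error. It is a simplified model, not the translator's source: the enum and function names are hypothetical. Only the SPIR address-space numbering (0 = private, 1 = global, 2 = constant, 3 = local, 4 = generic), the SPIR-V storage class names, and the error text are taken as given.

```cpp
#include <optional>
#include <string>

// SPIR target address-space numbers as they appear in LLVM IR for OpenCL.
enum class SpirAddrSpace : unsigned {
  Private = 0, Global = 1, Constant = 2, Local = 3, Generic = 4
};

// SPIR-V storage classes relevant to OpenCL (names from the SPIR-V spec).
enum class StorageClass {
  Function, CrossWorkgroup, UniformConstant, Workgroup, Generic
};

// Address space -> storage class. Address space 0 (also the default when
// none is written in the IR) ends up as Function.
StorageClass toStorageClass(SpirAddrSpace as) {
  switch (as) {
    case SpirAddrSpace::Private:  return StorageClass::Function;
    case SpirAddrSpace::Global:   return StorageClass::CrossWorkgroup;
    case SpirAddrSpace::Constant: return StorageClass::UniformConstant;
    case SpirAddrSpace::Local:    return StorageClass::Workgroup;
    case SpirAddrSpace::Generic:  return StorageClass::Generic;
  }
  return StorageClass::Function;  // not reached for valid enum values
}

// Hypothetical check modelled on the error message: a module-scope
// variable must not end up with the Function storage class.
std::optional<std::string> checkGlobalVariable(SpirAddrSpace as) {
  if (toStorageClass(as) == StorageClass::Function)
    return "Global variable cannot have Function storage class. "
           "Consider setting a proper address space.";
  return std::nullopt;
}
```

In this model, `@constinit = private constant ...` carries no `addrspace(N)` qualifier, so it sits in address space 0 and `checkGlobalVariable(SpirAddrSpace::Private)` reports the error, while the Constant or Global address spaces pass.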

svenvh commented 4 months ago

Looking at the reported variable, do you have any float arrays (of 5 elements) that are initialized to some non-zero values inside a function? Perhaps clang generates code that violates the idea behind #2254.

davidrohr commented 4 months ago

Thx for the fast reply! Yes, it is OpenCL code, CLC++2021 to be precise. Ok, I see, that is why the error pops up now. Is there any way I can find out which line in the .cl source file is causing this, or at least what the name of the variable is?

davidrohr commented 4 months ago

Hm, we have

const float oldDiag[5] = {mC[0], mC[2], mC[5], mC[9], mC[14]};

in a function, and we have

float mBetheBlochParams[5] = {0.19310481, 4.26696118, 0.00522579, 2.38124907, 0.98055396};

inside a class definition using C++11 initialization. Could this be the problem?


svenvh commented 4 months ago

> in a function, and we have
>
> float mBetheBlochParams[5] = {0.19310481, 4.26696118, 0.00522579, 2.38124907, 0.98055396};
>
> inside a class definition using C++11 initialization. Could this be the problem?

The values seem to match indeed; does rewriting this initialization (e.g. by individual assignments or initializing to zero) avoid the crash?
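For anyone who wants to do the same matching, here is a small sketch of how the hex constants in the error message can be compared with source literals. LLVM IR prints a `float` constant as the 64-bit hex pattern of the equivalent double, so reinterpreting the bits recovers the value. The helper names below are hypothetical, and the tolerance is an assumption chosen to cover single-precision rounding of values of this magnitude.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the 64-bit pattern printed in the IR (e.g. 0x3FC8B7A880000000)
// as a double to recover the constant's value.
double decodeLlvmHexFloat(std::uint64_t bits) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "this sketch expects a 64-bit IEEE double");
  double value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}

// True if the decoded constant agrees with a source literal up to
// single-precision rounding (tolerance is an assumption of this sketch).
bool matchesLiteral(std::uint64_t bits, double literal, double tol = 1e-6) {
  const double diff = decodeLlvmHexFloat(bits) - literal;
  return diff < tol && diff > -tol;
}
```

For example, `matchesLiteral(0x3FC8B7A880000000ULL, 0.19310481)` holds for the first element of the reported array and the first literal of `mBetheBlochParams`.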

davidrohr commented 4 months ago

I changed the class initialization to use a constructor. This fixes the problem.
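A rough sketch of what such a rewrite can look like, for illustration only: the thread does not show the actual change, the struct names are hypothetical, and only the array name and values come from the report. The `f` suffixes are added here. Whether a particular form avoids the error depends on the IR clang emits for it.

```cpp
// Pattern from the report: a C++11 default member initializer.
struct ParamsWithMemberInit {
  float mBetheBlochParams[5] = {0.19310481f, 4.26696118f, 0.00522579f,
                                2.38124907f, 0.98055396f};
};

// One possible constructor-based form: assign each element individually
// in the constructor body instead of using an aggregate initializer.
struct ParamsWithConstructor {
  float mBetheBlochParams[5];

  ParamsWithConstructor() {
    mBetheBlochParams[0] = 0.19310481f;
    mBetheBlochParams[1] = 4.26696118f;
    mBetheBlochParams[2] = 0.00522579f;
    mBetheBlochParams[3] = 2.38124907f;
    mBetheBlochParams[4] = 0.98055396f;
  }
};
```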

However, now I am getting a similar problem elsewhere:

InvalidInstruction: Can't translate llvm instruction:
 Global variable cannot have Function storage class. Consider setting a proper address space.
 Original LLVM value:
@constinit = private constant [2 x [3 x [4 x float]]] [[3 x [4 x float]] [[4 x float] [float 0x3FA5607A20000000, float 0x3F28979AE0000000, float 0x3FACDDB100000000, float 0x3FE13A5BA0000000], [4 x float] [float 0x3FB53BC900000000, float 0x3F2AA55680000000, float 0x3FB1728860000000, float 0x3FEF1225E0000000], [4 x float] [float 0x3FB6358880000000, float 0x3F2B9F09A0000000, float 0x3FC1B60240000000, float 0x3FD05362C0000000]], [3 x [4 x float]] [[4 x float] [float 0x3FAE873A80000000, float 0x3F169EBBC0000000, float 0x3FA285E020000000, float 0x3FDEB379C0000000], [4 x float] [float 0x3FAF5D19A0000000, float 0x3F12FA3520000000, float 0x3FA41FE2A0000000, float 0x3FEDBC3100000000], [4 x float] [float 0x3FB0DB5280000000, float 0x3F1B2B22E0000000, float 0x3FAF1BB7A0000000, float 0x3FEFB073A0000000]]], align 4

Probably another C++11 initialization in a class definition. Perhaps this is a general problem in clang 18?

svenvh commented 4 months ago

@davidrohr I've created https://github.com/llvm/llvm-project/pull/90048 to hopefully fix this pattern in clang (targeting clang's main branch). Any chance you could try if that solves the member initializer issues for your use case?

davidrohr commented 4 months ago

@svenvh: Thx! I tried with that PR and it fixes this problem!