KhronosGroup / SPIRV-LLVM-Translator

A tool and a library for bi-directional translation between SPIR-V and LLVM IR

Evaluate support of optimised IR translation #203

Open AnastasiaStulova opened 5 years ago

AnastasiaStulova commented 5 years ago

The translator's current use case assumes that non-optimised IR is given as input.

A number of questions that were asked recently are as follows:

Naghasan commented 5 years ago

Do we have any idea what issues we have with optimised IR? Would it be worth running some tests and collecting the issues we find?

One common issue we have with translation of optimized LLVM IR is type narrowing. Some passes narrow types when they can prove the operation can be performed with a smaller type. This can cause conflicts, as OpenCL SPIR-V only allows i8/i16/i32/i64. Common occurrences in our tests are narrowing to i31 or to i1 (a bool treated as an integer). From what I know, this is mainly due to instcombine and the switch statement narrowing pass.

Would support for this be easier to implement in an LLVM backend rather than as a translation format?

Dealing with optimized IR requires a type legalizer to run. LLVM has infrastructure to deal with this, as it impacts all backends, but I'm not sure if it can be reused outside the CodeGen infrastructure.
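
To make the legalization idea concrete, here is a minimal sketch of the "promote to the next legal width" step that a type legalizer would perform for narrowed integers such as i31. It is an illustration only, not the translator's or LLVM's implementation; the names `LEGAL_INT_BITS` and `promote_int_width` are hypothetical, and the set of legal widths is taken from the i8/i16/i32/i64 restriction mentioned above.

```python
# Hypothetical helper, for illustration only: not part of LLVM or the translator.
# Legal integer widths are assumed to be the i8/i16/i32/i64 set named in this thread.
LEGAL_INT_BITS = (8, 16, 32, 64)


def promote_int_width(bits):
    """Return the smallest legal width that can hold an integer of `bits` bits.

    Returns None when the value is wider than every legal width; such a type
    cannot be promoted and would have to be split into several legal pieces,
    which this sketch does not attempt.
    """
    if bits < 1:
        raise ValueError("bit width must be positive")
    for legal in LEGAL_INT_BITS:
        if bits <= legal:
            return legal
    return None


if __name__ == "__main__":
    for width in (1, 31, 32, 96):
        print(f"i{width} -> {promote_int_width(width)}")
```

Promotion only picks the type; a real legalizer would also have to rewrite every operation on the value (for example, masking or sign-extending after arithmetic) so that results match the narrower type's semantics.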

Is this a valuable use case?

To me, offline optimization is a valuable use case: the fact that SPIR-V is the only input your driver can consume does not mean you want to prevent all optimizations. Also, drivers are usually time-constrained, while offline compilers are far less so.

It also allows users to work around driver bugs.

gfxstrand commented 4 years ago

Here's a very real example where it fails:

struct S {
    char i8_3[3];
};

kernel void test(global struct S *p, float3 v)
{
   int3 tmp;
   frexp(v, &tmp);
   tmp += 1;
   p->i8_3[0] = tmp.x;
   p->i8_3[1] = tmp.y;
   p->i8_3[2] = tmp.z;
}

This example works fine with LLVM 10, but with LLVM trunk (as of Aug 14, 2020) it generates this:

  %11 = shufflevector <3 x i32> %10, <3 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
  %12 = bitcast <4 x i32> %11 to i128
  %13 = trunc i128 %12 to i96
  %14 = bitcast i96 %13 to <12 x i8>
  %15 = extractelement <12 x i8> %14, i32 0

This uses i128, i96, and <12 x i8>, none of which are valid types in SPIR-V. I'm not sure what pass causes it.
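
A quick way to spot such types before translation is to scan the textual IR. The sketch below is a rough illustration under stated assumptions: it uses plain regular expressions rather than a real IR parser, the function name `find_unsupported_types` is hypothetical, legal integer widths are assumed to be 8/16/32/64 as stated earlier in this thread, and legal vector lengths are assumed to be the OpenCL C vector sizes 2, 3, 4, 8 and 16. It ignores i1, floating-point types, pointers and aggregates.

```python
import re

# Assumptions for illustration: widths from this thread, lengths from OpenCL C vector types.
LEGAL_INT_BITS = {8, 16, 32, 64}
LEGAL_VECTOR_LENGTHS = {2, 3, 4, 8, 16}

# The IR fragment quoted above.
SAMPLE_IR = """
  %11 = shufflevector <3 x i32> %10, <3 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
  %12 = bitcast <4 x i32> %11 to i128
  %13 = trunc i128 %12 to i96
  %14 = bitcast i96 %13 to <12 x i8>
  %15 = extractelement <12 x i8> %14, i32 0
"""


def find_unsupported_types(ir_text):
    """Return the set of integer and integer-vector type strings that look unsupported."""
    unsupported = set()
    # Integer vectors: flag a vector length outside the assumed legal set.
    for match in re.finditer(r"<(\d+) x i(\d+)>", ir_text):
        if int(match.group(1)) not in LEGAL_VECTOR_LENGTHS:
            unsupported.add(match.group(0))
    # Integer types, including vector element types: flag an illegal bit width.
    for match in re.finditer(r"\bi(\d+)\b", ir_text):
        if int(match.group(1)) not in LEGAL_INT_BITS:
            unsupported.add(match.group(0))
    return unsupported


if __name__ == "__main__":
    print(sorted(find_unsupported_types(SAMPLE_IR)))
```

Run on the fragment above, the scan reports the same three types called out in this comment. A text scan like this can only detect the problem; it does not say which pass introduced the types.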

bader commented 4 years ago

This uses i128, i96, and <12 x i8>, none of which are valid types in SPIR-V. I'm not sure what pass causes it.

@jekstrand, there is a solution for the integer scalars issue, which I described in https://github.com/KhronosGroup/SPIRV-LLVM-Translator/issues/481. I am going to contribute it to LLVM trunk soon.

I also filed another issue for unsupported vector sizes (https://github.com/KhronosGroup/SPIRV-LLVM-Translator/issues/645), but unfortunately I don't know a good solution for that problem other than disabling "canonicalization" for the SPIR target (https://github.com/intel/llvm/pull/2143/files). We should probably discuss the proper solution with LLVM developers.

MrSidims commented 6 months ago

At this point the translator supports translation of optimized LLVM IR. Not every vector and integer type is supported, but this is now controlled by the datalayout that clang generates for the spir target. Unsupported intrinsics still appear from time to time, but these cases are fixed as they come up. Should we close this discussion?