KhronosGroup / SPIR

[[cl::unroll_hint]] and [[cl::ivdep]] hints are not passed to SPIR-V #62

Open jszuppe opened 7 years ago

jszuppe commented 7 years ago

If I recall correctly, no document specifies whether an OpenCL C++-to-SPIR-V compiler must always pass the [[cl::unroll_hint]] and [[cl::ivdep]] (ignore vector dependencies) hints through to SPIR-V, or whether it may decide on its own not to unroll the loop and drop the unroll hint.

However, in my opinion, since SPIR-V is an intermediate language between human-readable OpenCL (and other languages) and hardware-specific byte code, an OpenCL C++-to-SPIR-V compiler should preserve the [[cl::unroll_hint]] and [[cl::ivdep]] attributes when it compiles loops. That is, it should compile those loops to structured loops (see StructuredControlFlow) with an OpLoopMerge instruction that carries the hints, so that the later SPIR-V-to-hardware-specific-byte-code compiler can decide whether to unroll or vectorize the loop.

Currently, the [[cl::unroll_hint]] and [[cl::ivdep]] hints are ignored and are not passed to SPIR-V.
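
To make the request above concrete, here is a minimal C++ sketch of how a front end might encode the two hints as the Loop Control operand of OpLoopMerge. The bit values are the ones listed in the SPIR-V specification; everything else (`LoopHints`, `LoopMergeOperands`, `encodeLoopControl`, and the choice to treat an unroll count of 1 as "do not unroll" while dropping other counts) is an assumption made for illustration, not the behavior of any existing compiler.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Loop Control bit values as listed in the SPIR-V specification.
enum LoopControl : std::uint32_t {
    kNone               = 0x0,
    kUnroll             = 0x1,
    kDontUnroll         = 0x2,
    kDependencyInfinite = 0x4,
    kDependencyLength   = 0x8,  // followed by one literal operand
};

// Hypothetical front-end view of the attributes attached to one loop.
struct LoopHints {
    bool hasUnrollHint = false;
    std::optional<std::uint32_t> unrollCount;  // [[cl::unroll_hint(n)]]
    bool hasIvdep = false;
    std::optional<std::uint32_t> ivdepLength;  // [[cl::ivdep(n)]]
};

// Hypothetical container for the mask and its trailing literals.
struct LoopMergeOperands {
    std::uint32_t mask = kNone;
    std::vector<std::uint32_t> literals;
};

LoopMergeOperands encodeLoopControl(const LoopHints& h) {
    LoopMergeOperands out;
    if (h.hasUnrollHint) {
        // Assumption: a count of 1 means "do not unroll"; any other form
        // requests unrolling. The exact count is dropped in this sketch.
        out.mask |= (h.unrollCount && *h.unrollCount == 1) ? kDontUnroll
                                                           : kUnroll;
    }
    if (h.hasIvdep) {
        if (h.ivdepLength) {
            // A bounded dependency distance carries its length as a literal.
            out.mask |= kDependencyLength;
            out.literals.push_back(*h.ivdepLength);
        } else {
            out.mask |= kDependencyInfinite;
        }
    }
    return out;
}
```

With `[[cl::unroll_hint(2)]] [[cl::ivdep]]` on a loop, this sketch sets both the Unroll and DependencyInfinite bits and emits no literals, leaving the final decision to the device compiler.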

bsochack commented 7 years ago

Have you tried emitting LLVM IR instead of SPIR-V to check whether this is an issue in the Clang compiler or in the LLVM-to-SPIR-V converter (https://github.com/KhronosGroup/SPIRV-LLVM/tree/khronos/spirv-3.6.1)?

jszuppe commented 7 years ago

No, I haven't; thanks for the idea. I'll check it today.

bashbaug commented 7 years ago

I took an action item in today's Khronos call to propose spec text to clarify this behavior. Here's what I have so far:

=== Attribute Qualifiers

The +[[ ]]+ attribute qualifier syntax allows additional attributes to be attached to types, variables, kernel functions, kernel parameters, or loops.

While some attributes are required for program correctness, other attributes are hints and may be ignored by frontend compilers compiling OpenCL {cpp} to an intermediate representation, or by device compilers compiling to device code. Frontend compilers that compile to an intermediate representation are encouraged (but not required) to faithfully pass attribute hints with an intermediate representation to device compilers for further processing.

I think this is OK but I'm not particularly happy about it. Among other things:

So, I'm open to suggestions for improvement (@bsochack?), or perhaps validation that the text above is good enough, in which case I'll open a merge request with this addition.

Thanks!

yxsamliu commented 7 years ago

It seems Clang generates loop metadata for cl::unroll_hint and cl::ivdep in LLVM IR, e.g.

include

using namespace cl;

kernel void worker(global a, global b) {
    [[ cl::unroll_hint(2) ]] [[ cl::ivdep ]]
    for (uint i=0; i<16; ++i)
        a[i] = b[i];
}

The IR looks like this:

br i1 %5, label %6, label %18, !llvm.loop !6

!6 = distinct !{!6, !7, !8}
!7 = !{!"opencl_ivdep"}
!8 = !{!"llvm.loop.unroll.count", i32 2}

However, the LLVM/SPIR-V converter is currently unable to recover the loop structure and the associated loop info from LLVM IR.
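
As a rough illustration of the missing step, the sketch below shows how the hint part of the metadata above could be read back once a loop has been identified. It does not use LLVM's API: `LoopMetadataEntry` is a simplified, hypothetical stand-in for one operand of the `!llvm.loop` node, and `RecoveredHints` and `recoverHints` are names invented here. Recovering the loop structure itself, which is the harder half of the problem, is not covered.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for one operand of an !llvm.loop node, for example
// !{!"llvm.loop.unroll.count", i32 2}. This is not an LLVM type.
struct LoopMetadataEntry {
    std::string name;
    std::optional<std::uint32_t> value;
};

// Hints recovered from the metadata of a single loop.
struct RecoveredHints {
    bool ivdep = false;
    std::optional<std::uint32_t> unrollCount;
};

RecoveredHints recoverHints(const std::vector<LoopMetadataEntry>& entries) {
    RecoveredHints hints;
    for (const auto& e : entries) {
        if (e.name == "opencl_ivdep") {
            hints.ivdep = true;
        } else if (e.name == "llvm.loop.unroll.count" && e.value) {
            hints.unrollCount = *e.value;
        }
        // Entries this sketch does not recognize are skipped.
    }
    return hints;
}
```

Fed the two entries from the IR above, `recoverHints` reports `ivdep` as set and an unroll count of 2, which a converter could then attach to the loop's OpLoopMerge.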