KhronosGroup / SPIRV-LLVM-Translator

A tool and a library for bi-directional translation between SPIR-V and LLVM IR
Other
468 stars 209 forks source link

Missing dimidx checks when translating get_global_offset #2638

Open bashbaug opened 2 months ago

bashbaug commented 2 months ago

For the following kernel, the dimidx value passed to get_global_offset will be out-of-range for at least some calls regardless of the value of j:

kernel void dynamic(global int* out, int j) {
    out[0] = get_global_offset(j-2);
    out[1] = get_global_offset(j-1);
    out[2] = get_global_offset(j);
    out[3] = get_global_offset(j+1);
    out[4] = get_global_offset(j+2);
}

The OpenCL C spec has defined out-of-range behavior for get_global_offset though, so this program is correct:

Valid values of dimindx are 0 to get_work_dim() - 1. For other values, get_global_offset() returns 0.

However, when I generate SPIR-V using Clang and the SPIR-V LLVM Translator there are no range checks:

    %dynamic = OpFunction %void None %9
         %11 = OpFunctionParameter %_ptr_CrossWorkgroup_uint
         %12 = OpFunctionParameter %uint
         %13 = OpLabel
         %15 = OpIAdd %uint %12 %uint_4294967294
         %16 = OpLoad %v3ulong %__spirv_BuiltInGlobalOffset Aligned 32
         %17 = OpVectorExtractDynamic %ulong %16 %15
         %18 = OpUConvert %uint %17
               OpStore %11 %18 Aligned 4
         %20 = OpIAdd %uint %12 %uint_4294967295
         %21 = OpLoad %v3ulong %__spirv_BuiltInGlobalOffset Aligned 32
         %22 = OpVectorExtractDynamic %ulong %21 %20
         %23 = OpUConvert %uint %22
         %25 = OpInBoundsPtrAccessChain %_ptr_CrossWorkgroup_uint %11 %ulong_1
               OpStore %25 %23 Aligned 4
         %26 = OpLoad %v3ulong %__spirv_BuiltInGlobalOffset Aligned 32
         %27 = OpVectorExtractDynamic %ulong %26 %12
         %28 = OpUConvert %uint %27
         %30 = OpInBoundsPtrAccessChain %_ptr_CrossWorkgroup_uint %11 %ulong_2
               OpStore %30 %28 Aligned 4
         %32 = OpIAdd %uint %12 %uint_1
         %33 = OpLoad %v3ulong %__spirv_BuiltInGlobalOffset Aligned 32
         %34 = OpVectorExtractDynamic %ulong %33 %32
         %35 = OpUConvert %uint %34
         %37 = OpInBoundsPtrAccessChain %_ptr_CrossWorkgroup_uint %11 %ulong_3
               OpStore %37 %35 Aligned 4
         %39 = OpIAdd %uint %12 %uint_2
         %40 = OpLoad %v3ulong %__spirv_BuiltInGlobalOffset Aligned 32
         %41 = OpVectorExtractDynamic %ulong %40 %39
         %42 = OpUConvert %uint %41
         %44 = OpInBoundsPtrAccessChain %_ptr_CrossWorkgroup_uint %11 %ulong_4
               OpStore %44 %42 Aligned 4
               OpReturn
               OpFunctionEnd

I think there needs to be some sort of dimidx check when translating get_global_offset, to have the proper out-of-range behavior. These will be removed in most common cases, such as when the dimidx is an integer literal, but they will probably need to be preserved when the index is unknown like it is in the example above.

Since the OpenCL SPIR-V Environment spec requires the GlobalOffset BuiltIn to have three components we probably can use that to simplify the range checks vs. querying the work dimension?

As an aside, we should also clarify in the OpenCL SPIR-V Environment spec what happens to the upper components of the GlobalOffset BuiltIn (and other vector built-ins) when the ND-range dimension is smaller than three.

karolherbst commented 2 months ago

I think this depends on how much the OpenCL C spec defines those builtins. There is the The mapping from an OpenCL C built-in function to the SPIR-V BuiltIn is informational and non-normative. sentence, but not quite sure if that actually defines anything.

Which means, that those builtins aren't defined as not even the core SPIR-V spec defines them. So I think technically using all of those builtins is strictly speaking undefined behavior to begin with and I was already wondering if something should be done about that.

karolherbst commented 2 months ago

On the other hand, out of bound accesses are also not defined, so maybe that's a good reason on why to fix it in the translator.