SPIR-V Specialization Constants and Pointers

Creating this issue from a comment from an OpenCL 3.0 issue:

https://github.com/KhronosGroup/OpenCL-Docs/issues/294#issuecomment-648970076

Note that for the Kernel execution model, the SPIR-V specification describes OpSpecConstantOp operations using pointer/address operands, the result of which could then be used for other purposes, e.g. to determine array lengths in OpTypeArray.

Consider the following sequence of instructions:

OpSpecConstantOp OpConvertPtrToU used to convert a pointer to a global to an integer.

OpBitwiseAnd used to extract the least significant bit of the pointer, to determine its alignment.

OpTypeArray used to declare an array type with a number of elements determined by the pointer's alignment.

In my opinion, it is in no way reasonable to expect that the addresses held in pointers will be known (or knowable) at the time of specialization.

I think there are essentially two questions:

1. Is the behavior of specialization constants well-specified for pointers? I believe the answer to this question is "yes", but if there are cases that are not well-specified this would be good to know.
2. Are specialization constants usable and useful for pointers? This one's a bit trickier to answer, since I certainly don't think it will be common to specialize the value of a pointer, but I think it could be done given pointer address equivalence for SVM or USM. One possible (though again, unlikely) usage model would be to select between two lookup tables based on a specialization constant.

If specialization constants for pointers are well-specified, and if specialization constants for pointers are usable in some (admittedly, narrow) use cases, I don't think this is something we should disallow, and there may be nothing to do here.

I disagree with this. It is possible for a feature to be well-specified and useful, yet infeasible to implement. As specified, we would have to reason about addresses early on in translation, before it is practical to do so for many (possibly most) implementations. If you are able to demonstrate a well-architected proof of concept for this using the Khronos SPIR-V/LLVM Translator, that would change my mind on the matter.

Otherwise, my view is that this would require two implementations of specialization (i.e. one for specializations that are not derived from memory addresses but are usable for determining array lengths etc., and another for specializations that are derived from memory addresses but are not usable for determining array lengths etc.), leaving a corner case, all for a feature that is not especially valuable. As far as I can see, kernel arguments satisfy the use-cases related to SVM and USM.

If we are to apply the reasoning that specialization constant support should be mandatory because it is widely implemented for Vulkan, then it seems only reasonable to me that we should limit the required functionality for OpenCL to that which is already used by Vulkan.

I'm not following, sorry. This is certainly a problem on my end, but I'm going to need help understanding.

Why is this?

> my view is that this would require two implementations of specialization (i.e. one for specializations not derived from memory addresses, but are usable for determining array lengths etc. and another one for specializations that are derived from memory addresses, but not usable for determining array lengths, etc.) leaving a corner case

The way I've been thinking about this is: a pointer specialization constant is essentially just a string of bits, similar to an integer specialization constant. If this is correct (and that's a big if!), why are two implementations of specialization necessary?

To be clear, I agree that this functionality isn't particularly useful, but it was added to the SPIR-V specification from day one, so I think we should be careful removing it (or even making it optional).

Here's an example that motivated some pointer functionality in OpSpecConstantOp in SPIR-V for OpenCL.

In OpenCL 2.0 you can do this:

static constant int params[5] = {1,2,3,4,5};
static constant int *cursor = params + 2;
kernel void foo(global int* A, int offset) {
  cursor += offset;
  A[0] = *cursor;
}

So, how do you represent cursor in SPIR-V? We take the inspiration from LLVM IR.
Abusing clspv somewhat we can see a snapshot of the direct translation into LLVM IR:

 $ clspv --cl-std=CL2.0 a.cl --inline-entry-points -o a.spv --print-before-all 2>z

And then the first full dump of the IR is:

*** IR Dump Before Force set function attributes ***
; ModuleID = 'a.cl'
source_filename = "a.cl"
target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
target triple = "spir-unknown-unknown"

@cursor = internal addrspace(1) global i32 addrspace(2)* bitcast (i8 addrspace(2)* getelementptr (i8, i8 addrspace(2)* bitcast ([5 x i32] addrspace(2)* @params to i8 addrspace(2)*), i64 8) to i32 addrspace(2)*), align 4
@params = internal addrspace(2) constant [5 x i32] [i32 1, i32 2, i32 3, i32 4, i32 5], align 4

; Function Attrs: convergent norecurse nounwind
define spir_kernel void @foo(i32 addrspace(1)* %A, i32 %offset) #0 !kernel_arg_addr_space !3 !kernel_arg_access_qual !4 !kernel_arg_type !5 !kernel_arg_base_type !5 !kernel_arg_type_qual !6 {
entry:
  %A.addr = alloca i32 addrspace(1)*, align 4
  %offset.addr = alloca i32, align 4
  store i32 addrspace(1)* %A, i32 addrspace(1)** %A.addr, align 4
  store i32 %offset, i32* %offset.addr, align 4
  %0 = load i32, i32* %offset.addr, align 4
  %1 = load i32 addrspace(2)*, i32 addrspace(2)* addrspace(1)* @cursor, align 4
  %add.ptr = getelementptr inbounds i32, i32 addrspace(2)* %1, i32 %0
  store i32 addrspace(2)* %add.ptr, i32 addrspace(2)* addrspace(1)* @cursor, align 4
  %2 = load i32 addrspace(2)*, i32 addrspace(2)* addrspace(1)* @cursor, align 4
  %3 = load i32, i32 addrspace(2)* %2, align 4
  %4 = load i32 addrspace(1)*, i32 addrspace(1)** %A.addr, align 4
  %arrayidx = getelementptr inbounds i32, i32 addrspace(1)* %4, i32 0
  store i32 %3, i32 addrspace(1)* %arrayidx, align 4
  ret void
}

attributes #0 = { convergent norecurse nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-builtins" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="0" "stackrealign" "uniform-work-group-size"="false" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!opencl.ocl.version = !{!1}
!opencl.spir.version = !{!1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 2, i32 0}
!2 = !{!"clang version 11.0.0 (https://github.com/llvm/llvm-project 2d6b9dbfef55364fc762682cd8ab93045582944a)"}
!3 = !{i32 1, i32 0}
!4 = !{!"none", !"none"}
!5 = !{!"int*", !"int"}
!6 = !{!"", !""}

The bit to focus on is the module-scope definition: @cursor = internal addrspace(1) global i32 addrspace(2)* bitcast (i8 addrspace(2)* getelementptr (i8, i8 addrspace(2)* bitcast ([5 x i32] addrspace(2)* @params to i8 addrspace(2)*), i64 8) to i32 addrspace(2)*), align 4

In particular, the value is a complex constant-expression. It has a getelementptr in there for an address calculation. (And sadly, a few bitcasts to and from pointer-to-i8, but let's not get distracted!)

We can translate that getelementptr to a SPIR-V OpSpecConstantOp using OpAccessChain as a sub-opcode.

That's the motivating example. It's exactly because we don't know the address of params at front-end-compiler time that we need to express a deferred computation in order to express the initial value of cursor. The OpSpecConstantOp expresses that deferred calculation.

(Incidentally, clspv then crashes on that example, because it's not meant to support all of OpenCL 2.0)

> OpBitwiseAnd used to extract the least significant bit of the pointer, to determine its alignment.

I would argue that this is not portable. You don't know the bit pattern of a pointer. Relying on that is highly system-specific.

> OpTypeArray used to declare an array type with a number of elements determined by the pointer's alignment.

I think I agree that sizing an array by a specialization constant is not an OpenCL feature. Perhaps that's the right way to constrain the issue.

I tweaked my example to see what would happen:

static constant int params[5] = {1,2,3,4,5};
static global int others[ params[2]-params[0] ] = {1};
kernel void foo(global int* A, int offset) {
  A[0] = others[offset];
}

I got this error when trying to compile:

b.cl:3:25: error: variable length arrays are not supported in OpenCL
static global int others[ params[2]-params[0] ] = {1};
                        ^

> I think I agree that sizing an array by a specialization constant is not an OpenCL feature. Perhaps that's the right way to constrain the issue.

More systematically:

if the use of the spec-constant-op is inside the body of a function, then either:
- it's a variable initializer, in which case replace it with a store of a runtime-computed expression
- replace it with a runtime-computed expression
otherwise, it's in the types-variables-and-constants section.
- if it's an initializer, then ??:
  - if initializer for constant variable, how do you handle the cursor-params example? (except where cursor itself is in constant, i.e. declared as static constant int * constant cursor = params + 2;)
  - otherwise, replace with an explicit store of a runtime-computed expression at the start of the entry point. (only affects Private variables, I think)
- if it's feeding another spec constant value, then we'll recurse this algorithm
- if it's doing something else, like sizing a type, then restrict this feature in the OpenCL env spec? Example is sizing an array, but can't think of others off the top of my head.

That's off the top of my head, but something to start with.
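
The case analysis above can be sketched as a small classifier. This is only an illustration in Python, not code from any real SPIR-V consumer: the `Use` record, its field names, and the returned action strings are all hypothetical, and modelling "constant variable" as the UniformConstant storage class is an assumption. The unresolved "??" case is kept as an explicit open question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Use:
    """One use of an OpSpecConstantOp result (hypothetical model)."""
    in_function: bool                     # the use is inside a function body
    kind: str                             # "initializer", "spec_constant_operand", or anything else
    storage_class: Optional[str] = None   # storage class of the initialized variable, if any


def lower_use(use: Use) -> str:
    """Return the action suggested in the list above for a single use."""
    if use.in_function:
        if use.kind == "initializer":
            return "store runtime-computed expression"
        return "runtime-computed expression"
    # Otherwise the use is in the types-variables-and-constants section.
    if use.kind == "initializer":
        if use.storage_class == "UniformConstant":
            # Assumption: "constant variable" means the constant address space.
            return "open question"
        return "store runtime-computed expression at entry point start"
    if use.kind == "spec_constant_operand":
        return "recurse"
    # Anything else, such as sizing an array type.
    return "restrict in OpenCL environment spec"
```

With this model, a use such as `Use(in_function=False, kind="type_operand")` falls through to the last branch, which is the array-sizing case the thread suggests restricting.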

For completeness, here's my first example, but with cursor itself in __constant address space:

static constant int params[5] = {1,2,3,4,5};
static constant int * constant cursor = params + 2;
kernel void foo(global int* A, int offset) {
  A[0] = *(cursor+ offset);
}

And its first module dump

*** IR Dump Before Force set function attributes ***
; ModuleID = 'a.cl'
source_filename = "a.cl"
target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
target triple = "spir-unknown-unknown"

@cursor = internal addrspace(2) constant i32 addrspace(2)* bitcast (i8 addrspace(2)* getelementptr (i8, i8 addrspace(2)* bitcast ([5 x i32] addrspace(2)* @params to i8 addrspace(2)*), i64 8) to i32 addrspace(2)*), align 4
@params = internal addrspace(2) constant [5 x i32] [i32 1, i32 2, i32 3, i32 4, i32 5], align 4

; Function Attrs: convergent norecurse nounwind
define spir_kernel void @foo(i32 addrspace(1)* %A, i32 %offset) #0 !kernel_arg_addr_space !3 !kernel_arg_access_qual !4 !kernel_arg_type !5 !kernel_arg_base_type !5 !kernel_arg_type_qual !6 {
entry:
  %A.addr = alloca i32 addrspace(1)*, align 4
  %offset.addr = alloca i32, align 4
  store i32 addrspace(1)* %A, i32 addrspace(1)** %A.addr, align 4
  store i32 %offset, i32* %offset.addr, align 4
  %0 = load i32 addrspace(2)*, i32 addrspace(2)* addrspace(2)* @cursor, align 4
  %1 = load i32, i32* %offset.addr, align 4
  %add.ptr = getelementptr inbounds i32, i32 addrspace(2)* %0, i32 %1
  %2 = load i32, i32 addrspace(2)* %add.ptr, align 4
  %3 = load i32 addrspace(1)*, i32 addrspace(1)** %A.addr, align 4
  %arrayidx = getelementptr inbounds i32, i32 addrspace(1)* %3, i32 0
  store i32 %2, i32 addrspace(1)* %arrayidx, align 4
  ret void
}

attributes #0 = { convergent norecurse nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-builtins" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="0" "stackrealign" "uniform-work-group-size"="false" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!opencl.ocl.version = !{!1}
!opencl.spir.version = !{!1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 2, i32 0}
!2 = !{!"clang version 11.0.0 (https://github.com/llvm/llvm-project 2d6b9dbfef55364fc762682cd8ab93045582944a)"}
!3 = !{i32 1, i32 0}
!4 = !{!"none", !"none"}
!5 = !{!"int*", !"int"}
!6 = !{!"", !""}

The key difference now is that @cursor itself is in constant, not global.

It's a happy accident that clspv can actually compile this example into valid SPIR-V for Vulkan. LLVM is smart enough to fold together the address calculation of cursor with the offset so that the cursor variable disappears entirely and we get a simple index into the params array. For posterity:

               OpCapability Shader
               OpExtension "SPV_KHR_storage_buffer_storage_class"
               OpMemoryModel Logical GLSL450
               OpEntryPoint GLCompute %35 "foo"
               OpSource OpenCL_C 200
               OpDecorate %_runtimearr_uint ArrayStride 4
               OpMemberDecorate %_struct_3 0 Offset 0
               OpDecorate %_struct_3 Block
               OpMemberDecorate %_struct_5 0 Offset 0
               OpMemberDecorate %_struct_6 0 Offset 0
               OpDecorate %_struct_6 Block
               OpDecorate %gl_WorkGroupSize BuiltIn WorkgroupSize
               OpDecorate %33 DescriptorSet 0
               OpDecorate %33 Binding 0
               OpDecorate %34 DescriptorSet 0
               OpDecorate %34 Binding 1
               OpDecorate %_arr_uint_uint_5 ArrayStride 4
               OpDecorate %27 SpecId 0
               OpDecorate %28 SpecId 1
               OpDecorate %29 SpecId 2
       %uint = OpTypeInt 32 0
%_runtimearr_uint = OpTypeRuntimeArray %uint
  %_struct_3 = OpTypeStruct %_runtimearr_uint
%_ptr_StorageBuffer__struct_3 = OpTypePointer StorageBuffer %_struct_3
  %_struct_5 = OpTypeStruct %uint
  %_struct_6 = OpTypeStruct %_struct_5
%_ptr_StorageBuffer__struct_6 = OpTypePointer StorageBuffer %_struct_6
      %float = OpTypeFloat 32
       %void = OpTypeVoid
         %10 = OpTypeFunction %void
%_ptr_StorageBuffer_uint = OpTypePointer StorageBuffer %uint
%_ptr_StorageBuffer__struct_5 = OpTypePointer StorageBuffer %_struct_5
     %uint_5 = OpConstant %uint 5
%_arr_uint_uint_5 = OpTypeArray %uint %uint_5
%_ptr_Private__arr_uint_uint_5 = OpTypePointer Private %_arr_uint_uint_5
%_ptr_Private_uint = OpTypePointer Private %uint
     %v3uint = OpTypeVector %uint 3
%_ptr_Private_v3uint = OpTypePointer Private %v3uint
     %uint_0 = OpConstant %uint 0
     %uint_2 = OpConstant %uint 2
   %uint_100 = OpConstant %uint 100
   %uint_200 = OpConstant %uint 200
   %uint_300 = OpConstant %uint 300
   %uint_400 = OpConstant %uint 400
   %uint_500 = OpConstant %uint 500
         %26 = OpConstantComposite %_arr_uint_uint_5 %uint_100 %uint_200 %uint_300 %uint_400 %uint_500
         %27 = OpSpecConstant %uint 1
         %28 = OpSpecConstant %uint 1
         %29 = OpSpecConstant %uint 1
%gl_WorkGroupSize = OpSpecConstantComposite %v3uint %27 %28 %29
         %31 = OpVariable %_ptr_Private_v3uint Private %gl_WorkGroupSize
         %32 = OpVariable %_ptr_Private__arr_uint_uint_5 Private %26
         %33 = OpVariable %_ptr_StorageBuffer__struct_3 StorageBuffer
         %34 = OpVariable %_ptr_StorageBuffer__struct_6 StorageBuffer
         %35 = OpFunction %void None %10
         %36 = OpLabel
         %37 = OpAccessChain %_ptr_StorageBuffer_uint %33 %uint_0 %uint_0
         %38 = OpAccessChain %_ptr_StorageBuffer__struct_5 %34 %uint_0
         %39 = OpLoad %_struct_5 %38
         %40 = OpCompositeExtract %uint %39 0   ;  get the 'offset' value
         %41 = OpIAdd %uint %uint_2 %40   ;   this folds the address calculation of cursor with the offset index
         %42 = OpAccessChain %_ptr_Private_uint %32 %41 ; do the address calculation
         %43 = OpLoad %uint %42
               OpStore %37 %43
               OpReturn
               OpFunctionEnd

Thanks @dneto0 - this is helpful!

Interestingly, @StuartDBrady and I were just discussing a very similar example: https://godbolt.org/z/gRcLhr

int constant x[] = {0, 1};
int constant * constant y = x + 1;

kernel void test( global int* out )
{
    out[0] = *y;
}

What's interesting about this case is that it generates an OpSpecConstantOp with x as an operand, even though there are no spec constants, at least with my reasonably recent version of the SPIR-V LLVM Translator. I was initially hopeful that we could simply disallow these kinds of operands to OpSpecConstantOp, but unless we can find some other way to express this kernel in SPIR-V, such a restriction may not be possible.

I suppose what's making my brain hurt a little is that we seem to be mixing two concepts in incompatible ways:

x initially is not a "constant instruction" in the SPIR-V sense, since it's an OpVariable:

%x = OpVariable %_ptr_UniformConstant__arr_uint_ulong_2 UniformConstant %8

However, x can be an operand to an OpSpecConstantOp, the result of which is a "constant instruction"?

%14 = OpSpecConstantOp %_ptr_UniformConstant_uint InBoundsPtrAccessChain %x %ulong_0 %ulong_1

It seems strange that we are able to turn something that is "not constant" into something that is "constant". Maybe it's OK in the specific case where the constant result is used as a variable initializer, but any other use seems like it could be very problematic.

If it's helpful, here is my SPIR-V dump for this kernel, generated by:

$ clang -c -cl-std=CL1.2 -target spir64 -emit-llvm -Xclang -finclude-default-header -g0 -O3 global_variable_pointer_math.cl
$ llvm-spirv global_variable_pointer_math.bc -o global_variable_pointer_math.spv

; SPIR-V
; Version: 1.0
; Generator: Khronos LLVM/SPIR-V Translator; 14
; Bound: 25
; Schema: 0
               OpCapability Addresses
               OpCapability Linkage
               OpCapability Kernel
               OpCapability Int64
          %1 = OpExtInstImport "OpenCL.std"
               OpMemoryModel Physical64 OpenCL
               OpEntryPoint Kernel %20 "test"
         %23 = OpString "kernel_arg_type.test.int*,"
               OpSource OpenCL_C 102000
               OpName %x "x"
               OpName %y "y"
               OpDecorate %24 Constant
         %24 = OpDecorationGroup
               OpDecorate %21 FuncParamAttr NoCapture
               OpDecorate %x LinkageAttributes "x" Export
               OpDecorate %y LinkageAttributes "y" Export
               OpDecorate %x Alignment 4
               OpDecorate %y Alignment 8
               OpGroupDecorate %24 %x %y
       %uint = OpTypeInt 32 0
      %ulong = OpTypeInt 64 0
     %uint_0 = OpConstant %uint 0
     %uint_1 = OpConstant %uint 1
    %ulong_2 = OpConstant %ulong 2
    %ulong_0 = OpConstant %ulong 0
    %ulong_1 = OpConstant %ulong 1
%_arr_uint_ulong_2 = OpTypeArray %uint %ulong_2
%_ptr_UniformConstant__arr_uint_ulong_2 = OpTypePointer UniformConstant %_arr_uint_ulong_2
%_ptr_UniformConstant_uint = OpTypePointer UniformConstant %uint
%_ptr_UniformConstant__ptr_UniformConstant_uint = OpTypePointer UniformConstant %_ptr_UniformConstant_uint
       %void = OpTypeVoid
%_ptr_CrossWorkgroup_uint = OpTypePointer CrossWorkgroup %uint
         %19 = OpTypeFunction %void %_ptr_CrossWorkgroup_uint
          %8 = OpConstantComposite %_arr_uint_ulong_2 %uint_0 %uint_1
          %x = OpVariable %_ptr_UniformConstant__arr_uint_ulong_2 UniformConstant %8
         %14 = OpSpecConstantOp %_ptr_UniformConstant_uint InBoundsPtrAccessChain %x %ulong_0 %ulong_1
          %y = OpVariable %_ptr_UniformConstant__ptr_UniformConstant_uint UniformConstant %14
         %20 = OpFunction %void None %19
         %21 = OpFunctionParameter %_ptr_CrossWorkgroup_uint
         %22 = OpLabel
               OpStore %21 %uint_1 Aligned 4
               OpReturn
               OpFunctionEnd

Yes, that's essentially the same example. (I didn't realize godbolt had OpenCL support, but of course it does!)

> x initially is not a "constant instruction" in the SPIR-V sense, since it's an OpVariable:

Right.

But the third condition on an OpSpecConstantOp operand is:

> for the AccessChain named opcodes, their Base is allowed to be a global (module scope) OpVariable instruction.

So that lines up from a letter-of-the-law perspective in SPIR-V.

In LLVM land, each module-scope variable declaration is a constant in the LLVM sense: it's a value (a pointer value) whose exact value the compiler does not know (because the implementation determines it as late as runtime), but which does not change during execution. These are the same semantics and scheme as in SPIR-V; it's just that LLVM calls it a constant but SPIR-V does not.

I think the key thing is to always be careful about "when a value is known" vs. "when it can change". It can be known to not change during execution but still you have no idea what its actual value is.

If it helps at all, "specialization constants" can be confusing in a related way. Sure, it's "constant", but only after a certain point in the flow. When describing a limited form of it for use in WGSL, I called them "pipeline-overridable" instead, as that might be less confusing. Judge for yourself: https://github.com/gpuweb/gpuweb/pull/886

Another thing to ponder. DirectX doesn't have specialization constants. But it's ok to introduce them into WebGPU/WGSL because in the worst case it means we may have to defer more compilation work to later in the flow. At worst, we defer translation from WGSL to HLSL (for DX11) until pipeline creation time, because that's when spec constant values are finalized. In comparison, the plan for WebGPU over Vulkan is to translate WGSL --> SPIR-V when creating a shader module, and at pipeline creation time use spec-constant overrides. Similarly, Metal has "function constants" with similar functionality; we hope to translate WGSL --> MSL at shader module creation time, and override function constants only at pipeline creation time.

> I think the key thing is to always be careful about "when a value is known" vs. "when it can change". It can be known to not change during execution but still you have no idea what its actual value is.

I suppose in my head I've been thinking of SPIR-V consumption as going through two conceptual steps:

1. A "specialization" step, where the values of all OpSpecConstants are known, and all OpSpecConstantOps can be evaluated. After this step, there are no more "specialization constants" and there are only "constants".
2. After "specialization", compilation can proceed as usual, without any knowledge that specialization constants even exist. This is good because some compilers and compiler IRs (e.g. those based on LLVM) don't have a precise notion of a "specialization constant".

The problem with

> for the AccessChain named opcodes, their Base is allowed to be a global (module scope) OpVariable instruction.

is that it breaks (1), since these OpSpecConstantOps cannot be converted to a "constant" during "specialization". This may be fine if you only care that the "spec constant" has a to-be-determined-later constant value (e.g. you can represent it with an LLVM ConstantExpr), but otherwise it seems problematic since the "spec constant" isn't interchangeable with other types of "constants".
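
To make the problem concrete, here is a minimal sketch of the conceptual "specialization" step, written in Python purely for illustration. The class names, the `specialize` function, and the tiny set of supported opcodes are all hypothetical and are not taken from any real SPIR-V consumer. Integer-only expressions fold to a plain constant, while anything that reaches a module-scope variable cannot be folded, because its address is not yet known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecConstant:
    spec_id: int
    default: int


@dataclass(frozen=True)
class GlobalVariable:
    name: str   # its address is assigned later than specialization


@dataclass(frozen=True)
class SpecConstantOp:
    opcode: str
    operands: tuple


class NotFoldable(Exception):
    """Raised when an expression has no known value at specialization time."""


def specialize(node, overrides):
    """Fold a spec-constant expression to an integer, or raise NotFoldable."""
    if isinstance(node, int):
        return node
    if isinstance(node, SpecConstant):
        return overrides.get(node.spec_id, node.default)
    if isinstance(node, GlobalVariable):
        raise NotFoldable(f"address of {node.name} is not known yet")
    args = [specialize(operand, overrides) for operand in node.operands]
    if node.opcode == "IAdd":
        return args[0] + args[1]
    if node.opcode == "BitwiseAnd":
        return args[0] & args[1]
    raise NotFoldable(f"opcode {node.opcode} is not modelled in this sketch")


# An integer-only array length folds to a constant.
int_length = SpecConstantOp("IAdd", (SpecConstant(0, 4), 1))

# A length derived from the address of x cannot be folded at this stage.
ptr_bits = SpecConstantOp("ConvertPtrToU", (GlobalVariable("x"),))
ptr_length = SpecConstantOp("BitwiseAnd", (ptr_bits, 1))
```

Calling `specialize(int_length, {0: 7})` yields 8, whereas `specialize(ptr_length, {})` raises `NotFoldable`. That second outcome is the sense in which these expressions are not interchangeable with other constants.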

Why is this?

> my view is that this would require two implementations of specialization (i.e. one for specializations not derived from memory addresses, but are usable for determining array lengths etc. and another one for specializations that are derived from memory addresses, but not usable for determining array lengths, etc.) leaving a corner case

Please read the implementation of specialization constants in the Khronos SPIR-V/LLVM Translator (added in PR 384, commit dd09f1f2f7a406ea26dc1d2f25db3e80c4d922b9), which can be found in lib/SPIRV/SPIRVReader.cpp, under the OpSpecConstant case in transValueWithoutDecoration(). Please also read the implementation of array type handling, in transType() under the OpTypeArray case.

Note that knowledge of the values of constants (including specialization constants) is needed for OpTypeArray translation, here, as shown by the getArrayLength() call, the result of which is passed as the NumElements parameter to ArrayType::get().

This implementation is necessary to allow specialization constants to be used as array lengths. However, it is not sufficient to allow specialization of pointer values.

As ArrayType::get() requires a value, any implementation of OpSpecConstant in the Translator that allows OpSpecConstantOp using pointers for which the value is not known will not be sufficient to allow specialization of array lengths.

Therefore, two implementations would be needed, leaving a corner case.

IMO, it would be better to leave the implementation of OpSpecConstant in the Translator unchanged, and to add only the OpSpecConstantOp operations that are consistent with the existing implementation. Anything beyond this is too experimental, in my view.

Sorry about that, I should have said more explicitly in a reply above that I'm convinced something is broken here, although I'm not entirely sure how to fix it.

One idea I've been tossing around in my head is restricting when the spec constant op case "for the AccessChain named opcodes, their Base is allowed to be a global (module scope) OpVariable instruction" is valid. Informally, if spec constants computed from the base address of a global scope variable are only used for variable initialization and not for any other purpose, would that be acceptable? My thinking is that these usages can safely be replaced by LLVM ConstantExprs, but other usages (such as the array length specialization) cannot.
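
As a rough sketch of what checking such a restriction might look like, the Python below models a module as a dictionary from result id to an (opcode, operand ids) pair. This representation, the function names, and the simplified operand lists are all assumptions made for illustration; a real validator would work on parsed SPIR-V. The check marks every OpSpecConstantOp that depends, directly or transitively, on an OpVariable, then reports any user of such a result that is neither a variable initializer nor another OpSpecConstantOp.

```python
def address_derived(module):
    """Ids of OpSpecConstantOps that depend on the address of an OpVariable."""
    derived = set()
    changed = True
    while changed:   # iterate to a fixed point to catch transitive dependencies
        changed = False
        for result_id, (opcode, operands) in module.items():
            if opcode != "OpSpecConstantOp" or result_id in derived:
                continue
            for operand in operands:
                is_variable = module.get(operand, ("", []))[0] == "OpVariable"
                if is_variable or operand in derived:
                    derived.add(result_id)
                    changed = True
                    break
    return derived


def find_violations(module):
    """Return (user id, operand id) pairs that break the proposed restriction."""
    derived = address_derived(module)
    allowed_users = {"OpVariable", "OpSpecConstantOp"}
    violations = []
    for result_id, (opcode, operands) in module.items():
        if opcode in allowed_users:
            continue
        for operand in operands:
            if operand in derived:
                violations.append((result_id, operand))
    return violations


# The x/y kernel from this thread: %14 only initializes %y, so it is accepted.
ok_module = {
    "%x": ("OpVariable", ["%8"]),
    "%14": ("OpSpecConstantOp", ["InBoundsPtrAccessChain", "%x", "%ulong_0", "%ulong_1"]),
    "%y": ("OpVariable", ["%14"]),
}

# Hypothetical extension: an array sized from the address of x is rejected.
bad_module = dict(ok_module)
bad_module["%15"] = ("OpSpecConstantOp", ["ConvertPtrToU", "%14"])
bad_module["%arr"] = ("OpTypeArray", ["%uint", "%15"])
```

Here `find_violations(ok_module)` returns an empty list and `find_violations(bad_module)` returns `[("%arr", "%15")]`. Note that this sketch takes the informal wording literally, so a use inside a function body would also be reported; whether that is desirable is part of the open question.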

KhronosGroup / OpenCL-Docs

SPIR-V Specialization Constants and Pointers #344