microsoft / hlsl-specs

HLSL Specifications
MIT License
124 stars 35 forks source link

Inline SPIR-V builtin declaration change #350

Open Keenuts opened 2 weeks ago

Keenuts commented 2 weeks ago

Which document does this relate to? https://github.com/microsoft/hlsl-specs/blob/main/proposals/0011-inline-spirv.md

Describe the issue you see with the spec

In the initial PR (https://github.com/microsoft/hlsl-specs/pull/59) builtins were added as functions:

[[vk::ext_builtin_input(/* BuilIn ID */ 1234)]]
uint myBuiltin();

[[vk::ext_builtin_output(/* BuilIn ID */ 5678)]]
void myOtherBuiltin(uint output);

A later PR changed those to be variables, to better reflect the underlying SPIR-V (https://github.com/microsoft/hlsl-specs/pull/129).

[[vk::ext_builtin_input(/* BuilIn ID */ 1234)]]
static const uint myBuiltin;

[[vk::ext_builtin_output(/* BuilIn ID */ 5678)]]
static uint myOtherBuiltin;

static was used here because since builtin have either the Input or Output storage class, those are only visible to the current invocation. Issue is: on the LLVM side, static means private to the module, and since we compile the whole module, many optimizations are allowed. For example, all loads can be removed, and the input bits can be considered to be invalid, and all stores to the output variable can be removed as long as no load is done after (which should not happen in a shader).

The simple alternative I'd suggest is to use extern:

The drawback is this doesn't carry the info the variable is locale to the invocation. Adding a thread_local storage class could be a solution, but this keyword is not available in HLSL.

Suggested syntax:

[[vk::ext_builtin_input(/* BuilIn ID */ 1234)]]
extern const uint myBuiltin;

[[vk::ext_builtin_output(/* BuilIn ID */ 5678)]]
extern uint myOtherBuiltin;

The alternative is to fallback on a function style builtin. For loads, I think as long as we mark the builtin function as readnone, compiler might be able to load once. For store however, the compiler won't be able to optimize away 2 subsequent store, unless we hack around and delay the function call to the end of the function.

Keenuts commented 2 weeks ago

Turns out thread_local request was already made in https://github.com/microsoft/hlsl-specs/issues/173

This would even allow:

[[vk::ext_builtin_input(/* BuilIn ID */ 1234)]]
extern thread_local const uint myBuiltin;

[[vk::ext_builtin_output(/* BuilIn ID */ 5678)]]
extern thread_local uint myOtherBuiltin;
devshgraphicsprogramming commented 1 week ago

maybe its time to actually expose Storage Spaces properly (workgroup/shared, private and function could have the same name since that depends solely on declaration site, device) ?