Closed aleino-nv closed 21 hours ago
After adding the capability, this turns into a problem with emitting WGSL:
This is for buffer-type-splitting
.
Generated WGSL:
@binding(1) @group(0) var s_a_0 : array<, i32(2)>;
@binding(2) @group(0) var s_b_0 : array<, i32(2)>;
@binding(0) @group(0) var<storage, read_write> outputBuffer_0 : array<i32>;
@compute
@workgroup_size(4, 1, 1)
fn computeMain(@builtin(global_invocation_id) dispatchThreadID_0 : vec3<u32>)
{
var i_0 : i32 = vec3<i32>(dispatchThreadID_0).x;
var _S1 : i32 = i_0 * i32(4);
var _S2 : u32 = (s_a_0[i32(0)]).Load<u32 >(_S1);
var _S3 : u32 = (s_a_0[i32(1)]).Load<u32 >(_S1);
var _S4 : u32 = _S2 + _S3 * u32(16);
var _S5 : u32 = (s_b_0[i32(0)]).Load<u32 >(_S1);
var _S6 : u32 = _S4 + _S5 * u32(256);
var _S7 : u32 = (s_b_0[i32(1)]).Load<u32 >(_S1);
outputBuffer_0[i_0] = i32(_S6 + _S7 * u32(4096));
return;
}
slang-test errors:
WGPU error: Error while parsing WGSL: :1:41 error: expected expression for type template argument list
@binding(1) @group(0) var s_a_0 : array<, i32(2)>;
^
:3:41 error: expected expression for type template argument list
@binding(2) @group(0) var s_b_0 : array<, i32(2)>;
^
:14:41 error: expected ';' for variable declaration
var _S2 : u32 = (s_a_0[i32(0)]).Load<u32 >(_S1);
^
:15:41 error: expected ';' for variable declaration
var _S3 : u32 = (s_a_0[i32(1)]).Load<u32 >(_S1);
^
:17:41 error: expected ';' for variable declaration
var _S5 : u32 = (s_b_0[i32(0)]).Load<u32 >(_S1);
^
:19:41 error: expected ';' for variable declaration
var _S7 : u32 = (s_b_0[i32(1)]).Load<u32 >(_S1);
^
- While validating [ShaderModuleDescriptor]
- While calling [Device].CreateShaderModule([ShaderModuleDescriptor]).
Ok it may not be easy to support this after all.
There is no counterpart to RWByteAddressBuffer in WGSL, and there is no uint8 either, so array<u8>
wouldn't work.
@csyonghe I guess we just leave this unsupported for WGSL?
Just use array
HLSL doesn't have uint8 either. ByteAddressBuffer in HLSL is a uint buffer, just so that the index used to access it is always divided by 4.
Just use array.
HLSL doesn't have uint8 either. ByteAddressBuffer in HLSL is a uint buffer, just so that the index used to access it is always divided by 4.
I'm looking at https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-byteaddressbuffer and I can't see any restriction saying that the index used to load from a ByteAddressBuffer should be divisible by 4.
I mean if you have a byte address buffer with contents [0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff]
and you load from index 2, don't you expect to get back 0xffff0000
(assuming little-endian), rather than 0x00000000
?
Wow, ok! That's weird. (Then why isn't ByteAddressBuffer just array
Thanks, I can proceed now.
HLSL never had uint8 type and they need a way to represent something like a untyped buffer, so this is what you get...
After some fixes buffer-type-splitting
now generates
@binding(1) @group(0) var s_a_0 : array<array<u32>, i32(2)>;
@binding(2) @group(0) var s_b_0 : array<array<u32>, i32(2)>;
@binding(0) @group(0) var<storage, read_write> outputBuffer_0 : array<i32>;
@compute
@workgroup_size(4, 1, 1)
fn computeMain(@builtin(global_invocation_id) dispatchThreadID_0 : vec3<u32>)
{
var i_0 : i32 = vec3<i32>(dispatchThreadID_0).x;
var _S1 : i32 = i_0 * i32(4);
var _S2 : u32 = s_a_0[i32(0)][(_S1)/4];
var _S3 : u32 = s_a_0[i32(1)][(_S1)/4];
var _S4 : u32 = _S2 + _S3 * u32(16);
var _S5 : u32 = s_b_0[i32(0)][(_S1)/4];
var _S6 : u32 = _S4 + _S5 * u32(256);
var _S7 : u32 = s_b_0[i32(1)][(_S1)/4];
outputBuffer_0[i_0] = i32(_S6 + _S7 * u32(4096));
return;
}
Which still fails because of issue https://github.com/shader-slang/slang/issues/4987:
WGPU error: Error while parsing WGSL: :1:41 error: an array element type cannot contain a runtime-sized array
@binding(1) @group(0) var s_a_0 : array<array<u32>, i32(2)>;
^^^^^^^^^^
byte-address-buffer
succeeds though, so that can be enabled now
After some fixes
buffer-type-splitting
now generates@binding(1) @group(0) var s_a_0 : array<array<u32>, i32(2)>; @binding(2) @group(0) var s_b_0 : array<array<u32>, i32(2)>; @binding(0) @group(0) var<storage, read_write> outputBuffer_0 : array<i32>; @compute @workgroup_size(4, 1, 1) fn computeMain(@builtin(global_invocation_id) dispatchThreadID_0 : vec3<u32>) { var i_0 : i32 = vec3<i32>(dispatchThreadID_0).x; var _S1 : i32 = i_0 * i32(4); var _S2 : u32 = s_a_0[i32(0)][(_S1)/4]; var _S3 : u32 = s_a_0[i32(1)][(_S1)/4]; var _S4 : u32 = _S2 + _S3 * u32(16); var _S5 : u32 = s_b_0[i32(0)][(_S1)/4]; var _S6 : u32 = _S4 + _S5 * u32(256); var _S7 : u32 = s_b_0[i32(1)][(_S1)/4]; outputBuffer_0[i_0] = i32(_S6 + _S7 * u32(4096)); return; }
Which still fails because of issue #4987:
WGPU error: Error while parsing WGSL: :1:41 error: an array element type cannot contain a runtime-sized array @binding(1) @group(0) var s_a_0 : array<array<u32>, i32(2)>; ^^^^^^^^^^
@csyonghe Do we have some legalization pass that would flatten this case out to a set of two runtime-sized arrays?
This issue was split from https://github.com/shader-slang/slang/issues/5172.
I think this should be easy to support for WGSL.
Affected tests under tests/compute (currently disabled):
Output from slang-test:
Tasks: