godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Uniform buffer length validation incorrect for arrays with pipeline-defined constant lengths ("Uniform buffer supplied (binding: 0) size (48 is smaller than size of shader uniform: (-16).") #94881

Closed BenLubar closed 3 weeks ago

BenLubar commented 1 month ago

Tested versions

I've only tried in 4.2.2. This is very unlikely to be a regression.

System information

Godot v4.2.2.stable - Debian GNU/Linux trixie/sid trixie - Wayland - Vulkan (Forward+) - integrated Intel(R) Graphics (RPL-P) () - 13th Gen Intel(R) Core(TM) i7-1360P (16 Threads)

Issue description

When using RDPipelineSpecializationConstant to define a constant that is used as the length of an array, it is impossible to construct a uniform set that Godot will let you bind to that array, even though the graphics card would have accepted the buffer. This happens because the expected size ends up negative.
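The pattern that triggers this looks roughly like the following (a sketch based on the MRP; the block and member names are illustrative, and the default value of 1 is an assumption consistent with 4.3 reporting a 16-byte size later in this thread):

```glsl
#version 450

// Specialization constant later overridden to 3 via
// RDPipelineSpecializationConstant on the GDScript side.
layout(constant_id = 0) const int POLYNOMIAL_DEGREE_PLUS_ONE = 1;

// With the override, 3 x vec4 = 48 bytes, which matches the supplied buffer,
// yet uniform_set_create() computes a bogus expected size from reflection.
layout(set = 0, binding = 0, std140) uniform Coefficients {
    vec4 coefficients[POLYNOMIAL_DEGREE_PLUS_ONE];
};
```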

Additionally, there's a missing parenthesis in the error message, but that's not worth a separate ticket.

Steps to reproduce

  1. Run the project; nothing will be displayed in the scene. It runs a compute shader and then prints the result to the Output window.
  2. Without the bug, this would print the squares of the integers 0..31 to the Output window.
  3. To simulate the bug not existing, change the uniform array length in the GLSL shader to a literal 3.

Minimal reproduction project (MRP)

Compute Constant Uniform Buffer.zip

akien-mga commented 1 month ago

I can't seem to reproduce the issue on Fedora 40 with either of my AMD GPUs with Mesa 24.1.4.

Vulkan API 1.3.278 - Forward+ - Using Vulkan Device #1: AMD - AMD Radeon RX 7600M XT (RADV NAVI33)
Vulkan API 1.3.278 - Forward+ - Using Vulkan Device #0: AMD - AMD Radeon 780M (RADV GFX1103_R1)

I'm not sure I understood how to reproduce the bug exactly, but is it correct that the MRP is supposed to trigger the bug out of the box? If so, it might be GPU / driver specific.

BenLubar commented 1 month ago

Able to reproduce it on my other computer: Godot v4.2.2.stable - Windows 10.0.19045 - Vulkan (Forward+) - dedicated NVIDIA GeForce RTX 2060 () - Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (8 Threads)

The output window shows this for a failed (unmodified) run:

Godot Engine v4.2.2.stable.official.15073afe3 - https://godotengine.org
Vulkan API 1.3.278 - Forward+ - Using Vulkan Device #0: NVIDIA - NVIDIA GeForce RTX 2060

The first 32 values of x^2 are: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
--- Debugging process stopped ---

and then on the Debugger -> Errors tab, it shows these three errors:

E 0:00:04:0411   main.gd:36 @ repro(): Uniform buffer supplied (binding: 0) size (48 is smaller than size of shader uniform: (-16).
  <C++ Error>    Condition "buffer->size < (uint32_t)set_uniform.length" is true. Returning: RID()
  <C++ Source>   drivers/vulkan/rendering_device_vulkan.cpp:5759 @ uniform_set_create()
  <Stack Trace>  main.gd:36 @ repro()
                 main.gd:4 @ _ready()
E 0:00:04:0411   main.gd:43 @ repro(): Parameter "uniform_set" is null.
  <C++ Source>   drivers/vulkan/rendering_device_vulkan.cpp:7999 @ compute_list_bind_uniform_set()
  <Stack Trace>  main.gd:43 @ repro()
                 main.gd:4 @ _ready()
E 0:00:04:0411   main.gd:45 @ repro(): Uniforms were never supplied for set (0) at the time of drawing, which are required by the pipeline
  <C++ Error>    Method/function failed.
  <C++ Source>   drivers/vulkan/rendering_device_vulkan.cpp:8200 @ compute_list_dispatch()
  <Stack Trace>  main.gd:45 @ repro()
                 main.gd:4 @ _ready()

If I change the named constant on line 12 of the glsl file to a literal 3, I get this output and no errors in the debugger tab:

Godot Engine v4.2.2.stable.official.15073afe3 - https://godotengine.org
Vulkan API 1.3.278 - Forward+ - Using Vulkan Device #0: NVIDIA - NVIDIA GeForce RTX 2060

The first 32 values of x^2 are: [0, 1, 4, 9, 16, 24.9999980926514, 35.9999961853027, 49.0000038146973, 64, 81, 99.9999923706055, 121.000007629395, 143.999984741211, 169, 196.000015258789, 224.999938964844, 256, 289.000030517578, 323.999908447266, 360.999847412109, 399.999969482422, 440.999938964844, 484.000030517578, 529, 575.999938964844, 624.999877929688, 675.999816894531, 728.999633789063, 784.000061035156, 840.999694824219, 899.999755859375, 960.999572753906]
--- Debugging process stopped ---

(which is close enough to what the integers should be for this toy example)

or on my laptop, which either uses more precise floats or rounds the results:

Godot Engine v4.2.2.stable.official.15073afe3 - https://godotengine.org
Vulkan API 1.3.278 - Forward+ - Using Vulkan Device #0: Intel - Intel(R) Graphics (RPL-P)

The first 32 values of x^2 are: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961]
--- Debugging process stopped ---

clayjohn commented 1 month ago

I can confirm the bug on an M2 Macbook (Vulkan API 1.2.231 - Forward+ - Using Vulkan Device #0: Apple - Apple M2).

I suspect that the issue comes from the specialization constant not being initialized correctly. Taking a deeper look now.

Edit: I can also reproduce on my Linux computer (Vulkan API 1.3.274 - Forward+ - Using Vulkan Device #0: Intel - Intel(R) Xe Graphics (TGL GT2)).

clayjohn commented 1 month ago

Taking a closer look, it seems that the values are set correctly. I suspect the issue is in our shader validation code. We extract some reflection data from the shader when it is compiled to SPIR-V. However, at that point we don't have final values for the specialization constants (we have the defaults only). At run time, we use that reflection data to validate the inputs to the shader, but since we don't actually know how long the uniform buffer is supposed to be, the validation fails and leads to the errors you see.

If my theory is correct the solution is going to be very difficult. I see the following options:

  1. Detect if a specialization constant is used as the size of the uniform buffer array and disable validation in that case
  2. Ban using specialization constants to set the size of a uniform array
  3. Track if a specialization constant is used to size an array, then validate the value from the specialization constant matches the length of the array at run time (probably impossible)

CC @RandomShaper who may also be interested in this.

For now I would avoid using a specialization constant to set the size of a uniform buffer array.
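One way to sketch that workaround (illustrative names, assuming a known upper bound on the array length): size the uniform block with a literal and use the specialization constant only where a non-compile-time constant is acceptable, such as a loop bound:

```glsl
#version 450
layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in;

layout(constant_id = 0) const int POLYNOMIAL_DEGREE_PLUS_ONE = 3;

// Size the block with a worst-case literal so reflection sees a fixed size
// (the supplied buffer must then be 8 x 16 = 128 bytes).
#define MAX_COEFFICIENTS 8

layout(set = 0, binding = 0, std140) uniform Coefficients {
    vec4 coefficients[MAX_COEFFICIENTS];
};

void main() {
    float sum = 0.0;
    // The specialization constant still controls how many entries are used.
    for (int i = 0; i < POLYNOMIAL_DEGREE_PLUS_ONE; i++) {
        sum += coefficients[i].x;
    }
}
```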

pink-arcana commented 1 month ago

Compute shaders also fail to compile if a specialization constant is used as the offset in a texture + offset function (tested in the 4.3 beta). The error reports that a compile-time constant is required. (Per this glslang issue, it appears specialization constants were intended to count as compile-time constants.)

This error is at compilation, not runtime like the array length, but is it possibly a similar root cause in how they're treated by the compiler?

RandomShaper commented 1 month ago

Reproduced on 4.2.

Cannot repro in 4.3 because reflection retrieves 16 as the uniform buffer size (instead of -16). That's at least the right sign, but still tied to the default value of the spec constant. To reproduce it there, the default value of POLYNOMIAL_DEGREE_PLUS_ONE in the .glsl file has to be raised to at least 4 (so the buffer size is reported as 64, bigger than the 48 bytes being provided).

@clayjohn's analysis is accurate. Option 1 is feasible. And that's so true that... #94985