Open ardlak opened 1 year ago
I can confirm this on 4.1.dev d6dde819b (Linux, GeForce RTX 4090 with NVIDIA 530.41.03):
ERROR: Vulkan: Cannot submit graphics queue. Error code: VK_ERROR_DEVICE_LOST
at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2357)
ERROR: Vulkan: Cannot submit graphics queue. Error code: VK_ERROR_DEVICE_LOST
at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2357)
ERROR: Vulkan: Did not create swapchain successfully. Error code: VK_NOT_READY
at: prepare_buffers (drivers/vulkan/vulkan_context.cpp:2280)
ERROR: Vulkan: Cannot submit graphics queue. Error code: VK_ERROR_DEVICE_LOST
at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2357)
ERROR: Vulkan: Did not create swapchain successfully. Error code: VK_NOT_READY
at: prepare_buffers (drivers/vulkan/vulkan_context.cpp:2280)
ERROR: Vulkan: Cannot submit graphics queue. Error code: VK_ERROR_DEVICE_LOST
at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2357)
ERROR: Vulkan: Did not create swapchain successfully. Error code: VK_NOT_READY
at: prepare_buffers (drivers/vulkan/vulkan_context.cpp:2280)
ERROR: Vulkan: Cannot submit graphics queue. Error code: VK_ERROR_DEVICE_LOST
at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2357)
Modifying the shader without the ShaderMaterial preview being visible in the inspector does not result in a freeze.
This likely occurs because the uniform buffer size limit (or some other limit) is exceeded, but the editor or shader compiler doesn't check for limits.
The issue still occurs after replacing all vec3s in the shader with vec2s.
Running with Vulkan validation layers installed and --gpu-validation --gpu-abort returns the following:
ERROR: VALIDATION - Message Id Number: -1553903733 | Message Id Name: VUID-RuntimeSpirv-Location-06272
Validation Error: [ VUID-RuntimeSpirv-Location-06272 ] Object 0: VK_NULL_HANDLE, type = VK_OBJECT_TYPE_PIPELINE; | MessageID = 0xa3614f8b | Invalid Pipeline CreateInfo State: Vertex shader output variable uses location that exceeds component limit VkPhysicalDeviceLimits::maxVertexOutputComponents (128) The Vulkan spec states: The sum of Location and the number of locations the variable it decorates consumes must be less than or equal to the value for the matching {ExecutionModel} defined in Shader Input and Output Locations (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-RuntimeSpirv-Location-06272)
Objects - 1
Object[0] - VK_OBJECT_TYPE_PIPELINE, Handle 0
at: _debug_messenger_callback (drivers/vulkan/vulkan_context.cpp:267)
ERROR: Crashing, because abort on GPU errors is enabled.
at: _debug_messenger_callback (drivers/vulkan/vulkan_context.cpp:268)
I did a little more looking into it:
varying mat4 m; // lowercase
// lowercase
varying vec3 vA01;
varying vec3 vA02;
varying vec3 vA03;
...
varying vec3 vA17;
varying vec3 vA18;
void vertex() {
vA17 = vec3(1.); // safe
vA18 = vec3(1.); // will freeze
}
varying mat4 m; // lowercase
//uppercase
varying vec3 Va01;
varying vec3 Va02;
varying vec3 Va03;
...
varying vec3 Va21;
varying vec3 Va22;
void vertex() {
Va21 = vec3(1.); // safe
Va22 = vec3(1.); // will freeze
}
Switching the case of the matrix name in the second example:
varying mat4 M; // uppercase
...
void vertex() {
Va17 = vec3(1.); // safe
Va18 = vec3(1.); // will freeze
}
So I suppose there are separate considerations for uppercase and lowercase variables. The limit in both examples seems to be 21 vectors regardless of type. Adding a mat4 adds four vectors, lowering the number of vector varyings that can be declared by four. mat3 and mat2 types work similarly, lowering the limit by however many vectors make up the type.
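The counting rule described above can be sketched in Python as a rough model. The per-type slot sizes and the budget of 21 are taken from the observations in this comment; `SLOTS_PER_TYPE`, `USER_SLOT_BUDGET`, `slots_used`, and `fits` are hypothetical names for illustration, not Godot APIs.

```python
# Rough model of the counting rule observed above; not Godot's actual code.
# Assumption: every scalar or vector varying takes one slot and a matN takes N.
SLOTS_PER_TYPE = {
    "float": 1, "vec2": 1, "vec3": 1, "vec4": 1,
    "mat2": 2, "mat3": 3, "mat4": 4,
}

USER_SLOT_BUDGET = 21  # limit observed in the examples above


def slots_used(varying_types):
    """Return the total number of slots consumed by a list of varying types."""
    return sum(SLOTS_PER_TYPE[t] for t in varying_types)


def fits(varying_types, budget=USER_SLOT_BUDGET):
    """Return True if the varyings stay within the observed budget."""
    return slots_used(varying_types) <= budget


# One mat4 plus 17 vec3 varyings uses 4 + 17 = 21 slots (the "safe" case);
# an 18th vec3 pushes the total to 22 (the "will freeze" case).
print(fits(["mat4"] + ["vec3"] * 17))  # True
print(fits(["mat4"] + ["vec3"] * 18))  # False
```

This only reproduces the arithmetic seen in the examples; it does not explain why uppercase and lowercase names behave differently.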
Additionally, if you combine the vector declarations from both examples:
varying vec3 vA01;
varying vec3 vA02;
...
varying vec3 Va20;
void vertex() {
vA01 = vec3(1); // safe
vA02 = vec3(1); // will freeze
}
No matter which order you declare the cases in, the uppercase ones seem to take priority and act as if they were declared first.
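One hypothesis that would be consistent with these observations is that slots are assigned in byte-wise name order (where uppercase ASCII letters sort before lowercase ones) rather than in declaration order. This is only an inference from the behavior above and has not been checked against Godot's source. A minimal Python sketch of that hypothesis, with the hypothetical helper `overflowing_varyings` and an assumed budget of 21 slots:

```python
# Hypothetical model: slots are handed out in byte-wise name order, so names
# starting with an uppercase letter are placed before lowercase ones.
# Inferred from the behavior reported above, not taken from Godot's source.
def overflowing_varyings(varyings, budget=21):
    """varyings: list of (name, slot_count) pairs in declaration order.

    Returns the names whose slots would end past `budget`, in assignment order.
    """
    overflow = []
    next_slot = 0
    for name, count in sorted(varyings, key=lambda v: v[0]):
        next_slot += count
        if next_slot > budget:
            overflow.append(name)
    return overflow


# Combined example from above: vA01 and vA02 are declared first, but the
# 20 uppercase Va.. varyings are placed ahead of them under this model.
combined = [("vA01", 1), ("vA02", 1)] + [(f"Va{i:02d}", 1) for i in range(1, 21)]
print(overflowing_varyings(combined))  # ['vA02']
```

Under this model, vA01 lands in the 21st slot (safe) and vA02 in the 22nd (freeze), which matches the combined example. The same model also reproduces the vA18 and Va18 cases with a leading `m` or `M` matrix.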
We need to add a clear user-facing error when users exceed the number of varyings supported by user hardware.
Godot uses up to 11 varyings and reserves slots for all 11. Vulkan devices are only guaranteed to support 16 varyings (64 components / 4), but most devices support 32 (see https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxVertexOutputComponents&platform=all), except Apple devices, which seem to support one fewer.
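As a back-of-the-envelope check of these numbers, the budget left for user varyings can be computed from the device limit. The figures (4 components per varying, 11 reserved slots) are the ones quoted in this comment; `user_varying_budget` is a hypothetical helper, not an existing Godot function.

```python
# Budget arithmetic using the figures quoted above; a sketch, not engine code.
COMPONENTS_PER_SLOT = 4   # one varying slot holds up to a vec4
RESERVED_BY_ENGINE = 11   # slots Godot reserves, per the comment above


def user_varying_budget(max_vertex_output_components):
    """Slots left for user varyings, given maxVertexOutputComponents."""
    total_slots = max_vertex_output_components // COMPONENTS_PER_SLOT
    return max(total_slots - RESERVED_BY_ENGINE, 0)


print(user_varying_budget(128))  # 32 slots total -> 21 for user varyings
print(user_varying_budget(64))   # 16 slots total -> 5 for user varyings
```

The result of 21 for a limit of 128 components agrees with the 21-vector limit observed earlier in this thread. On a device that only meets the guaranteed minimum of 64 components, the same arithmetic would leave just 5.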
Interestingly, I don't get a crash on Windows 11 on the same GeForce RTX 4090 (NVIDIA 531.61), even if --gpu-validation --gpu-abort is used:
ERROR: Vulkan Debug Report: object - -2792181191434809889
Validation Error: [ VUID-RuntimeSpirv-Location-06272 ] Object 0: handle = 0xd9402e0000004ddf, type = VK_OBJECT_TYPE_SHADER_MODULE; | MessageID = 0xa3614f8b | vkCreateGraphicsPipelines(): pCreateInfos[0] Vertex shader output variable uses location that exceeds component limit VkPhysicalDeviceLimits::maxVertexOutputComponents (128) The Vulkan spec states: The sum of Location and the number of locations the variable it decorates consumes must be less than or equal to the value for the matching {ExecutionModel} defined in Shader Input and Output Locations (https://vulkan.lunarg.com/doc/view/1.3.243.0/windows/1.3-extensions/vkspec.html#VUID-RuntimeSpirv-Location-06272)
at: _debug_report_callback (drivers/vulkan/vulkan_context.cpp:300)
ERROR: Vulkan Debug Report: object - -7333046566405517856
Validation Error: [ VUID-RuntimeSpirv-Location-06272 ] Object 0: handle = 0x9a3bc90000004de0, type = VK_OBJECT_TYPE_SHADER_MODULE; | MessageID = 0xa3614f8b | vkCreateGraphicsPipelines(): pCreateInfos[0] Fragment shader input variable uses location that exceeds component limit VkPhysicalDeviceLimits::maxFragmentInputComponents (128) The Vulkan spec states: The sum of Location and the number of locations the variable it decorates consumes must be less than or equal to the value for the matching {ExecutionModel} defined in Shader Input and Output Locations (https://vulkan.lunarg.com/doc/view/1.3.243.0/windows/1.3-extensions/vkspec.html#VUID-RuntimeSpirv-Location-06272)
at: _debug_report_callback (drivers/vulkan/vulkan_context.cpp:300)
The editor doesn't freeze at all; it just continues rendering after a small fraction of a second and can still be used. This indicates that --gpu-abort may not be working correctly on Windows. I'm using Vulkan SDK 1.3.243.0.
Just flagging that this is still an issue in 4.3-dev 5 on an NVIDIA GeForce GTX 980 Ti.
To get back into the project I had to corrupt the scenes using that offending shader (by renaming it).
In my case, any more than 32 components (i.e. floats) in varyings caused the freeze.
Perhaps, if it is a complicated fix (accounting for hardware, etc.), a note in the shaders section of the documentation could help users identify the cause of the problem?
Godot version
4.0.2.stable.official
System information
Windows 10, NVIDIA GeForce RTX 2080 SUPER, Driver 516.94, Vulkan backend
Issue description
If a ShaderMaterial is being rendered when a varying vector that exceeds a declaration limit is constructed within the vertex function of its shader, the editor will freeze.
Example code:
Steps to reproduce
Minimal reproduction project
issue_just_why.zip