KhronosGroup / Vulkan-ValidationLayers

Vulkan Validation Layers (VVL)
https://vulkan.lunarg.com/doc/sdk/latest/linux/khronos_validation_layer.html

Detection of uninitialized descriptors no longer seems to work since Vulkan SDK release 1.3.280.1 #8754

Open elmar-k opened 1 month ago

elmar-k commented 1 month ago

Environment:

Dear all,

when I updated to the latest SDK 1.3.296.0, the detection of uninitialized descriptors stopped working. If I comment out the call to vkUpdateDescriptorSets and thus run my compute shaders without valid descriptors, I no longer get a validation error.

I sequentially downloaded all older SDKs, and found that 1.3.280.1 to 1.3.296.0 silently ignored the error, while 1.3.275.0 gives the proper error message:

(Validation Error: [ VUID-vkCmdDispatch-None-08114 ] Object 0: handle = 0x359e9300000000cb, type = VK_OBJECT_TYPE_DESCRIPTOR_SET; | MessageID = 0x30b6e267 | vkCmdDispatch(): the descriptor (VkDescriptorSet 0x359e9300000000cb[], binding 9, index 0) is being used in draw but has never been updated via vkUpdateDescriptorSets() or a similar call. The Vulkan spec states: Descriptors in each bound descriptor set, specified via vkCmdBindDescriptorSets, must be valid as described by descriptor validity if they are statically used by the VkPipeline bound to the pipeline bind point used by this command and the bound VkPipeline was not created with VK_PIPELINE_CREATE_DESCRIPTOR_BUFFER_BIT_EXT (https://vulkan.lunarg.com/doc/view/1.3.275.0/linux/1.3-extensions/vkspec.html#VUID-vkCmdDispatch-None-08114)

So "something" seems to have happened between 1.3.275.0 and 1.3.280.1.

Best regards, Elmar

spencer-lunarg commented 1 month ago

Sorry for making you try and grab the SDK versions

tl;dr - This is a very active (like PRs this week active) issue and it is being worked on


Questions for you

  1. Are you using Descriptor Indexing?
  2. Are you using GPU-AV or just "normal" core validation?

You can have your descriptors look like one of the following

// Style A
layout(set = 0, binding = 0) uniform sampler3D A[1];
// Style B
layout(set = 0, binding = 0) uniform sampler3D B[];
// Style C
layout(set = 0, binding = 0) uniform sampler3D C;
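
(The three styles are alternatives for the same binding, not declarations meant to coexist in one shader.) As a rough sketch of how Style B usually ends up being used, assuming the GL_EXT_nonuniform_qualifier extension for the unsized array: the push constant `textureIndex`, the `Result` buffer and binding 1 are hypothetical names for illustration only and are not taken from any real application.

```glsl
#version 450
// Needed for unsized (runtime) arrays of descriptors such as B[].
#extension GL_EXT_nonuniform_qualifier : require

layout(local_size_x = 64) in;

// Style B: the array size is not known to the shader.
layout(set = 0, binding = 0) uniform sampler3D B[];

// Hypothetical output buffer so the shader has an observable result.
layout(set = 0, binding = 1, std430) buffer Result {
    vec4 values[];
} result;

// Hypothetical push constant that selects which array element is sampled.
layout(push_constant) uniform Params {
    uint textureIndex;
} params;

void main() {
    uint i = gl_GlobalInvocationID.x;
    // Only B[textureIndex] is touched on the GPU; which element that is
    // cannot be known from the SPIR-V alone.
    result.values[i] = textureLod(B[params.textureIndex], vec3(0.5), 0.0);
}
```

Which element of B[] is actually read depends entirely on a value supplied at runtime, which is why the shader source alone does not say which descriptors need to be initialized.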

What we found is that, for people using Descriptor Indexing, it is valid to leave any of these uninitialized as long as you are not accessing them in your shader

so people are legally allowed to go

layout(set = 0, binding = 0) uniform sampler3D C;

if (always_false_condition) {
    texture(C, vec3(0.0));
}

and there is no reasonably good way on the CPU to statically detect this in the SPIR-V, so we have to defer it to GPU-AV
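
To make that concrete, here is a minimal sketch of a complete compute shader in the same spirit. The push constant `flag`, the `Result` buffer and binding 1 are made up for illustration and do not come from any real application; textureLod is used instead of texture only to keep the LOD explicit in a compute stage.

```glsl
#version 450

layout(local_size_x = 1) in;

// Statically used by the pipeline, but only sampled when flag != 0.
layout(set = 0, binding = 0) uniform sampler3D C;

// Hypothetical output buffer so the shader has an observable result.
layout(set = 0, binding = 1, std430) buffer Result {
    vec4 value;
} result;

// Hypothetical push constant; assume the application always pushes 0.
layout(push_constant) uniform Params {
    uint flag;
} params;

void main() {
    result.value = vec4(0.0);
    if (params.flag != 0u) {
        // Never reached at runtime while flag stays 0, but the branch is
        // still present in the SPIR-V, so binding 0 counts as statically used.
        result.value = textureLod(C, vec3(0.5), 0.0);
    }
}
```

If the application always pushes 0 for `flag`, binding 0 is never read on the GPU, but nothing in the SPIR-V says so; only a check that runs alongside the shader can tell the difference.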

We have found that other things (like Sync Validation) also need to know "what things were actually touched on the GPU" to determine if there was an access or not.

So with this, we have had to slowly redo GPU-AV to allow a way to quickly detect whether things are accessed or not. In the meantime, we decided to remove all false positives (to keep the integrity of the tool), at the cost of some people hitting the situation you hit.

elmar-k commented 1 month ago

Hi Spencer!

Many thanks for the quick reply. No, I'm not doing anything like descriptor indexing or using an 'always_false_condition'. Just plain standard descriptors, and if I comment out the call to vkUpdateDescriptorSets and thus run my compute shaders without valid descriptors (and producing nonsense results), I don't get an error from VK_LAYER_KHRONOS_validation.

In the next step, I'll try the same with the vkcube example app, to see if the problem is triggered by my OS or by my app. If it happened for everyone, then I guess people would have noticed it since March, when 1.3.280.1 was released.

Best regards, Elmar

spencer-lunarg commented 1 month ago

Just plain standard descriptors, and if I comment out the call to vkUpdateDescriptorSets

OK, this seems odd. We have many tests for this, so now I'm curious what is going on. Without a good way to reproduce it, I'm not sure what to say.

Is there a way you can share what the descriptor looks like in your shader, and whether you are using any flags/settings when creating your descriptor set/descriptor set layout?

Also, I assume this is using normal Pipelines (not Graphics Pipeline Libraries or Shader Objects)?

elmar-k commented 1 week ago

I'm one step further: I'm actually creating two Vulkan devices, one for graphics, and one for separate compute. If I comment out the call to vkUpdateDescriptorSets on the main graphics device, I immediately get the correct error from the validation layer, even with the latest version. But if I comment out the call to vkUpdateDescriptorSets on the compute-only device, I don't get a validation error, unless I switch back to SDK 1.3.275.0. The problem persists if I run my app in text console mode without creating any graphics device.

In short: I'm in a rare niche doing Vulkan compute without graphics, which may explain why others and your test suite haven't encountered the problem. If you want, I can e-mail my app for testing.