KhronosGroup / Vulkan-ValidationLayers

Vulkan Validation Layers (VVL)
https://vulkan.lunarg.com/doc/sdk/latest/linux/khronos_validation_layer.html

VK_KHR_copy_commands2 validation missing #8634

Closed M2-TE closed 1 day ago

M2-TE commented 2 days ago

Environment:

Describe the Issue

VK_KHR_copy_commands2 is not core under Vulkan 1.2, and there seems to be no validation check for its presence when calling vkCmdCopyBufferToImage2. I am using the Vulkan C++ bindings (vulkan-hpp) with dynamic dispatch.

The program crashes at a runtime assertion, which provides sufficient information in debug builds, so I am unsure whether a validation check would even be intended here.

Expected behavior

Validation for the presence of the VK_KHR_copy_commands2 extension when calling vkCmdBlitImage2KHR, vkCmdCopyBuffer2KHR, vkCmdCopyBufferToImage2KHR, vkCmdCopyImage2KHR, vkCmdCopyImageToBuffer2KHR, or vkCmdResolveImage2KHR, and printing an appropriate message.

Valid Usage ID

Additional context

```sh
void vk::CommandBuffer::copyBufferToImage2(const vk::CopyBufferToImageInfo2&, const Dispatch&) const [with Dispatch = vk::DispatchLoaderDynamic]: Assertion `d.vkCmdCopyBufferToImage2 && "Function <vkCmdCopyBufferToImage2> requires <VK_KHR_copy_commands2> or <VK_VERSION_1_3>"' failed.
Aborted (core dumped)
```
spencer-lunarg commented 2 days ago

So we 100% have coverage for VK_KHR_copy_commands2 and tests for it

This seems like less of an issue with the VVL and more of an issue with the call never getting into the Validation Layers.

If you are using vkCmdCopyBufferToImage2KHR, someone needs to have called vkGetDeviceProcAddr to get the pointer.

M2-TE commented 2 days ago

I am not directly interacting with vkGetDeviceProcAddr, since I use vulkan-hpp with the dynamic dispatcher, which is initialized like this:

VULKAN_HPP_DEFAULT_DISPATCHER.init();
...
VULKAN_HPP_DEFAULT_DISPATCHER.init(instance);
...
VULKAN_HPP_DEFAULT_DISPATCHER.init(device);
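
To illustrate why a dynamically loaded dispatcher can end up holding a null function pointer, here is a minimal self-contained model. It does not use the real Vulkan headers: `MockDevice`, `MiniDispatch`, `copyBufferToImage2Stub`, and `makeVulkan12Device` are hypothetical names, and `getProcAddr` only mimics the by-name lookup that vkGetDeviceProcAddr is expected to perform, returning null for names the device does not expose. The model deliberately leaves out any aliasing between the two slots that a real dispatcher might perform.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using VoidFn = void (*)();

// Placeholder for the driver's implementation of the copy command.
inline void copyBufferToImage2Stub() {}

// Hypothetical stand-in for a device: it resolves only the names it exposes.
struct MockDevice {
    std::unordered_map<std::string, VoidFn> exposed;

    // Mimics name-based lookup: unknown names resolve to nullptr.
    VoidFn getProcAddr(const std::string& name) const {
        auto it = exposed.find(name);
        return it == exposed.end() ? nullptr : it->second;
    }
};

// Hypothetical miniature dispatch table with one slot per entry-point name.
struct MiniDispatch {
    VoidFn cmdCopyBufferToImage2 = nullptr;
    VoidFn cmdCopyBufferToImage2KHR = nullptr;

    void init(const MockDevice& device) {
        cmdCopyBufferToImage2 = device.getProcAddr("vkCmdCopyBufferToImage2");
        cmdCopyBufferToImage2KHR = device.getProcAddr("vkCmdCopyBufferToImage2KHR");
    }
};

// Models a 1.2 device; the KHR name exists only if the extension was enabled.
inline MockDevice makeVulkan12Device(bool copyCommands2Enabled) {
    MockDevice device;
    if (copyCommands2Enabled) {
        device.exposed["vkCmdCopyBufferToImage2KHR"] = &copyBufferToImage2Stub;
    }
    return device;
}
```

In this model, a table initialized from `makeVulkan12Device(false)` has both slots null, so calling through either one would dereference a null pointer. That is the situation a null check on the slot is meant to catch before the call.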
spencer-lunarg commented 2 days ago

@asuessenbach you have any idea here?

spencer-lunarg commented 2 days ago

@M2-TE do you have VK_KHR_copy_commands2 enabled?

M2-TE commented 2 days ago

@spencer-lunarg Everything runs as it should with the VK_KHR_copy_commands2 extension enabled, or are you referring to a validation setting within VVL?

spencer-lunarg commented 2 days ago

So you listed the above assert

void vk::CommandBuffer::copyBufferToImage2(const vk::CopyBufferToImageInfo2&, const Dispatch&) const [with Dispatch = vk::DispatchLoaderDynamic]: Assertion `d.vkCmdCopyBufferToImage2 && "Function <vkCmdCopyBufferToImage2> requires <VK_KHR_copy_commands2> or <VK_VERSION_1_3>"' failed. Aborted (core dumped)

but this is not anything in the Validation Layers. I guess you are saying validation is missing, but I see nothing showing you even got into validation. Is there a call stack of where the crash occurs? I am curious where in validation it is crashing.

One piece of advice is to run with export VK_LOADER_DEBUG=all and see if it shows anything from the loader.

M2-TE commented 2 days ago

I should have clarified: the assert has nothing to do with the validation layers themselves, it just shows that something went wrong during the copyBufferToImage2 call when the extension is missing. I was surprised by the lack of validation output for something that is only caught via an assert in debug mode (I reckon these asserts are only present in vulkan-hpp?)

Gonna look at the call stack tomorrow and see if I can provide you with some more info there and will also look into getting some debug output from the loader

M2-TE commented 1 day ago

Tried to trace the call stack using a different GPU than before, but got the same results regardless. Here is where it crashes:

vulkan_funcs.hpp (7787-7797)

  template <typename Dispatch>
  VULKAN_HPP_INLINE void CommandBuffer::copyBufferToImage2( const VULKAN_HPP_NAMESPACE::CopyBufferToImageInfo2 & copyBufferToImageInfo,
                                                            Dispatch const &                                     d ) const VULKAN_HPP_NOEXCEPT
  {
    VULKAN_HPP_ASSERT( d.getVkHeaderVersion() == VK_HEADER_VERSION );
#  if ( VULKAN_HPP_DISPATCH_LOADER_DYNAMIC == 1 )
    VULKAN_HPP_ASSERT( d.vkCmdCopyBufferToImage2 && "Function <vkCmdCopyBufferToImage2> requires <VK_KHR_copy_commands2> or <VK_VERSION_1_3>" );
#  endif

    d.vkCmdCopyBufferToImage2( m_commandBuffer, reinterpret_cast<const VkCopyBufferToImageInfo2 *>( &copyBufferToImageInfo ) );
  }

Commenting out the second assert leads to calling vkCmdCopyBufferToImage2, which simply segfaults. The call stack contains no extra information, sadly: it goes from the application call into this, then crashes out.

Here is some of the relevant loader debug output

creating instance:

DRIVER:            Found ICD manifest file /usr/share/vulkan/icd.d/nvidia_icd.json, version 1.0.1
DEBUG | DRIVER:    Searching for ICD drivers named libGLX_nvidia.so.0
DEBUG | LAYER:     Loading layer library libVkLayer_khronos_validation.so
INFO | LAYER:      Insert instance layer "VK_LAYER_KHRONOS_validation" (libVkLayer_khronos_validation.so)
DEBUG | LAYER:     Loading layer library libVkLayer_MESA_device_select.so
INFO | LAYER:      Insert instance layer "VK_LAYER_MESA_device_select" (libVkLayer_MESA_device_select.so)
LAYER:             vkCreateInstance layer callstack setup to:
LAYER:                <Application>
LAYER:                  ||
LAYER:                <Loader>
LAYER:                  ||
LAYER:                VK_LAYER_MESA_device_select
LAYER:                        Type: Implicit
LAYER:                            Disable Env Var:  NODEVICE_SELECT
LAYER:                        Manifest: /usr/share/vulkan/implicit_layer.d/VkLayer_MESA_device_select.json
LAYER:                        Library:  libVkLayer_MESA_device_select.so
LAYER:                  ||
LAYER:                VK_LAYER_KHRONOS_validation
LAYER:                        Type: Explicit
LAYER:                        Manifest: /usr/share/vulkan/explicit_layer.d/VkLayer_khronos_validation.json
LAYER:                        Library:  libVkLayer_khronos_validation.so
LAYER:                  ||
LAYER:                <Drivers>

creating device:

INFO | LAYER:      Inserted device layer "VK_LAYER_KHRONOS_validation" (libVkLayer_khronos_validation.so)
INFO | LAYER:      Failed to find vkGetDeviceProcAddr in layer "libVkLayer_MESA_device_select.so"
DRIVER | LAYER:    vkCreateDevice layer callstack setup to:
DRIVER | LAYER:       <Application>
DRIVER | LAYER:         ||
DRIVER | LAYER:       <Loader>
DRIVER | LAYER:         ||
LAYER:                VK_LAYER_KHRONOS_validation
LAYER:                        Type: Explicit
LAYER:                        Manifest: /usr/share/vulkan/explicit_layer.d/VkLayer_khronos_validation.json
LAYER:                        Library:  libVkLayer_khronos_validation.so
LAYER:                  ||
DRIVER | LAYER:       <Device>
DRIVER | LAYER:           Using "NVIDIA GeForce RTX 4070 Ti SUPER" with driver: "libGLX_nvidia.so.0"

There is no output from the loader around the time vkCmdCopyBufferToImage2 is called.

spencer-lunarg commented 1 day ago

So there is a big difference between vkCmdCopyBufferToImage2 and vkCmdCopyBufferToImage2KHR.

When we promote extensions to core, we can just make an alias name for an enum or struct in C/C++, but function pointers are different.

If you're not creating a Vulkan 1.3 app, there is no way any 1.2 driver you might run your application against will know about the newer vkCmdCopyBufferToImage2, so you need to call

CommandBuffer::copyBufferToImage2KHR if you are using a VkApplicationInfo::apiVersion that is less than 1.3.
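
The rule above can be sketched as a small selection helper. This is only an illustration with no Vulkan headers involved: `makeApiVersion` reproduces the bit layout used by VK_MAKE_API_VERSION, while `CopyEntryPoint`, `chooseCopyEntryPoint`, and `entryPointName` are hypothetical names. It also assumes that the device actually supports the API version the application requested.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Same bit layout as VK_MAKE_API_VERSION: variant | major | minor | patch.
constexpr std::uint32_t makeApiVersion(std::uint32_t variant, std::uint32_t major,
                                       std::uint32_t minor, std::uint32_t patch) {
    return (variant << 29) | (major << 22) | (minor << 12) | patch;
}

constexpr std::uint32_t kApiVersion13 = makeApiVersion(0, 1, 3, 0);

// Hypothetical result type: which entry point the application should call.
enum class CopyEntryPoint { Core, Khr, Unavailable };

// Core name for 1.3 and newer, KHR name for older versions with the
// extension enabled, otherwise nothing is safe to call.
inline CopyEntryPoint chooseCopyEntryPoint(std::uint32_t apiVersion,
                                           bool copyCommands2Enabled) {
    if (apiVersion >= kApiVersion13) {
        return CopyEntryPoint::Core;
    }
    if (copyCommands2Enabled) {
        return CopyEntryPoint::Khr;
    }
    return CopyEntryPoint::Unavailable;
}

// Maps the choice to the name that would be requested from the device.
inline const char* entryPointName(CopyEntryPoint choice) {
    switch (choice) {
        case CopyEntryPoint::Core: return "vkCmdCopyBufferToImage2";
        case CopyEntryPoint::Khr:  return "vkCmdCopyBufferToImage2KHR";
        default:                   return "";
    }
}
```

For example, `chooseCopyEntryPoint(makeApiVersion(0, 1, 2, 0), true)` yields `CopyEntryPoint::Khr`, which corresponds to calling copyBufferToImage2KHR in vulkan-hpp.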

M2-TE commented 1 day ago

I had no idea there was an actual difference between the two, thought the KHR versions were just a typedef for compat.. thank you for the explanation!

spencer-lunarg commented 1 day ago

> thought the KHR versions were just a typedef for compat.. thank you for the explanation!

yes, but unfortunately C++ function pointers can be unique... this is a common mistake with promotions, it has been brought up before and it is hard to actually detect... but glad to know you have things working now!!