KhronosGroup / Vulkan-ValidationLayers

Vulkan Validation Layers (VVL)
https://vulkan.lunarg.com/doc/sdk/latest/linux/khronos_validation_layer.html
Other
751 stars 403 forks source link

Spec violation in vkBindImageMemory() validation with VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT #7831

Closed cgutman closed 4 months ago

cgutman commented 5 months ago

Environment:

Describe the Issue

The new validation introduced in https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/7199 regressed the UB fix made in ce2c6eff3457a331efd64da800c5ac393285cf40 and reintroduced the crashes in reported in #5649 and #5687 (though now it's now crashing during vkBindImageMemory() rather than vkCreateImage() ). AFAICT, this regression was first released in v1.3.275 and persists today in v1.3.280.

This happens because Mesa assumes that the pNext chain passed to vkGetPhysicalDeviceImageFormatProperties2() always includes includes VkPhysicalDeviceImageDrmFormatModifierInfoEXT, which is a specification requirement per VUID-VkPhysicalDeviceImageFormatInfo2-tiling-02249. The relevant Mesa code is here: https://gitlab.freedesktop.org/mesa/mesa/-/blob/24.0/src/intel/vulkan/anv_formats.c?ref_type=heads#L1239-1243

Expected behavior

Per VUID-VkPhysicalDeviceImageFormatInfo2-tiling-02249:

tiling must be VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT if and only if the pNext chain includes VkPhysicalDeviceImageDrmFormatModifierInfoEXT

Valid Usage ID

Violation of VUID-VkPhysicalDeviceImageFormatInfo2-tiling-02249 (unreported since the violation is in the validation layer itself)

Additional context

GDB output ```sh hread 18 "QThread" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffbaa006c0 (LWP 16233)] anv_get_image_format_properties (physical_device=physical_device@entry=0x7fff78e44740, info=info@entry=0x7fffba9fe040, pImageFormatProperties=pImageFormatProperties@entry=0x7fffba9fe080, pYcbcrImageFormatProperties=0x0, from_wsi=false) at ../src/intel/vulkan/anv_formats.c:1243 Downloading source file /usr/src/debug/mesa-24.0.4-1.fc40.x86_64/redhat-linux-build/../src/intel/vulkan/anv_formats.c 1243 isl_mod_info = isl_drm_modifier_get_info(vk_mod_info->drmFormatModifier); (gdb) info locals vk_mod_info = 0x0 format_feature_flags = maxExtent = maxMipLevels = maxArraySize = sampleCounts = devinfo = 0x7fff78e45968 format = 0x7fffaf612160 isl_mod_info = 0x0 format_list_info = maxResourceSize = (gdb) bt #0 anv_get_image_format_properties (physical_device=physical_device@entry=0x7fff78e44740, info=info@entry=0x7fffba9fe040, pImageFormatProperties=pImageFormatProperties@entry=0x7fffba9fe080, pYcbcrImageFormatProperties=0x0, from_wsi=false) at ../src/intel/vulkan/anv_formats.c:1243 #1 0x00007fffaee98508 in anv_GetPhysicalDeviceImageFormatProperties2 (physicalDevice=0x7fff78e44740, base_info=0x7fffba9fe040, base_props=0x7fffba9fe070) at ../src/intel/vulkan/anv_formats.c:1644 #2 0x00007fff8e2d7800 in DispatchGetPhysicalDeviceImageFormatProperties2KHR () at /usr/src/debug/vulkan-validation-layers-1.3.275.0-1.fc40.x86_64/layers/vulkan/generated/layer_chassis_dispatch.cpp:3579 #3 CoreChecks::HasExternalMemoryImportSupport () at /usr/src/debug/vulkan-validation-layers-1.3.275.0-1.fc40.x86_64/layers/core_checks/cc_device_memory.cpp:228 #4 0x00007fff8e30f096 in CoreChecks::ValidateBindImageMemory () at /usr/src/debug/vulkan-validation-layers-1.3.275.0-1.fc40.x86_64/layers/core_checks/cc_device_memory.cpp:1540 #5 0x00007fff8e3115ac in CoreChecks::PreCallValidateBindImageMemory () at /usr/src/debug/vulkan-validation-layers-1.3.275.0-1.fc40.x86_64/layers/core_checks/cc_device_memory.cpp:1747 #6 0x00007fff8e86ddb3 in vulkan_layer_chassis::BindImageMemory () at /usr/src/debug/vulkan-validation-layers-1.3.275.0-1.fc40.x86_64/layers/vulkan/generated/chassis.cpp:1578 #7 0x00007ffff773c9c5 in vk_tex_create (gpu=0x7fff794e3e90, params=0x7fffba9feac0) at ../src/vulkan/gpu_tex.c:520 #8 0x00000000004b9bb5 in pl_map_avframe_drm (gpu=0x7fff794e3e90, out=0x7fffba9feef0, frame=0x7fff79949a80) at /usr/include/libplacebo/utils/libav_internal.h:999 #9 0x00000000004b9d03 in pl_map_avframe_derived (gpu=0x7fff794e3e90, out=0x7fffba9feef0, frame=0x7fff796d2f00) at /usr/include/libplacebo/utils/libav_internal.h:1037 #10 0x00000000004ba62b in pl_map_avframe_ex (gpu=0x7fff794e3e90, out=0x7fffba9feef0, params=0x7fffba9feeb0) at /usr/include/libplacebo/utils/libav_internal.h:1219 ```
spencer-lunarg commented 5 months ago

Thanks for digging up all this information, will take a look today/tomorrow

I realize we really should have added a VVL regression test when acting the DRM change before, but back then I didn't have enough knowledge of DRM Format Modifier or a Mesa build to test with

spencer-lunarg commented 4 months ago

@cgutman sorry for the delay

spent some time on this and I think the issue might be that Mesa and Spec don't match up, the spec VU is

tiling must be VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT if and only if the pNext chain includes VkPhysicalDeviceImageDrmFormatModifierInfoEXT

but not that if tiling is DRM_FORMAT_MODIFIER, a pNext struct must be there

spencer-lunarg commented 4 months ago

also it seems the fix was already patched in https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/7318/files

spencer-lunarg commented 4 months ago

closing as I'm still convinced https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull/7318 fixed this, happy to reopen and confirmed it did not (but I can't reproduce this issue with ToT)