Closed cnorthrop closed 5 months ago
We don't think that VUID should be checked without support for VK_EXT_swapchain_maintenance1
I guess from a spec perspective why is this the case? The VU was written and checked before VK_EXT_swapchain_maintenance1
was added, how would not having the extension supported effect the results of a vkGetPhysicalDeviceSurfaceCapabilitiesKHR
query?
We had an internal discussion recently about this minImageCount = 5
, and the final conclusion from Google engineer was "Yes, this does look like ANGLE bug".
Also there might be a question to the driver, since minImageCount=5 does not look like a practical limit. Recently I encountered this comment in the Granite vulkan renderer: https://github.com/Themaister/Granite/blob/495d8d2b78e39d8c8283447d2f68f0ad9ee297ee/vulkan/wsi.cpp#L1317
Per https://ci.chromium.org/ui/p/angle/builders/try/linux-exp-test/5/infra you should also be able to reproduce this on Linux Intel UHD 770 GPU with Mesa 23.2.1 and angle_perftests. E.g. [ RUN ] BlitFramebufferPerf.Run/vulkan_color_2_samples ../../src/tests/perf_tests/ANGLEPerfTest.h:131: Failure Failed Failing test because of unexpected error: vk_renderer.cpp:824 (DebugUtilsMessenger): [ VUID-VkSwapchainCreateInfoKHR-presentMode-02839 ] Validation Error: [ VUID-VkSwapchainCreateInfoKHR-presentMode-02839 ] | MessageID = 0xfb70ccdb | vkCreateSwapchainKHR(): pCreateInfo->minImageCount 3, which is outside the bounds returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR() (i.e. minImageCount = 5, maxImageCount = 0). The Vulkan spec states: If presentMode is not VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR nor VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR, then minImageCount must be greater than or equal to the value returned in the minImageCount member of the VkSurfaceCapabilitiesKHR structure returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR for the surface (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-VkSwapchainCreateInfoKHR-presentMode-02839)
Command is ./src/tests/run_perf_tests.py --isolated-script-test-output=output.json --isolated-script-test-perf-output=perftest-output.json --log=debug --smoke-test-mode --show-test-stdout
We had an internal discussion recently about this minImageCount = 5, and the final conclusion from Google engineer was "Yes, this does look like ANGLE bug".
There are two separate issues:
VK_EXT_swapchain_maintenance1
is enabled, there was an ANGLE bug where we didn't query the minImageCount per-present-modeVK_EXT_swapchain_maintenance1
is disabled, there is a VVL bug where it expects the application to conform with per-present-mode limits, but the application may not know about that extension at allWe fixed the former issue in ANGLE. The latter is what this issue is about.
@cnorthrop @ShabbyX I created https://gitlab.khronos.org/vulkan/vulkan/-/merge_requests/6640
Talked to Artem, and I am still not sure as of right now where in the spec it says this VU matters for VK_EXT_swapchain_maintenance1
update, there is a chance I just really don't understand WSI at all and there is no need for a spec issue, @artem-lunarg is going to try and reproduce this and take it over for me
The spec has a note that hopefully makes things clearer here (see "Note"): https://docs.vulkan.org/spec/latest/chapters/VK_KHR_surface/wsi.html#VkSurfacePresentModeEXT
Just noticed something in our internal chat. The device where this is happening supports VK_EXT_surface_maintenance1, but not VK_EXT_swapchain_maintenance1. That's rather odd given how closely these two are related. So I just checked and ANGLE does enable surface-maint1 in this case... I have to think about whether we could conform to the per-present minImageCount or not (and whether the VU change you are making Spencer should be based on surface- or swapchain-maint1)
If the minImageCount part is independent of swapchain-maint1, then this was probably still an ANGLE bug after all :disappointed:
@ShabbyX The way I understood the original error message (with minCount = 5), is that large min count suggests that conservative computation took place, and that can happen when VVL works with compatibility group. If the application did not ask for VK_EXT_swapchain_maintenance1, then I guess the expectation that code path should not be taken by VVL?
We discussed yesterday with Spencer in which cases VVL can run that check. We found one VVL code path that uses pNext from VK_EXT_swapchain_maintenance1 without directly checking if that extension is enabled (in AcquireNextImage). But in that case (when pNext contains structure from VK_EXT_swapchain_maintenance1 and this extension is not enabled), the VVL stateless validation should complain earlier, so in the end this excluded the only suspicious case.
The way I understood the original error message (with minCount = 5), is that large min count suggests that conservative computation took place, and that can happen when VVL works with compatibility group.
I haven't seen the code, so I don't know. My guess was that VVL made a per-present-mode query (a surface-maint1 feature) to get the more exact minCount, while ANGLE made the generic query and got the inexact minCount. The Note under https://docs.vulkan.org/spec/latest/chapters/VK_KHR_surface/wsi.html#VkSurfacePresentModeEXT explains how the generic one may produce a smaller count than the per-present-mode one.
I haven't got a chance to dig into it yet, but my growing suspicion is that VVL was right to do that because ANGLE did enable surface_maint1, but didn't actually use it because swapchain_maint1 was missing.
Yes I can confirm this was an ANGLE bug, and adjusting the minImageCount if surface_maint1 is exposed (even if swapchain_maint1 isn't) is both doable and satisfies VVL.
Sorry we didn't notice this earlier.
@ShabbyX thanks for checking this. I'm going to cancel my plans for debugging session.
My guess was that VVL made a per-present-mode query (a surface-maint1 feature) to get the more exact minCount
Yes, this part:
VkSurfacePresentModeEXT present_mode_info = vku::InitStructHelper();
if (IsExtEnabled(device_extensions.vk_ext_surface_maintenance1)) {
present_mode_info.presentMode = create_info.presentMode;
present_mode_info.pNext = surface_info_pnext;
surface_info_pnext = &present_mode_info;
}
const auto surface_caps = surface_state->GetSurfaceCapabilities(physical_device_state->VkHandle(), surface_info_pnext);
I was confused about that large minCount value of 5 (and it's retrieved by the VVL, not app). I thought VVL does some generic query (because how can you get 5 from swapchain instead of 2 or 3). But the above code shows we do specific query due to enabled vk_ext_surface_maintenance1
. It has to be it's just a "lucky" present mode which requires a lot of images and the exact query probably does not improve much on the conservative generic case (conservative case is a maximum of specific cases, so there should be a specific case that matches a conservative one).
Environment:
Describe the Issue
We're seeing the following VVL error on platforms that don't support VK_EXT_swapchain_maintenance1:
vk_renderer.cpp:824 (DebugUtilsMessenger): [ VUID-VkSwapchainCreateInfoKHR-presentMode-02839 ] Validation Error: [ VUID-VkSwapchainCreateInfoKHR-presentMode-02839 ] | MessageID = 0xfb70ccdb | vkCreateSwapchainKHR(): pCreateInfo->minImageCount 3, which is outside the bounds returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR() (i.e. minImageCount = 5, maxImageCount = 0). The Vulkan spec states: If presentMode is not VK_PRESENT_MODE_SHARED_DEMAND_REFRESH_KHR nor VK_PRESENT_MODE_SHARED_CONTINUOUS_REFRESH_KHR, then minImageCount must be greater than or equal to the value returned in the minImageCount member of the VkSurfaceCapabilitiesKHR structure returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR for the surface (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-VkSwapchainCreateInfoKHR-presentMode-02839)
Expected behavior
We don't think that VUID should be checked without support for
VK_EXT_swapchain_maintenance1
Repro
If you have a matching system, you can repro with any of ANGLE's trace tests.
Additionally we've been able to repro on AMD based systems by disabling the extension using the following: