PacktPublishing / Mastering-Graphics-Programming-with-Vulkan

MIT License

Chapters 4+ generate validation errors if multiple discrete GPUs installed #19

Closed sjbaines closed 1 year ago

sjbaines commented 1 year ago

Chapter 3 always selects the first GPU device. Chapter 4 onwards selects the LAST discrete GPU device, or the LAST integrated GPU device if no discrete. On my machine I have 2 GTX 1080s, so this code selects device 0 for chapter 3, and device 1 for all later chapters. There are no validation warnings for the chapter 3 code.

For chapter 4, the call to vkGetPhysicalDeviceSurfaceFormatsKHR gives:

MessageID: VUID-vkGetPhysicalDeviceSurfaceFormatsKHR-surface-06211 1886124171
Message: Validation Error: [ VUID-vkGetPhysicalDeviceSurfaceFormatsKHR-surface-06211 ] Object 0: handle = 0x1a69c35dbd0, type = VK_OBJECT_TYPE_PHYSICAL_DEVICE; | MessageID = 0x706bf88b | vkGetPhysicalDeviceSurfaceFormatsKHR(): surface is not supported by the physicalDevice. The Vulkan spec states: surface must be supported by physicalDevice, as reported by vkGetPhysicalDeviceSurfaceSupportKHR or an equivalent platform-specific mechanism (https://vulkan.lunarg.com/doc/view/1.2.198.1/windows/1.2-extensions/vkspec.html#VUID-vkGetPhysicalDeviceSurfaceFormatsKHR-surface-06211)

And every frame yields:

MessageID: VUID-vkQueuePresentKHR-pSwapchains-01292 -193100549
Message: Validation Error: [ VUID-vkQueuePresentKHR-pSwapchains-01292 ] Object 0: handle = 0x628d6100000000f7, type = VK_OBJECT_TYPE_SWAPCHAIN_KHR; | MessageID = 0xf47d84fb | vkQueuePresentKHR: Presenting pSwapchains[0] image on queue that cannot present to this surface. The Vulkan spec states: Each element of pSwapchains member of pPresentInfo must be a swapchain that is created for a surface for which presentation is supported from queue as determined using a call to vkGetPhysicalDeviceSurfaceSupportKHR (https://vulkan.lunarg.com/doc/view/1.2.198.1/windows/1.2-extensions/vkspec.html#VUID-vkQueuePresentKHR-pSwapchains-01292)
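A side note on reading these logs: each message carries its MessageID twice, once in decimal and once in hexadecimal, and the decimal form appears to be the same 32-bit value printed as a signed integer. The sketch below only checks that arithmetic for the two IDs quoted above; the helper name `as_signed` is made up for illustration and is not part of Vulkan or the validation layers.

```cpp
#include <cstdint>

// Hypothetical helper (not a Vulkan API): reinterpret a 32-bit message ID as
// the signed two's-complement value, using 64-bit arithmetic so the result
// does not depend on implementation-defined narrowing.
constexpr std::int32_t as_signed(std::uint32_t id) {
    return id >= 0x80000000u
               ? static_cast<std::int32_t>(static_cast<std::int64_t>(id) - 4294967296LL)
               : static_cast<std::int32_t>(id);
}

// The two IDs from the messages in this report.
static_assert(as_signed(0x706bf88bu) == 1886124171, "surface-06211 ID");
static_assert(as_signed(0xf47d84fbu) == -193100549, "pSwapchains-01292 ID");
```

So the number after the second VUID is a negative value, not a further dash-separated field of the VUID.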

If GpuDevice::init is modified to select the first discrete device, these errors go away. I don't know what changes are needed to make this work when a device other than the first discrete one is selected.

sjbaines commented 1 year ago

Looking through the closed issues, this is related to, but different from, #8

theWatchmen commented 1 year ago

Thanks for reporting this. One possible fix would be to break after we find the first discrete GPU when we loop over all available devices. Could you test this on your side to see if it works? I don't have a system with multiple GPUs so I am not able to test that this would fix the problem.
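A minimal sketch of the "break after the first discrete GPU" idea, assuming the device list has already been reduced to a list of device types. `DeviceType` and `pick_first_discrete` are hypothetical names used only for illustration; they are not the repository's code or the real `VkPhysicalDeviceType` enum.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the device type reported by the driver.
enum class DeviceType { Integrated, Discrete, Other };

// Returns the index of the FIRST discrete device. If there is none, falls
// back to the first integrated device; returns nothing if neither exists.
std::optional<std::size_t> pick_first_discrete(const std::vector<DeviceType>& devices) {
    std::optional<std::size_t> first_integrated;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (devices[i] == DeviceType::Discrete) {
            return i;  // stop here instead of letting a later device overwrite the choice
        }
        if (devices[i] == DeviceType::Integrated && !first_integrated) {
            first_integrated = i;
        }
    }
    return first_integrated;
}
```

With two discrete GPUs this returns index 0, matching the device that chapter 3 selects. It does not check that the chosen device can actually present to the surface.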

sjbaines commented 1 year ago

I've submitted a pull request. I don't normally use Git so hope I haven't messed it up. If you are happy with the changes then they also need applying to all later chapters, but I wanted to keep this minimal.

theWatchmen commented 1 year ago

Hi Steve, thanks for the PR and apologies for the late reply. We'll take a look at this soon, we are preparing for a conference at the moment :)

sjbaines commented 1 year ago

No problem. I've just added a second commit which fixes this properly for chapter 15 - rather than just selecting the first device, it selects the first device that is verified as compatible with the display surface.
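The idea of picking the first surface-compatible device can be sketched roughly as follows. This is not the code from the pull request: the `QueueFamily`, `Device`, and `Selection` structs and both functions are hypothetical, and the boolean flags stand in for what real code would obtain from vkGetPhysicalDeviceQueueFamilyProperties and vkGetPhysicalDeviceSurfaceSupportKHR. Preferring discrete over integrated devices is an assumption carried over from the earlier selection behaviour.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of one queue family on a device.
struct QueueFamily {
    bool graphics;     // family supports graphics work
    bool can_present;  // family can present to the target surface
};

// Hypothetical model of one physical device.
struct Device {
    bool discrete;
    std::vector<QueueFamily> families;
};

struct Selection {
    std::size_t device;
    std::size_t family;
};

// First queue family that supports both graphics and presentation, if any.
std::optional<std::size_t> find_present_family(const Device& d) {
    for (std::size_t f = 0; f < d.families.size(); ++f) {
        if (d.families[f].graphics && d.families[f].can_present) {
            return f;
        }
    }
    return std::nullopt;
}

// First compatible discrete device; otherwise the first compatible device of
// any other kind; otherwise nothing.
std::optional<Selection> select_device(const std::vector<Device>& devices) {
    std::optional<Selection> fallback;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        const std::optional<std::size_t> family = find_present_family(devices[i]);
        if (!family) {
            continue;  // device cannot present to this surface: skip it
        }
        if (devices[i].discrete) {
            return Selection{i, *family};
        }
        if (!fallback) {
            fallback = Selection{i, *family};
        }
    }
    return fallback;
}
```

Modelling the twin-GPU machine from this report as two discrete devices where only device 0 has a presentable graphics family, `select_device` returns device 0. If the flags are swapped it returns device 1 rather than assuming the first device is always the right one.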

theWatchmen commented 1 year ago

Hey Steve, apologies for the late reply. I finally managed to get a look at this. I pushed an implementation for chapter 1 that integrates the changes you proposed in the PR.

Once you confirm it works, I'll port the changes to the other chapters.

theWatchmen commented 1 year ago

@sjbaines Did you get a chance to look at the changes I mentioned above? Appreciate you might be busy, so no worries if you haven't got to it yet :) I might close this ticket if we don't get further updates though.

sjbaines commented 1 year ago

Hi, sorry for the slow response. I've just tested this, and it seems to work correctly. On my twin GTX 1080 setup, GpuDevice::get_family_queue identifies device 0 as compatible with the display surface, and device 1 as not compatible, and so selects device 0. This then runs without any validation errors. If I force selection of device 1 in the debugger (which get_family_queue correctly identifies as NOT compatible), then the code runs, but with per-frame validation errors. So, the device selection code is correctly identifying which devices are/are not compatible, and the reported issue is fixed.

theWatchmen commented 1 year ago

No worries and thanks for getting back to us! I ported the changes to all the other chapters and I am going to close this issue.