Closed by lmichaelis 1 month ago
Hi, @lmichaelis!
This is actually quite a horrible bug to deal with...
So, basically, `VK_ERROR_OUT_OF_DATE_KHR` means that:
> A surface has changed in such a way that it is no longer compatible with the swapchain, and further presentation requests using the swapchain will fail. Applications must query the new surface properties and recreate their swapchain if they wish to continue presenting to the surface.
This case is not possible on Windows (due to how WSI works there), but on X11, where a window can change state asynchronously from what the UI thread observes, this is apparently the case. Basically, the engine has to retry swapchain creation until a different error code is received.
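The retry idea can be sketched without any Vulkan dependency. This is only an illustration, not the Tempest implementation: `Result`, `CreateOutcome`, `createWithRetry` and the `maxAttempts` cap are hypothetical names introduced here, and `tryCreate` stands in for whatever actually builds the swapchain and acquires the first image.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for the relevant VkResult values; not Vulkan types.
enum class Result { Success, OutOfDate, DeviceLost };

struct CreateOutcome {
  Result   result;   // last result reported by tryCreate
  uint32_t attempts; // how many times tryCreate was called
};

// Calls tryCreate again while it reports OutOfDate, at most maxAttempts times.
// Any other result (success or a different error) ends the loop.
// The cap is an assumption of this sketch, added so the loop always terminates.
CreateOutcome createWithRetry(const std::function<Result()>& tryCreate,
                              uint32_t maxAttempts) {
  Result   r       = Result::OutOfDate;
  uint32_t attempt = 0;
  while(attempt<maxAttempts) {
    ++attempt;
    r = tryCreate();
    if(r!=Result::OutOfDate)
      break;
  }
  return {r, attempt};
}
```

With a callable that reports `OutOfDate` twice and then succeeds, `createWithRetry(flaky, 8)` returns `Success` after 3 attempts; a `DeviceLost` result stops the loop on the first attempt.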
> Whatever the case, it shouldn't hang indefinitely when it fails swapchain creation.
Apparently `vkAcquireNextImageKHR` failed to finish, but still left the fence in a waiting state, while the associated work was never issued.
> The Vulkan driver should not be the problem, since `vkcube` runs without issue
That should not be representative. AFAIR vkcube, like most educational Vulkan apps, does ad-hoc initialization of the window/swapchain. Application code is way more complex (also thx AMD here for unordered acquire - this is why the semaphore/fence spaghetti is around :) )
> I do have an Optimus capable laptop with an integrated and a dedicated GPU. Maybe Tempest is selecting the wrong one?
Should not matter - Tempest doesn't do auto-select, the game does. OpenGothic gives priority to the dedicated GPU (see main.cpp).
> "DRM kernel driver 'nvidia-drm' in use. NVK requires nouveau."
Not familiar with that message; Google says that it is related to the open-source driver, not the proprietary one.
Did `vkAcquireNextImageKHR` return something to `&id`? If so, maybe the error code can be ignored in the case of the constructor.
> Did `vkAcquireNextImageKHR` return something to `&id`? If so, maybe error code can be ignored in case of constructor.
Yes indeed it returns a value like this: 4294967295
`4294967295`, aka `uint32_t(-1)`, is not a valid value. The correct result should be in the range `0..2` - this is the ID of the back-buffer image.
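A small check makes the point concrete: `uint32_t(-1)` is just the sentinel the variable was initialized with, so seeing it after the call means nothing was written to `&id`. The names `kNoImage` and `isValidImageIndex` are hypothetical helpers for illustration, and the image count of 3 is only inferred from the `0..2` range mentioned above.

```cpp
#include <cstdint>
#include <limits>

// Sentinel used before the acquire call: all bits set, i.e. 4294967295.
constexpr uint32_t kNoImage = std::numeric_limits<uint32_t>::max();

// Converting -1 to uint32_t wraps around to 2^32-1.
static_assert(uint32_t(-1) == 4294967295u, "uint32_t(-1) is 4294967295");

// True only if id is a real back-buffer index for a swapchain
// that owns imageCount images (valid indices are 0..imageCount-1).
constexpr bool isValidImageIndex(uint32_t id, uint32_t imageCount) {
  return id < imageCount;
}
```

For a swapchain with 3 images, `isValidImageIndex(2, 3)` holds while `isValidImageIndex(kNoImage, 3)` does not.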
Today I actually realized that I was slightly wrong about the reason for the hang:
```cpp
vkWaitForFences(device.device.impl,1,&f,VK_TRUE,std::numeric_limits<uint64_t>::max()); // wait for any already issued workload
vkResetFences(device.device.impl,1,&f); // clear VkFence to non-signaled state
uint32_t id = uint32_t(-1);
VkResult code = vkAcquireNextImageKHR(device.device.impl,
                                      swapChain,
                                      std::numeric_limits<uint64_t>::max(),
                                      slot.acquire,
                                      f, // fence should be in pending state after a successful acquire
                                      &id);
if(ignoreSuboptimal && code==VK_SUBOPTIMAL_KHR)
  code = VK_SUCCESS;
```
Basically, in case of an error code the fence goes into an inconsistent state, where it waits for something that was never issued.
I've pushed my solution to the hang. Unfortunately, there are no good options in vanilla Vulkan, only to recreate the fence entirely.
Can you please check if it helps with the hang? Thanks!
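The fence problem described above can be illustrated with a toy state model. This is not Vulkan code and not the actual fix: `FenceState`, `AcquireResult`, `Fence` and `acquireWithFenceRecovery` are hypothetical names, and replacing the fence with a fresh one in the signaled state is an assumption made here so that the next wait returns immediately.

```cpp
// Toy model only: the three fence states that matter for the hang.
enum class FenceState { Signaled, Unsignaled, Pending };
enum class AcquireResult { Success, OutOfDate };

struct Fence {
  FenceState state = FenceState::Signaled; // a fresh fence starts signaled (assumption)
};

// Mirrors the wait/reset/acquire sequence from the snippet above.
// acquire is any callable returning AcquireResult.
template<class Acquire>
AcquireResult acquireWithFenceRecovery(Fence& f, Acquire acquire) {
  // Stands in for vkWaitForFences + vkResetFences.
  f.state = FenceState::Unsignaled;

  const AcquireResult code = acquire();
  if(code==AcquireResult::Success) {
    // Work was issued, so the fence will be signaled later.
    f.state = FenceState::Pending;
    return code;
  }

  // Failure: nothing was issued, so the reset fence would never be signaled
  // and the next wait would block forever. Replace it with a fresh fence.
  f = Fence{};
  return code;
}
```

After a failed acquire the model's fence is back in the signaled state, so a following wait cannot block; after a successful acquire it is pending, as the comment in the original snippet expects.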
Yes it does, thanks!
In 2733ed2 I've added a swapchain creation loop that should be able to handle X11 shenanigans.
I had to wrap `vkAcquireNextImage` purely for the sake of debugging - unfortunately, this was the only way for me to reproduce the issue on Windows.
Amazing! Now the crash is gone and OpenGothic now works correctly. Thank you!
Out of curiosity: how many attempts does it take for the engine to allocate the swapchain? Relevant code:
```cpp
void VSwapchain::createSwapchain(VDevice& device) {
  for(uint32_t attempt=0; ; ++attempt) {
```
> Out of curiosity: how many attempts does it take for the engine to allocate the swapchain?
The function gets called multiple times, and during the first call it takes between 1 and 4 attempts to perform the operation successfully (i.e. to reach the `break`). Subsequent calls always acquire the swapchain immediately. How many attempts it takes depends on how fast the code runs: when I debug it, it takes 1 to 2 attempts, but when I `printf` the result instead, it takes up to 4.
Interesting too is this: If I change the GPU selection to "integrated only", the swapchain can always be created immediately.
Heyo, I've been having problems running OpenGothic on Fedora 40 with an NVIDIA GeForce RTX 3060 Laptop GPU (proprietary drivers, version 550.78). On startup, swapchain creation
https://github.com/Try/Tempest/blob/e85b2926e170e3290d9a56dd9f10dd32ac5b8cf0/Engine/gapi/vulkan/vswapchain.cpp#L62
fails here,
https://github.com/Try/Tempest/blob/e85b2926e170e3290d9a56dd9f10dd32ac5b8cf0/Engine/gapi/vulkan/vswapchain.cpp#L309-L310
throwing a `DeviceLostException` which is caught here,
https://github.com/Try/Tempest/blob/e85b2926e170e3290d9a56dd9f10dd32ac5b8cf0/Engine/gapi/vulkan/vswapchain.cpp#L64-L66
leading to swapchain cleanup. It then tries to wait for a fence here,
https://github.com/Try/Tempest/blob/e85b2926e170e3290d9a56dd9f10dd32ac5b8cf0/Engine/gapi/vulkan/vswapchain.cpp#L76
which hangs indefinitely. Now, I don't know why it would fail during initialization in the first place, but here are some ideas:
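- The Vulkan driver should not be the problem, since `vkcube` runs without issue
- I do have an Optimus capable laptop with an integrated and a dedicated GPU. Maybe Tempest is selecting the wrong one?
- "DRM kernel driver 'nvidia-drm' in use. NVK requires nouveau.": since `vkcube` works while getting the same message, this seems unrelated (?)

Whatever the case, it shouldn't hang indefinitely when it fails swapchain creation.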