Open ricejasonf opened 1 month ago
Hi, the "infinite" timeout is there for a purpose: it would be invalid to go further while the GPU has not finished the frame we are waiting on. Could you make sure that you have updated graphics drivers?
I did a full update and verified I have the latest driver, and I was able to get it to freeze again immediately (loading scripts under "Content").
local/nvidia 550.78-7
NVIDIA drivers for linux
@ricejasonf Wicked recently updated the dxcompiler to the May version, and that seems to be broken on Linux (#856) and caused all kinds of weird issues on various graphics drivers. It has been reverted to the previous version, can you update to master and give it another try?
Sorry, but the problem still persists. It does not happen every time, but it still definitely freezes when loading a script.
Did you delete the shaders/spirv directory just to make sure no compiled shaders from the dxcompiler remain?
I deleted the entire build directory. If that is where they are located, then yes. (I am on the Discord if that is easier for back and forth stuff.)
I can confirm that it is in fact getting stuck in that vkWaitForFences call. Consider the following small alteration to the point of interest:
7247 while (true) {
7248 res = vkWaitForFences(device, 1, &frame_fence[bufferindex][queue],
7249 VK_TRUE, uint64_t{10000000000});
7250 if (res == VK_SUCCESS) break;
7251 assert(res == VK_SUCCESS);
7252 }
Attempting to reproduce the error results in hitting the assert after 10 seconds of blank screen.
WickedEngineEditor: /home/jason/Projects/WickedEngine/WickedEngine/wiGraphicsDevice_Vulkan.cpp:7251: virtual void wi::graphics::GraphicsDevice_Vulkan::SubmitCommandLists(): Assertion `res == VK_SUCCESS' failed.
Aborted (core dumped)
It would be nice to find the bug, but I think there is also an opportunity for graceful error handling here.
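As a rough illustration of what graceful handling could look like, here is a minimal sketch of a bounded wait: poll in short slices, keep looping only on a timeout, and hand back a status instead of asserting. Everything named here (WaitStatus, wait_bounded, the wait_once callable) is hypothetical and not part of Wicked Engine or Vulkan. In real code, wait_once would presumably wrap vkWaitForFences with a short timeout and map VK_SUCCESS, VK_TIMEOUT, and any error result onto the three states.

```cpp
#include <cstdint>

// Hypothetical status type standing in for VkResult in this sketch.
enum class WaitStatus { Success, Timeout, Error };

// Calls wait_once(slice_ns) until it reports something other than Timeout,
// or until max_slices consecutive slices have timed out.
// wait_once is an assumed callable: (uint64_t timeout_ns) -> WaitStatus.
template <typename WaitOnce>
WaitStatus wait_bounded(WaitOnce&& wait_once, std::uint64_t slice_ns, int max_slices)
{
    for (int i = 0; i < max_slices; ++i) {
        const WaitStatus status = wait_once(slice_ns);
        if (status != WaitStatus::Timeout) {
            return status; // finished, or failed with a real error
        }
        // A timed-out slice is a natural place to log or pump OS events,
        // so the process stays responsive while the GPU is still busy.
    }
    return WaitStatus::Timeout; // caller decides: keep waiting, report, or abort
}
```

With a 100 ms slice and 100 slices this gives the same 10 second budget as the experiment above, but the caller receives a Timeout result it can report instead of a hang or an abort. This does not make it valid to proceed while the GPU is still working on the frame; it only gives the application a chance to react.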
I realized that this is a duplicate of #804.
Can you confirm that the hang always happens when queue is 3 (QUEUE_VIDEO_DECODE)? And never with any other value?
I tried it several times and the value for queue was consistently 3. So, yes, that looks like the enum value for QUEUE_VIDEO_DECODE as you stated.
When resizing the widget window for the entity component system, I can reproduce this very quickly just wagging it back and forth. Still always queue == 3
Hi, I am not certain that this is related to Linux specifically, but when I load different "content" scripts in the editor, sometimes the application hangs and sometimes it won't even respond to signals (i.e. I have to kill -9 the process). I tried it in debug mode and found the problem point. The call to vkWaitForFences hangs. I am new to this API (and modern graphics in general), but I see that the timeout is very large. Is this the right way to handle "CPU stalling"? I think at least this could loop on VK_TIMEOUT and use a reasonably small timeout (from what I have been googling). Also, here is the call stack from when I was able to stop the process:
I will play with this more next week, but I thought I would wait for some feedback on the intent with the large timeout.
Thanks.
EDIT: It occurred to me that maybe it is stuck in some loop and it just happens to always break while the process is waiting on that line (7247).