Error callback is only triggered once program exits

1danielcoelho commented 1 year ago

Hello!

I'm playing around with basic C++ native WebGPU apps using Dawn (and Vulkan backend) following your guide, and I've set the error callbacks like this:

wgpuDeviceSetUncapturedErrorCallback(Globals::device, Impl::on_device_error, nullptr);

With a callback that looks like this:

void on_device_error(WGPUErrorType type, char const* message, void* pUserData)
{
    std::cout << "Uncaptured device error: type " << magic_enum::enum_name<WGPUErrorType>(type);
    if (message)
    {
        std::cout << " (" << message << ")";
    }
    std::cout << std::endl;
}

So basically exactly like it says here: https://eliemichel.github.io/LearnWebGPU/getting-started/the-device.html

From the text, I get the impression that these callbacks should fire as soon as there's any form of error, and I should be able to see them on the console while my app runs.

Unfortunately, the error messages only seem to be flushed, and on_device_error only called, when I close the app, which is very annoying. I'd like to see the errors streaming on the console as they happen, while my app continues to run.

This happens even if I launch with a debugger and put a breakpoint within on_device_error, so it is likely something related to Dawn itself because other regular std::cout calls work fine and output as my app runs.

I've tried setting the DAWN_DEBUG_BREAK_ON_ERROR environment variable, but it doesn't seem to have any effect (and I can't even find that env var mentioned in the Dawn source?)

I've had a look at the callstack when on_device_error is called (after my app exits) and it looks like this:

app.exe!Impl::on_device_error(WGPUErrorType type, const char * message, void * pUserData) Line 324 (e:\Projects\test\src\renderer\renderer.cpp:324)
app.exe!dawn::native::DeviceBase::HandleError::__l23::<lambda>() Line 571 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-src\src\dawn\native\Device.cpp:571)
app.exe!std::invoke<void <lambda>(void) &>(dawn::native::DeviceBase::HandleError::__l23::void <lambda>(void) & _Obj) Line 1565 (c:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include\type_traits:1565)
app.exe!std::_Invoker_ret<void>::_Call<void <lambda>(void) &>(dawn::native::DeviceBase::HandleError::__l23::void <lambda>(void) & _Func) Line 674 (c:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include\functional:674)
app.exe!std::_Func_impl_no_alloc<void <lambda>(void),void>::_Do_call() Line 834 (c:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include\functional:834)
app.exe!std::_Func_class<void>::operator()() Line 875 (c:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include\functional:875)
app.exe!dawn::native::`anonymous namespace'::GenericFunctionTask::HandleShutDownImpl() Line 30 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-src\src\dawn\native\CallbackTaskManager.cpp:30)
app.exe!dawn::native::CallbackTask::Execute() Line 44 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-src\src\dawn\native\CallbackTaskManager.cpp:44)
app.exe!dawn::native::CallbackTaskManager::Flush() Line 114 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-src\src\dawn\native\CallbackTaskManager.cpp:114)
app.exe!dawn::native::DeviceBase::FlushCallbackTaskQueue() Line 1854 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-src\src\dawn\native\Device.cpp:1854)
app.exe!dawn::native::DeviceBase::WillDropLastExternalRef() Line 325 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-src\src\dawn\native\Device.cpp:325)
app.exe!dawn::native::RefCountedWithExternalCount::APIRelease() Line 28 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-src\src\dawn\native\RefCountedWithExternalCount.cpp:28)
app.exe!dawn::native::NativeDeviceRelease(WGPUDeviceImpl * cSelf) Line 1069 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-build\gen\src\dawn\native\ProcTable.cpp:1069)
app.exe!wgpuDeviceRelease(WGPUDeviceImpl * device) Line 342 (e:\Projects\test\third_party\dawn\build\Win64\Debug\_deps\dawn-build\gen\src\dawn\dawn_proc.c:342)
app.exe!Renderer::release_renderer() Line 1221 (e:\Projects\test\src\renderer\renderer.cpp:1221)
app.exe!main(int __formal, char * * __formal) Line 294 (e:\Projects\test\src\main.cpp:294)
app.exe!invoke_main() Line 79 (d:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:79)
app.exe!__scrt_common_main_seh() Line 288 (d:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288)
app.exe!__scrt_common_main() Line 331 (d:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:331)
app.exe!mainCRTStartup(void * __formal) Line 17 (d:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp:17)

But I'm not sure if this is meant to be right or not.

Any ideas? It is supposed to be calling on_device_error in real time, right?

Another thing that could be related: in the following step, the guide has us set up the wgpuQueueOnSubmittedWorkDone callback, and I get an analogous issue there. I expected the WorkDone callback to be called every frame or so, but it is called exactly once, when the app finishes, and always with the "DeviceLost" WGPUQueueWorkDoneStatus (because I closed the app, I imagine). That is pretty useless overall, so I wonder if I'm doing something wrong?

Thanks!

1danielcoelho commented 1 year ago

Ah, I've been looking into this and apparently we need to call wgpuDeviceTick: https://github.com/webgpu-native/webgpu-headers/issues/117

I've tested on my own code and it does work: I get the error callback as soon as Tick is called. It doesn't seem to block any GPU work on either webgpu-native or Dawn, so I guess it should be safe to use, at least for development/debug modes?
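To make the behavior above concrete, here is a toy model of a deferred callback queue. It is only a sketch: ToyDevice, report_error and tick are hypothetical names, and this is not Dawn's actual implementation. It only mirrors what the callstack suggests, namely that errors are queued and the user callback runs when the queue is flushed, either by an explicit tick or when the device is released.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a device that defers its callbacks.
// Hypothetical class for illustration only, NOT one of Dawn's real classes.
class ToyDevice {
public:
    using ErrorCallback = std::function<void(const std::string&)>;

    void set_error_callback(ErrorCallback cb) { callback_ = std::move(cb); }

    // Recording an error only queues it; the user callback does not run yet.
    void report_error(std::string message) {
        pending_.push_back(std::move(message));
    }

    // Analogous in spirit to wgpuDeviceTick: run every queued callback now.
    void tick() {
        std::vector<std::string> batch;
        batch.swap(pending_);  // leave the queue empty before invoking callbacks
        for (const std::string& message : batch) {
            if (callback_) callback_(message);
        }
    }

    // On destruction, anything still queued is flushed. This is why, without
    // any tick, the callback only appears to fire when the app shuts down.
    ~ToyDevice() { tick(); }

private:
    ErrorCallback callback_;
    std::vector<std::string> pending_;
};
```

In this model, calling tick() once per frame in the main loop delivers errors during the frame in which they were queued, instead of all at once on release.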

eliemichel commented 1 year ago

Yes, wgpuDeviceTick is needed to flush callbacks in Dawn. The problem is that this is not standard, and wgpu-native has its own wgpuDevicePoll to do something similar.

This issue is being discussed here because it would be better to have callbacks called immediately when debugging. This is how we got DAWN_DEBUG_BREAK_ON_ERROR, imperfect as it is. Did you make sure you set it to 1 (DAWN_DEBUG_BREAK_ON_ERROR=1) and that you run your program under a debugger (IDE or gdb, in a Debug build)?
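Since the flush call differs between implementations, one option is a small wrapper selected at compile time. The sketch below rests on several assumptions: poll_device_callbacks is a hypothetical helper, the WEBGPU_BACKEND_DAWN and WEBGPU_BACKEND_WGPU macros are assumed to be defined by the build system, and the header paths may differ in your setup. It is a template only so that it still compiles (as a no-op) when neither backend macro is defined.

```cpp
// Assumption: the build system defines WEBGPU_BACKEND_DAWN or WEBGPU_BACKEND_WGPU.
// Assumption: these header paths match your WebGPU distribution.
#if defined(WEBGPU_BACKEND_DAWN) || defined(WEBGPU_BACKEND_WGPU)
#include <webgpu/webgpu.h>
#endif
#if defined(WEBGPU_BACKEND_WGPU)
#include <webgpu/wgpu.h>  // non-standard wgpu-native extensions
#endif

// Hypothetical helper: asks the backend to process pending callbacks.
// Returns true if a backend-specific call was made, false otherwise.
template <typename Device>
bool poll_device_callbacks(Device device) {
#if defined(WEBGPU_BACKEND_DAWN)
    wgpuDeviceTick(device);                  // Dawn-specific, non-standard
    return true;
#elif defined(WEBGPU_BACKEND_WGPU)
    wgpuDevicePoll(device, false, nullptr);  // wgpu-native; false = do not block
    return true;
#else
    (void)device;                            // no known backend: nothing to do
    return false;
#endif
}
```

Calling poll_device_callbacks(device) once per iteration of the main loop, for example right after presenting the frame, should let the error and WorkDone callbacks fire while the app is running.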

eliemichel / LearnWebGPU-Code

Error callback is only triggered once program exits #9