webgpu-native / webgpu-headers

**NOT STABLE YET!** See README.
https://webgpu-native.github.io/webgpu-headers/
BSD 3-Clause "New" or "Revised" License

Clarify the behavior of `wgpuInstanceWaitAny` called from a callback #360

Open eliemichel opened 1 month ago

eliemichel commented 1 month ago

Question about `WGPUFuture`s: Is it legal to call `wgpuInstanceWaitAny` from within a callback, and if so, what completion status is reported for the `WGPUFuture` of the callback we are currently in?

(EDIT: parent issue #199)


FTR, replies I got on Matrix:

From @Kangz: IMHO it should work, but it might be difficult to implement.

From @kainino0x: I think it should work. It's implementable in theory, but YMMV with actual implementations. The (now outdated, but mostly correct) design doc says that the `completed` field on your first WaitAny should have already been set before the callback is called. However, at least at that time, I left the answer to your question undefined: https://docs.google.com/document/d/1qJRTJRY318ZEqhK6ryw4P-91FumYQfOnNP6LpANYy6A/edit#bookmark=id.8zqwnqgij106 Until recently, Dawn would actually assert-crash in debug mode if you did that, but we took it out.
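
To make the scenario concrete, here is a minimal single-threaded toy model in C. It does not use `webgpu.h`: `ToyFuture`, `toy_wait_any`, and the other names are hypothetical stand-ins. It assumes the ordering described in the design doc (the `completed` flag is set before the callback runs) and shows one possible answer to the question the doc left undefined: the nested wait simply reports the future as completed and does not fire the callback again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical and NOT part of webgpu.h. */
typedef struct ToyFuture ToyFuture;
typedef void (*ToyCallback)(ToyFuture *self);

struct ToyFuture {
    bool ready;               /* the underlying operation has finished */
    bool completed;           /* what a wait reports back to the caller */
    bool callback_fired;      /* guards against running the callback twice */
    ToyCallback callback;
    bool seen_in_nested_wait; /* filled in by the re-entrant callback below */
};

/* Marks every ready future as completed, then fires its callback once.
 * Returns how many of the given futures were reported as completed. */
static size_t toy_wait_any(ToyFuture **futures, size_t count) {
    size_t completed = 0;
    for (size_t i = 0; i < count; ++i) {
        ToyFuture *f = futures[i];
        if (!f->ready) continue;
        f->completed = true;            /* set BEFORE the callback runs */
        ++completed;
        if (!f->callback_fired) {
            f->callback_fired = true;   /* a nested wait will skip the callback */
            if (f->callback) f->callback(f);
        }
    }
    return completed;
}

/* Callback that waits on its own future again, as in the question above. */
static void toy_reentrant_callback(ToyFuture *self) {
    ToyFuture *list[] = { self };
    size_t n = toy_wait_any(list, 1);
    self->seen_in_nested_wait = (n == 1 && self->completed);
}

/* Runs the scenario; returns true if the nested wait saw the future as completed. */
static bool toy_run_scenario(void) {
    ToyFuture f = { .ready = true, .callback = toy_reentrant_callback };
    ToyFuture *list[] = { &f };
    size_t n = toy_wait_any(list, 1);
    assert(n == 1);
    return f.seen_in_nested_wait;
}
```

In this model `toy_run_scenario()` returns `true`. A real implementation is free to behave differently as long as the behavior is unspecified.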

kainino0x commented 3 days ago

Thinking on this further, I think it's necessary to make this safe. Sure, we could say it's unsafe to WaitAny(f1) during f1's own callback inside another WaitAny(f1).

If those things have to be safe then nested WaitAny may as well have to be re-entrancy-safe too (for both single-threaded nesting and for multithreading).

kainino0x commented 3 days ago

> If those things have to be safe

OTOH they don't necessarily have to be safe. There just would be no way to do them safely.

kainino0x commented 3 days ago

I formalized the less safe version in #416. This might be too pessimistic though.

cc @lokokung

lokokung commented 2 days ago

FWIW, I think WaitAny inside a callback should work in our current implementation... (I haven't explicitly tried it, but I do remember ensuring that we release any Instance related locks before calling callbacks which should in turn make it fine.)
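
The "release locks before calling callbacks" pattern can be sketched as follows. This is a toy model, not Dawn's code: `InstModel`, `inst_process_events`, and the single pending callback slot are assumptions made for illustration. It uses a POSIX error-checking mutex so that a re-entrant lock attempt would return an error instead of hanging.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* exposes PTHREAD_MUTEX_ERRORCHECK on some libcs */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Dawn's or webgpu.h's actual code. */
typedef struct InstModel InstModel;
typedef void (*InstCallback)(InstModel *inst);

struct InstModel {
    pthread_mutex_t lock;  /* stands in for the instance-related lock */
    InstCallback pending;  /* at most one pending callback, for brevity */
    int nested_rc;         /* result of the nested call made by the callback */
};

static void inst_init(InstModel *inst) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* Relocking from the owning thread returns EDEADLK instead of hanging. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&inst->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    inst->pending = NULL;
    inst->nested_rc = -1;
}

/* Holds the lock only while taking the pending callback out of the instance,
 * then releases it before invoking the callback. Returns 0 on success or the
 * error code from pthread_mutex_lock. */
static int inst_process_events(InstModel *inst) {
    int rc = pthread_mutex_lock(&inst->lock);
    if (rc != 0) return rc;
    InstCallback cb = inst->pending;
    inst->pending = NULL;
    pthread_mutex_unlock(&inst->lock); /* released BEFORE the callback */
    if (cb) cb(inst);
    return 0;
}

/* Callback that re-enters the instance, like WaitAny inside a callback. */
static void inst_nested_callback(InstModel *inst) {
    inst->nested_rc = inst_process_events(inst);
}

/* Runs the scenario and returns the result of the nested call. */
static int inst_run_unlocked_callback(void) {
    InstModel inst;
    inst_init(&inst);
    inst.pending = inst_nested_callback;
    int outer_rc = inst_process_events(&inst);
    assert(outer_rc == 0);
    pthread_mutex_destroy(&inst.lock);
    return inst.nested_rc;
}
```

Because the lock is no longer held when the callback runs, the nested call acquires it normally and `inst_run_unlocked_callback()` returns `0`.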

However, we still have issues with re-entrancy for device level stuff because of the default global Device lock... If/when we have a separate thread for doing stuff, I think we can defer certain spontaneous callbacks, i.e. Device lost, and that might allow re-entrancy? Would need to think a little more about whether that could work though.

kainino0x commented 2 days ago

Ah, I didn't think about the fact that there can be things that are safe in multithreaded code but will cause deadlocks in nested code...
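
A small sketch of that difference, assuming a non-recursive global lock that is held while a callback runs (the `DevModel` type and `dev_call` function are hypothetical, not the actual Dawn device lock). A second thread calling in concurrently would simply block in `pthread_mutex_lock` until the first call unlocks, but a nested call on the same thread can never make progress. The error-checking mutex type is used here so the deadlock shows up as `EDEADLK` rather than a hang.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* exposes PTHREAD_MUTEX_ERRORCHECK on some libcs */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the real device implementation. */
typedef struct DevModel DevModel;
typedef void (*DevCallback)(DevModel *dev);

struct DevModel {
    pthread_mutex_t global_lock; /* stands in for a global device lock */
    int nested_rc;               /* result of the nested call, if any */
};

/* Takes the global lock and keeps holding it while the callback runs.
 * Returns 0 on success or the error code from pthread_mutex_lock. */
static int dev_call(DevModel *dev, DevCallback callback) {
    int rc = pthread_mutex_lock(&dev->global_lock);
    if (rc != 0) return rc;      /* EDEADLK when re-entered on the same thread */
    if (callback) callback(dev); /* lock is still held here */
    pthread_mutex_unlock(&dev->global_lock);
    return 0;
}

/* Callback that calls back into the device on the same thread. */
static void dev_reentrant_callback(DevModel *dev) {
    dev->nested_rc = dev_call(dev, NULL);
}

/* Runs the scenario and returns the result of the nested call. */
static int dev_run_nested_call(void) {
    DevModel dev;
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* With a default mutex the nested lock would deadlock or be undefined;
     * the error-checking type reports the problem instead. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&dev.global_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    dev.nested_rc = 0;

    int outer_rc = dev_call(&dev, dev_reentrant_callback);
    assert(outer_rc == 0);
    pthread_mutex_destroy(&dev.global_lock);
    return dev.nested_rc;
}
```

Here `dev_run_nested_call()` returns `EDEADLK`: the outer call succeeds, but the nested one is rejected because the calling thread already owns the lock.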