wcandillon / react-native-webgpu

React Native implementation of WebGPU using Dawn
MIT License
402 stars 18 forks source link

`await GPUDevice.lost` hangs indefinitely if called before `GPUDevice.destroy()` #126

Open hmallen99 opened 3 weeks ago

hmallen99 commented 3 weeks ago

Hello! I'm super interested in adopting react-native-webgpu for an app I'm working on. Since the library is quite new, I've been trying to run the gpuweb compliance test suite against the library to gather an inventory of implementation gaps and gotchas, but there have been a couple of blockers.

The major blocker for running the compliance test suite is that some unresolved promises hang the main thread and cannot be cancelled. For example, this code (split out from this test) should time out after 1 second, but it hangs the app instead (the panresponder becomes unresponsive):

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

const timeoutPromise = new Promise((_resolve, reject) => {
    const handle = setTimeout(() => {
      reject(new Error('timeout'));
    }, 1000);
});

// device.lost never resolves or rejects, hanging the app
await Promise.race([device.lost, timeoutPromise]);

It seems like the issue occurs when converting from std::future to a JS Promise in the RNFJSCIConverter. The code blocks the main thread on sharedFuture->wait();, which never resolves for device.lost, since device.destroy has not been called. The original author of the JSI converter code has since updated it to add some additional async handling. Would the ThreadPool approach used there make sense for this library?

I would be happy to submit a PR if that approach seems appropriate to you. There are a handful of other cases in the compliance test suite that this would resolve. If you're interested in getting the inventory from the compliance test suite, I'd be happy to file more issues as I find them and submit fixes! The other issues I've noticed result from the Dawn version trailing the latest Chromium version, but I can file a separate issue.

Thanks for the consideration and creating this awesome library!

wcandillon commented 3 weeks ago

Thank you for opening this. This is very exciting.

I am impressed that you were able to run CTS on top of this module. We have high interest in getting compliance numbers. I would love to learn more on how you do this and if it's something we can do in this repo.

You are making a good catch with the async handling. When I wrote it, I knew this problem would occur and now it is indeed time to fix it. I would be happy to accept a PR regarding this. I do plan to migrate completely to Nitro modules but this might be a longer task.