servo / servo

Servo, the embeddable, independent, memory-safe, modular, parallel web rendering engine
https://servo.org
Mozilla Public License 2.0

webgpu: Use WGPU poller thread for poll_all_devices #32266

Closed sagudev closed 2 weeks ago

sagudev commented 3 weeks ago

As discussed in https://servo.zulipchat.com/#narrow/stream/263398-general/topic/ipc_channel. Firefox will also switch to something similar in the future: https://bugzilla.mozilla.org/show_bug.cgi?id=1870699.

In the future we could use one thread per device, but that would require a hashmap of Pollers.
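For illustration only, here is a rough sketch of what the thread-per-device idea with a hashmap of pollers might look like. This is not the code in this PR and not the wgpu API: `Poller`, `DeviceId`, `wake`, and `poll_each_device_once` are hypothetical names, and the actual device poll is replaced by a counter so the sketch depends only on the standard library.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical device identifier; not a wgpu or Servo type.
pub type DeviceId = u32;

/// Hypothetical poller: one background thread that sleeps until woken.
pub struct Poller {
    work_pending: Arc<AtomicBool>,
    stop: Arc<AtomicBool>,
    polls: Arc<AtomicUsize>,
    handle: Option<JoinHandle<()>>,
}

impl Poller {
    pub fn new() -> Self {
        let work_pending = Arc::new(AtomicBool::new(false));
        let stop = Arc::new(AtomicBool::new(false));
        let polls = Arc::new(AtomicUsize::new(0));
        let (w, s, p) = (work_pending.clone(), stop.clone(), polls.clone());
        let handle = thread::spawn(move || loop {
            // Read the stop flag first so work queued before shutdown is not lost.
            let stopping = s.load(Ordering::Acquire);
            if w.swap(false, Ordering::AcqRel) {
                // Placeholder for the real device poll.
                p.fetch_add(1, Ordering::SeqCst);
            } else if stopping {
                break;
            } else {
                // Sleep until wake() or shutdown() unparks this thread.
                thread::park();
            }
        });
        Poller { work_pending, stop, polls, handle: Some(handle) }
    }

    /// Marks work as pending and wakes the poller thread.
    pub fn wake(&self) {
        self.work_pending.store(true, Ordering::Release);
        if let Some(h) = &self.handle {
            h.thread().unpark();
        }
    }

    /// Number of placeholder polls performed so far.
    pub fn poll_count(&self) -> usize {
        self.polls.load(Ordering::SeqCst)
    }

    /// Asks the thread to finish pending work, then joins it. Safe to call twice.
    pub fn shutdown(&mut self) {
        self.stop.store(true, Ordering::Release);
        if let Some(h) = self.handle.take() {
            h.thread().unpark();
            h.join().expect("poller thread panicked");
        }
    }
}

impl Drop for Poller {
    fn drop(&mut self) {
        self.shutdown();
    }
}

/// Creates one poller per device, wakes each once, and returns the poll counts.
pub fn poll_each_device_once(ids: &[DeviceId]) -> HashMap<DeviceId, usize> {
    let mut pollers: HashMap<DeviceId, Poller> =
        ids.iter().map(|&id| (id, Poller::new())).collect();
    for poller in pollers.values() {
        poller.wake();
    }
    pollers
        .iter_mut()
        .map(|(&id, p)| {
            p.shutdown();
            (id, p.poll_count())
        })
        .collect()
}
```

The thread parks instead of spinning, so an idle device costs nothing until something calls `wake`. Several wakes that arrive close together may collapse into a single poll, which is usually fine for this kind of "there is work, go look" signal.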

Fastgame still works: https://sagudev.github.io/briefcase/fastgame.html

try run: https://github.com/sagudev/servo/actions/runs/9051091523/job/24867638272


sagudev commented 3 weeks ago

Timings are more sane now (we actually outperform Firefox on the onSubmittedWorkDone tests due to https://bugzilla.mozilla.org/show_bug.cgi?id=1870699):

Servo runs the onSubmittedWorkDone tests in 350 ms.
Firefox runs the onSubmittedWorkDone tests in 101535.0 ms (about 1.7 min).

EDIT: According to measurements on my computer, we are also faster than Chromium (Edge) on the onSubmittedWorkDone tests (tested Edge vs. Servo on my Win11 machine).

sagudev commented 3 weeks ago

We are still missing a poll somewhere, because webgpu:api,operation,compute,basic:large_dispatch:* has a flaky TIMEOUT unless we poll on every loop iteration. This could also be a locking race in wgpu.

sagudev commented 3 weeks ago

I think there might be a deadlock between poll and submit.

EDIT: There is: https://github.com/gfx-rs/wgpu/issues/5687

sagudev commented 2 weeks ago

Some flakes that do occur but rarely (sometimes they are stable, other times they simply do not happen):

gterzian commented 2 weeks ago

> (this is actually our fault as we do not handle this situation).

File an issue for this one? Seems like we can catch the destroy error and send an error back to script?
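As a rough illustration of that suggestion, the sketch below turns a failed destroy into a message for script instead of a panic. Everything here is assumed for the example: `DestroyError`, `ScriptMsg`, `destroy_resource`, and `handle_destroy` are hypothetical names, the list of live ids stands in for real resource tracking, and a standard-library channel stands in for whatever channel Servo actually uses.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical error for a failed destroy; not a wgpu type.
#[derive(Debug, Clone, PartialEq)]
pub enum DestroyError {
    InvalidResource(u32),
}

/// Hypothetical message sent back to script.
#[derive(Debug, Clone, PartialEq)]
pub enum ScriptMsg {
    Destroyed(u32),
    Error(String),
}

/// Stand-in for the backend call; fails for ids that are not live.
fn destroy_resource(live: &mut Vec<u32>, id: u32) -> Result<(), DestroyError> {
    match live.iter().position(|&x| x == id) {
        Some(index) => {
            live.swap_remove(index);
            Ok(())
        }
        None => Err(DestroyError::InvalidResource(id)),
    }
}

/// Handles a destroy request without panicking: a failure becomes a message.
pub fn handle_destroy(live: &mut Vec<u32>, id: u32, to_script: &Sender<ScriptMsg>) {
    let msg = match destroy_resource(live, id) {
        Ok(()) => ScriptMsg::Destroyed(id),
        Err(e) => ScriptMsg::Error(format!("destroy failed: {:?}", e)),
    };
    // Ignore a closed channel: script may already have gone away.
    let _ = to_script.send(msg);
}
```

With this shape, destroying the same id twice yields one `Destroyed` message followed by one `Error` message, and the WebGPU thread keeps running either way.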

sagudev commented 2 weeks ago

> > (this is actually our fault as we do not handle this situation).
>
> File an issue for this one? Seems like we can catch the destroy error and send an error back to script?

Done: https://github.com/servo/servo/issues/32277