device.queue.read_buffer times out

wpmed92 commented 1 day ago

Describe the bug

device.queue.read_buffer times out

To Reproduce

git clone https://github.com/tinygrad/tinygrad.git
cd tinygrad
python3 -m pip install -e .
PYTHONPATH=. WEBGPU=1 python3 examples/stable_diffusion.py --seed 0 --steps 1

Output:

File "/Users/ahmedharmouche/Documents/tinygrad/tinygrad/runtime/ops_webgpu.py", line 56, in _copyout buffer_data = self.dev.queue.read_buffer(src, 0) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/ahmedharmouche/Documents/tinygrad/venv/lib/python3.12/site-packages/wgpu/backends/wgpu_native/_api.py", line 3427, in read_buffer tmp_buffer._map("READ_NOSYNC").sync_wait() File "/Users/ahmedharmouche/Documents/tinygrad/venv/lib/python3.12/site-packages/wgpu/backends/wgpu_native/_helpers.py", line 268, in sync_wait return self.finish() ^^^^^^^^^^^^^ File "/Users/ahmedharmouche/Documents/tinygrad/venv/lib/python3.12/site-packages/wgpu/backends/wgpu_native/_helpers.py", line 260, in finish raise RuntimeError(f"Waiting for {self.title} timed out.") RuntimeError: Waiting for buffer.map timed out. zsh: segmentation fault PYTHONPATH=. WEBGPU=1 python3 examples/stable_diffusion.py --seed 0 --steps 1

Observed behavior

When running tinygrad stable_diffusion.py, the buffer read times out when trying to get the output of the decode step. But it is not the buffer reading that takes that long, but to actually run the compute. Manually increasing the timeout from 5.0s here solves it, but in 0.18.1 this just worked (the timeout wasn't there?). Now, we have stable diffusion working on faster machines, but on my local computer, it times out, so I have to manually increase this timeout value, so for now we downgraded to 0.18.1. Can this timeout be increased/disabled?

Your environment

OS: MacOS Sonoma 14.4.1 Python version: 3.12 wgpu-py version: >=0.19.0 wgpu backend: Metal

almarklein commented 1 day ago

Hi Ahmed, thanks for this issue!

Ah, when I added the timeout, I thought it was as good idea in case for some reason wgpu-native did not fulfill the promise. I figured 5 s is plenty of time, but I had not considered it could simply be waiting for computations 🤦

I think it makes sense to remove the timeout altogether. Probably should get #631 merged first, because it touches the same code.

almarklein commented 1 day ago

cc @fyellin

wpmed92 commented 1 day ago

Thank you for the response @almarklein, once the timeout is removed, we'll bump the version back to latest.

pygfx / wgpu-py