Closed wpmed92 closed 1 week ago
inf is handled differently by different GPU vendors and backends. There might also be a case where compute shaders don't work at all with DX12. Are you using Metal or Vulkan in your case?
I'm using Metal. It happens in Chrome as well. I worked around it by passing in inf as a uniform and using that uniform in computations. It works fine, but I had hoped for a non-workaround solution. It seems that any time the compiler can deduce that a computation produces inf, it flushes the result to zero.
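For readers who want to try the uniform workaround described above, here is a minimal sketch of its two pieces: a WGSL shader that reads infinity from a uniform, and the host-side bytes for that uniform. The struct name `Params`, the field `pos_inf`, the buffer name `data`, and the binding layout are all assumptions made for illustration; they are not taken from the linked gist or from tinygrad. Buffer and pipeline creation are left out, and as noted later in the thread, this approach may not behave the same on every GPU.

```python
import math
import struct

# Hypothetical WGSL: infinity arrives through a uniform, so the shader
# compiler cannot know its value at compile time. All names and the
# binding layout here are assumptions for illustration only.
SHADER_SOURCE = """
struct Params {
    pos_inf: f32,
};

@group(0) @binding(0) var<uniform> params: Params;
@group(0) @binding(1) var<storage, read_write> data: array<f32>;

@compute @workgroup_size(1)
fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
    data[gid.x] = data[gid.x] * params.pos_inf;
}
"""


def pack_inf_uniform() -> bytes:
    """Return +inf encoded as a little-endian 32-bit float (4 bytes)."""
    return struct.pack("<f", math.inf)


# These bytes would be uploaded into the uniform buffer bound at binding 0.
uniform_bytes = pack_inf_uniform()
```

The packed value is the IEEE 754 bit pattern 0x7F800000, stored little-endian.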
Context: We're using wgpu in tinygrad for our WebGPU backend. It was removed from core, but now we're bringing it back, which is how I bumped into this issue.
I gave this a try on Vulkan and D3D12, and the output is zero on both. Reading the spec, this is sort of expected behaviour, because any intermediate result that is inf might become an "indeterminate" value (ref). So it's not something that is meant to work, and hence it gets optimized away by the compiler. Your workaround using a uniform buffer might also not work across all GPUs, so I am not sure there will be a solution. Maybe you can look at the naga IR or GPU assembly to see where this gets optimized away and find a trick to avoid it? Perhaps there are other workarounds using push constants or compilation options?
Anyway, I doubt this is something to solve within wgpu-py, which just provides the Python mapping to the wgpu-native C lib.
Thank you for looking into this! Regarding tinygrad I think the uniform based workaround is fine for now. Closing this as I agree it’s not something that can be solved within your lib.
Hi @wpmed92!
We ran into similar issues earlier, related to detecting inf and nan. We were able to find some tricks to fool the compiler so it won't optimize inf/nan away, so that we can implement our own isinf() and isnan() to detect nonfinite values in incoming data in Pygfx. We added tests here in wgpu-py so we know when our tricks start failing for whatever reason: https://github.com/pygfx/wgpu-py/blob/main/tests/test_not_finite.py
I'd think that similar tricks can be used to get inf values in your shader without using uniforms.
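As a rough illustration of the kind of check being discussed, the sketch below classifies 32-bit floats by their IEEE 754 bit pattern instead of by floating-point comparison. This is one common approach and an assumption on my part; it is not necessarily what the linked test file or Pygfx does. It runs on the CPU in Python only to show the masks. In a shader the same masks would be applied to the bits of a runtime value, and whether a given compiler preserves that is exactly what tests like the ones linked above are meant to catch.

```python
import math
import struct

EXPONENT_MASK = 0x7F800000  # all eight exponent bits set
MANTISSA_MASK = 0x007FFFFF  # the 23 mantissa bits
ABS_MASK = 0x7FFFFFFF       # everything except the sign bit


def f32_bits(value: float) -> int:
    """Reinterpret a float32 value as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def is_inf_bits(bits: int) -> bool:
    """Infinity: exponent all ones, mantissa zero, either sign."""
    return (bits & ABS_MASK) == EXPONENT_MASK


def is_nan_bits(bits: int) -> bool:
    """NaN: exponent all ones, mantissa non-zero."""
    return (bits & EXPONENT_MASK) == EXPONENT_MASK and (bits & MANTISSA_MASK) != 0


def make_pos_inf() -> float:
    """Build +inf from its bit pattern rather than from a float literal."""
    return struct.unpack("<f", struct.pack("<I", EXPONENT_MASK))[0]
```

For example, `is_inf_bits(f32_bits(-math.inf))` is true, while `is_inf_bits(f32_bits(math.nan))` is false.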
Another option can be the recently added overridable constants, which are a little lighter than uniform buffers.
An additional reference for playing with the compiler: https://shader-playground.timjones.io/ (it does have naga, but I believe it doesn't show the IR).
Describe the bug
Compute shaders produce a wrong result when multiplying infinity by a number, if that number is not known at compile time.
To Reproduce
Reproducible example: https://gist.github.com/wpmed92/c045e98fdb5916670c31383096706406
Observed behavior
The result of the repro compute shader is all zeroes, whereas it should be all infs (a positive number multiplied by positive infinity). If I modify the shader to just pass through pos_inf, the result is correct. If I modify the multiplication to multiply inf by a const, the result is also correct. The result is only incorrect if I multiply by a variable that is not known at compile time.
Your environment
Chip: Apple M1 Pro
OS: 14.4.1 (23E224)
Backend: Metal
Python version: Python 3.8.10
wgpu version: wgpu 0.18.1