linebender / vello

An experimental GPU compute-centric 2D renderer.
http://linebender.org/vello/
Apache License 2.0

DX12 validation fixes #139

Closed raphlinus closed 1 year ago

raphlinus commented 2 years ago

This is something of a followup to #95. Also, #138 runs reasonably portably but is showing some problems on DX12. Since the logic is good enough to run on Vulkan and Metal, I'm making a separate issue for the DX12-specific issues, most of which are at the HAL level. Overall even on DX12 it runs in release (at least on my hardware), but prints validation warnings and hangs in debug mode.

One problem is validation warnings. A sub-issue is that detailed validation warnings are available when running under Visual Studio, but less so when running from the command line. It should be possible to capture and print those too; we are probably just missing the proper intercept calls.

These validation warnings show that resources are in the incorrect state. A lot of that is due to binding buffers as either UAV or SRV depending on whether the shader source marks them as "readonly." The simplest workaround is to always bind as UAV, for which there is a spirv-cross flag (--hlsl-force-storage-buffer-as-uav). In addition, some of the transfer commands seem to leave the buffer in particular states. Again, the simplest thing is to add transition barriers to take them back to the common state. As future work, we might do more fine-grained tracking and auto-generate the more precise barriers. I think asking users of the HAL to write precise barriers themselves is probably asking too much.
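To make the "transition back to common" idea concrete, here is a minimal sketch of the kind of state tracking the HAL could do. Everything in it is hypothetical: `ResourceState`, `Transition` and `StateTracker` are illustrative names, not piet-gpu-hal or D3D12 types, and the states are only loosely modelled on the D3D12 resource states. It records the last known state of each buffer and emits the barriers needed to return everything to common.

```rust
use std::collections::HashMap;

/// Hypothetical resource states, loosely modelled on the D3D12 ones.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ResourceState {
    Common,
    CopyDest,
    CopySource,
    UnorderedAccess,
}

/// A transition barrier record: which buffer, and its before/after states.
#[derive(Debug, PartialEq, Eq)]
struct Transition {
    buffer: u32,
    before: ResourceState,
    after: ResourceState,
}

/// Tracks the last known state of each buffer (assumed to start in Common).
#[derive(Default)]
struct StateTracker {
    states: HashMap<u32, ResourceState>,
}

impl StateTracker {
    /// Declare that `buffer` must be in `wanted` for the next command.
    /// Returns the barrier to record, or None if no change is needed.
    fn require(&mut self, buffer: u32, wanted: ResourceState) -> Option<Transition> {
        let before = *self.states.get(&buffer).unwrap_or(&ResourceState::Common);
        if before == wanted {
            return None;
        }
        self.states.insert(buffer, wanted);
        Some(Transition { buffer, before, after: wanted })
    }

    /// The simple policy: return every tracked buffer to Common.
    /// Barriers are sorted by buffer id so the output is deterministic.
    fn reset_to_common(&mut self) -> Vec<Transition> {
        let mut barriers: Vec<Transition> = self
            .states
            .iter()
            .filter(|(_, state)| **state != ResourceState::Common)
            .map(|(buffer, state)| Transition {
                buffer: *buffer,
                before: *state,
                after: ResourceState::Common,
            })
            .collect();
        barriers.sort_by_key(|t| t.buffer);
        self.states.clear();
        barriers
    }
}
```

The coarse policy is to call `reset_to_common` after each transfer command; the finer-grained future work would instead call `require` before each use and record only the barriers it returns.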

A deeper and more serious problem is coarse.comp hanging. I've started to investigate this but don't have a full answer. That shader has several loops to consume the full binned input. Crudely breaking three of those loops makes the shader terminate (with incomplete output, but the rest of the pipeline runs). It's not obvious to me where the problem is. It's possible the input is wrong, making this stage fail to terminate. It's possible the input is right but there is a logic error, one that only manifests as a hang under shader validation. Or it's possible the shader is correct but shader validation is buggy. In any case, this is the issue to track the failure.
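For reference, the "crudely breaking the loops" experiment amounts to putting an iteration cap on each loop and noting which ones hit it. A CPU-side sketch of that guard follows. It is not the shader code: `run_guarded` and `LoopOutcome` are hypothetical names, and the closure stands in for one iteration of a loop body.

```rust
/// Outcome of a guarded loop: it finished on its own, or hit the cap.
#[derive(Debug, PartialEq, Eq)]
enum LoopOutcome {
    Finished { iterations: u32 },
    CapReached { cap: u32 },
}

/// Run `step` until it returns false, giving up after `cap` iterations.
/// `step` returns true while the loop still has work to do.
fn run_guarded<F: FnMut() -> bool>(cap: u32, mut step: F) -> LoopOutcome {
    for i in 0..cap {
        if !step() {
            return LoopOutcome::Finished { iterations: i + 1 };
        }
    }
    LoopOutcome::CapReached { cap }
}
```

A loop that reports `CapReached` on input that should be small is a candidate for the non-terminating one, though a cap alone does not say whether the input or the logic is at fault.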

DJMcNab commented 1 year ago

See #95. We still might have DX12 issues with wgpu, but they will largely be different to the issues experienced with piet-gpu-hal.