Open duplexsystem opened 1 year ago
fp16 has also been in use for longer than Pascal and RDNA2, since its use as an input/output format has been allowed for much longer. half, for example, has been a type in HLSL since DX9, nearly 15 years ago. My understanding from previous discussions on this topic, though, is that it's Rust itself that is limiting fp16: there's been a lot of discussion of adding it to Rust, and only a limited number of architectures know about bf16 and f16 on the CPU side of things. Crates like half aren't ideal because you'd need them to support spir-v as a backend? I'm not sure the issues there. Still, it's a bit of a strained argument given inline SPIR-V in rust-gpu. I don't see why a temporary solution can't be put forward so people's code doesn't arbitrarily end up being outright incompatible with rust-gpu in the meantime.
Crates like half aren't ideal because you'd need them to support spir-v as a backend? I'm not sure the issues there.
half would be a good solution, but it lacks native SPIR-V fp16 support, although ironically it does support using some SPIR-V extensions to make fp16 emulation more performant. What I was saying is that inline asm isn't ideal because, especially for things like input and output, it would presumably be super clumsy and likely not integrate well with existing rust-gpu syntax. I should have made it clearer that my proposed stopgap solution is to provide intrinsics for fp16, which could be used by half or directly by a shader, allowing you to use fp16 without asm; in the future, if Rust ever gets fp16 support, it could be piped into this. This would be very similar to how Rust exposes CPU fp16 intrinsics for different arches in core. As far as I can tell, though, this wouldn't really solve storing an fp16 in a vector. But I digress. I just feel like fp16 is a relatively important feature that rust-gpu lacks, and I wanted to open the discussion on it (or be told it's not planned or out of scope).
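To make the storage side of this concrete, here is a minimal CPU-side sketch of narrowing f32 values to fp16 bits and packing two of them into one u32, with the first value in the low 16 bits (the layout GLSL's packHalf2x16 uses). The names f32_to_f16_bits and pack_half_2x16 are hypothetical and are not part of rust-gpu, spirv-std, or half; this only illustrates what such an intrinsic would have to compute, assuming round-to-nearest-even, not how it would actually be exposed.

```rust
/// Narrow an f32 to IEEE 754 binary16 bits (round to nearest, ties to even).
/// Hypothetical helper for illustration only.
pub fn f32_to_f16_bits(value: f32) -> u16 {
    let bits = value.to_bits();
    let sign = ((bits >> 16) & 0x8000) as u16;
    let exp = ((bits >> 23) & 0xFF) as i32;
    let man = bits & 0x007F_FFFF;

    // Infinity and NaN keep an all-ones exponent; NaN becomes a quiet NaN.
    if exp == 0xFF {
        let payload: u16 = if man != 0 { 0x0200 } else { 0 };
        return sign | 0x7C00 | payload;
    }

    // Re-bias the exponent from 127 (f32) to 15 (f16).
    let half_exp = exp - 127 + 15;

    // Too large for f16: overflow to infinity.
    if half_exp >= 0x1F {
        return sign | 0x7C00;
    }

    // Too small for a normal f16: produce a subnormal or zero.
    if half_exp <= 0 {
        if half_exp < -10 {
            return sign; // below half of the smallest subnormal
        }
        let full = man | 0x0080_0000; // restore the implicit leading 1
        let shift = (14 - half_exp) as u32; // 14..=24
        let mut half_man = full >> shift;
        let rem = full & ((1u32 << shift) - 1);
        let halfway = 1u32 << (shift - 1);
        if rem > halfway || (rem == halfway && (half_man & 1) == 1) {
            half_man += 1; // may carry into the smallest normal, which is correct
        }
        return sign | half_man as u16;
    }

    // Normal case: keep the top 10 mantissa bits and round on the dropped 13.
    let mut out = ((half_exp as u32) << 10) | (man >> 13);
    let rem = man & 0x1FFF;
    if rem > 0x1000 || (rem == 0x1000 && (out & 1) == 1) {
        out += 1; // a carry into the exponent yields the next binade or infinity
    }
    sign | out as u16
}

/// Pack two values as fp16 into one u32, first value in the low 16 bits.
/// Hypothetical helper for illustration only.
pub fn pack_half_2x16(x: f32, y: f32) -> u32 {
    (f32_to_f16_bits(x) as u32) | ((f32_to_f16_bits(y) as u32) << 16)
}
```

With something like this, a pair of fp16 values can live in an ordinary u32 buffer element, which sidesteps the missing vector type at the cost of explicit pack/unpack calls.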
There is an RFC to add f16 to Rust: https://rust-lang.github.io/rfcs/3453-f16-and-f128.html
16-bit floating point support has been a feature of GPUs since Nvidia's Pascal, AMD's RDNA2, and Adreno GPUs starting at some point (I couldn't find out when). Copied from this WebGPU suggestion (because it's a good summary), fp16 provides the following quite large benefits over fp32:
In addition, fp16 support has been an optional core feature since Vulkan 1.2; see VK_KHR_shader_float16_int8.
Sadly, Rust currently does not have an fp16 type; however, there are crates such as half which emulate fp16 and optionally compile down to native types/intrinsics when possible. rust-gpu does not support fp16, other than possibly by directly writing inline SPIR-V, and even that may cause unintended side effects since, to my knowledge, rust-gpu was not written with fp16 support in mind. Given that glam also does not support fp16, ecosystem integration would not be trivial. However, the performance gain and flexibility of allowing fp16 are quite large, and fp16 would expand the number of shaders that could be ported to Rust in a performant way (see FSR 1.0, FSR 2.0, and NIS, among others).
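For anyone unfamiliar with what "emulating fp16" involves, here is a rough sketch of the widening half of the job: decoding raw fp16 bits into an f32 using only integer operations and one multiply. The function name f16_bits_to_f32 is made up for this example and is not the half crate's API; it assumes the standard IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits).

```rust
/// Widen IEEE 754 binary16 bits to f32. Every fp16 value is exactly
/// representable in f32, so this direction never rounds.
/// Hypothetical helper for illustration only.
pub fn f16_bits_to_f32(half: u16) -> f32 {
    let sign = ((half & 0x8000) as u32) << 16;
    let exp = ((half >> 10) & 0x1F) as u32;
    let man = (half & 0x03FF) as u32;

    let bits = match (exp, man) {
        // Signed zero.
        (0, 0) => sign,
        // Subnormal: the value is man * 2^-24, which f32 holds exactly.
        (0, _) => {
            let two_pow_neg_24 = f32::from_bits(0x3380_0000);
            let magnitude = man as f32 * two_pow_neg_24;
            sign | magnitude.to_bits()
        }
        // Infinity.
        (0x1F, 0) => sign | 0x7F80_0000,
        // NaN: force the quiet bit and keep the payload.
        (0x1F, _) => sign | 0x7FC0_0000 | (man << 13),
        // Normal: re-bias the exponent from 15 to 127 (add 112).
        _ => sign | ((exp + 112) << 23) | (man << 13),
    };
    f32::from_bits(bits)
}
```

Doing this on every load (plus the matching narrowing step on every store) is the overhead that native fp16 support would remove.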
All that to preface: is support for 16-bit floating point planned/in scope for rust-gpu, be that direct support or just exposing intrinsics and supporting 16-bit storage?