Open dneto0 opened 5 years ago
If they're not supported in WebGPU then the developer will have to emulate them with SSBOs, except that the value-conversion (e.g. RGB9E5 to vec4) will be much slower than in hardware.
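To make the cost concrete, here is a rough sketch of the decode a shader would have to do by hand for RGB9E5. It is written in Python purely for illustration, and it assumes the usual shared-exponent layout (red in bits 0-8, green in bits 9-17, blue in bits 18-26, exponent in bits 27-31, bias 15). The function name is made up for this example.

```python
def decode_rgb9e5(packed):
    """Decode one 32-bit shared-exponent RGB9E5 texel into (r, g, b) floats."""
    mantissa_bits = 9
    exponent_bias = 15
    r = packed & 0x1FF            # bits 0..8
    g = (packed >> 9) & 0x1FF     # bits 9..17
    b = (packed >> 18) & 0x1FF    # bits 18..26
    e = (packed >> 27) & 0x1F     # bits 27..31, shared by all three channels
    # Each channel is mantissa * 2^(e - bias - mantissa_bits).
    scale = 2.0 ** (e - exponent_bias - mantissa_bits)
    return (r * scale, g * scale, b * scale)
```

With a texel buffer the hardware does this unpack and scale as part of the load; with an SSBO every read pays for the shifts, masks and the exponent math in ALU.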
Secondly there is an issue with the lack of support for Uniform Texel Buffers (TBO in OpenGL parlance), because in Vulkan an SSBO has to conform to the std430 layout, so supporting formats of less than 4 bytes per "pixel" (or rather sample) will be very painful.
This will be a bit hard to investigate as Vulkan Hardware database has no information about supported formats for Uniform and Storage Texel Buffers as well as the shader stages that support them.
because in Vulkan an SSBO has to conform to the std430 layout
Hopefully not for much longer! https://renderdoc.org/vkspec_chunked/chap41.html#VK_EXT_scalar_block_layout
@devshgraphicsprogramming good point. Btw, the stages at least seem to be well defined by the spec (and thus can be derived from the features in vulkandb).
Uniform texel buffers are exposed to all stages:
Load operations from uniform texel buffers are supported in all shader stages for image formats which report support for the VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT feature bit via vkGetPhysicalDeviceFormatProperties in VkFormatProperties::bufferFeatures.
Storage texel buffers are exposed to the same stages as SSBO/UAV in general:
When the fragmentStoresAndAtomics feature is enabled, stores and atomic operations are also supported for storage texel buffers in fragment shaders with the same set of texel buffer formats as supported in compute shaders. When the vertexPipelineStoresAndAtomics feature is enabled, stores and atomic operations are also supported in vertex, tessellation, and geometry shaders with the same set of texel buffer formats as supported in compute shaders.
Vulkan spec also has the list of formats that are required to support uniform/storage texel buffers in "Required Format Support".
Basically, uniform texel buffers are available for:
Storage texel buffers are available for:
Yeah so emulating R8, R8G8, R16, etc. with SSBOs would be a pain.
Basically you always read the full 32 bits, then extract the bit range and optionally normalize. This would be a big perf loss against native Vulkan or OpenGL.
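A minimal sketch of that read-32-bits-then-unpack pattern, in Python rather than shader code. It assumes little-endian element order within each word, and `load_unorm` is a hypothetical helper name, not any real API.

```python
def load_unorm(words, index, bits):
    """Emulate a typed load of an 8- or 16-bit UNORM element from 32-bit words."""
    if bits not in (8, 16):
        raise ValueError("only 8- and 16-bit elements are handled in this sketch")
    per_word = 32 // bits
    word = words[index // per_word]        # always fetch the whole 32-bit word
    shift = (index % per_word) * bits      # element position inside the word (little-endian, assumed)
    mask = (1 << bits) - 1
    raw = (word >> shift) & mask           # extract the bit range
    return raw / mask                      # normalize to [0, 1]
```

Every element access costs a divide/modulo (or shift/mask) on the index plus the extract and the normalize, which is the overhead a typed texel buffer load would hide.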
Hopefully not for much longer! https://renderdoc.org/vkspec_chunked/chap41.html#VK_EXT_scalar_block_layout
Yes, that's the bleeding edge of support. Hopefully it's the last time Vulkan has to relax the layout rules!
The flip side is whether a big enough installed base can get that extension in time for WebGPU rollout.
Perf impact: Yes, unfavourably aligned vector accesses will be slower on some hardware, whether or not that extension is supported. Tradeoff is whether the user has to do the load-32-bits-then-unpack vs. the implementation. At least if the implementation has the responsibility, they have the opportunity of doing it better; and your shader code is more clear.
Tentatively closing. IIRC the group decided to not have these features and nobody asked for them (SSBOs mostly just better though there's some discussion about that in #297)
@Kangz this may become important for OpenGL backends. There are GL platforms that plain refuse to expose any SSBO stuff, but they'd be totally usable through TBOs.
Do these GL versions support compute shaders? I thought compute shaders always came with SSBOs.
Here is some light reading - https://www.raspberrypi.org/forums/viewtopic.php?t=271863 It talks about GLES-3.1, so it must have the compute stuff.
This is only for vertex shaders, so not quite as bad but this is a real constraint when targeting ES 3.1. I think it might be possible to magically transform VS using SSBO into a VS using a TBO (similar to the ByteAddressBuffer transform) and then back the TBO with the storage buffer. This would work for readonly-storage-buffer at least (there is no way to implement RW storage buffers in ES 3.1 AFAIK)
You can kind of emulate SSBOs using imageLoadStore, which I believe should be in ES 3.1
The Chrome team is getting multiple requests for texel buffers.
Motivations include:
Additionally, storage texture coordinates are limited.
WebGPU mandates supporting 1D and 2D coordinates up to 8192.
Texel buffers are not limited in this way. They would be limited by the size of the underlying GPUBuffer.
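As a rough illustration of the workaround people need today, here is how a linear element index would have to be folded into 2D storage texture coordinates. This is a Python sketch; `index_to_coords` is a hypothetical helper and the 8192 figure is the limit quoted above.

```python
MAX_DIM = 8192  # per-dimension limit quoted above

def index_to_coords(i, width=MAX_DIM):
    """Map a linear element index to (x, y) in a width x MAX_DIM 2D texture."""
    if not 0 <= i < width * MAX_DIM:
        raise IndexError("index does not fit in a single 2D texture")
    return (i % width, i // width)
```

With a texel buffer the shader would index the data directly, and the only cap would be the buffer size.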
Moved to Milestone 2, per discussion in WebGPU API meeting 2023-09-27
because in Vulkan an SSBO has to conform to the std430 layout
Hopefully not for much longer! https://renderdoc.org/vkspec_chunked/chap41.html#VK_EXT_scalar_block_layout
can confirm that finally in 2023 all relevant (still supported) desktop platforms support scalar block layout, as for mobile... I cannot say
It would be nice to know how widely this feature is supported on our target hardware, to decide whether this would be an optional feature or not.
Concretely, what would this look like in the API and in WGSL?
The Chrome team is getting multiple requests for texel buffers.
Is a texel buffer a buffer that is backed by texture data, or a texture that is backed by buffer data?
If I'm not mistaken, Metal only supports the latter.
I believe that Metal texture_buffer is the exact equivalent to texel buffers. The MSL specification says:
A texture buffer is a texture type that can access a large 1D array of pixel data and perform dynamic type conversion between pixel formats on that data with optimized performance. Texture buffers handle type conversion more efficiently than other techniques, allowing access to a larger element count, and handling out-of-bounds read access.
That's right, Metal texture buffers, MTLTextureTypeTextureBuffer, are 1D textures which don't support mipmaps or texture arrays.
There are also buffer backed 2D textures which can be created from -[MTLBuffer newTextureWithDescriptor:]
In both cases, no support for mipmaps, array length is always 1, sample count is always 1, no support for compressed formats, maybe I am forgetting something.
That's right, Metal texture buffers, MTLTextureTypeTextureBuffer, are 1D textures which don't support mipmaps or texture arrays.
BTW, from the MSL spec in section 2.9.1 Texture Buffers, it says:
However, you cannot sample a texture buffer.
Ability to read and write to individual scalars from large arrays or rectangular grids of data.
Re: "rectangular grids," does Vulkan support that? It appears the way to create a uniform texel buffer or a storage texel buffer is by creating a VkBufferView, but VkBufferViewCreateInfo doesn't include any fields regarding the Y dimension. It appears only capable of creating 1-dimensional textures.
I think the use case of "be able to have different threads write to adjacent bytes without racing" is a compelling use case.
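To spell out why this matters, here is a small deterministic simulation (Python, not shader code) of two threads each updating their own byte through a 32-bit read-modify-write, which is what a plain storage buffer of 32-bit words forces. The function names are invented for the example.

```python
def write_byte_rmw(read_word, byte_index, value):
    """Return the 32-bit word a thread stores after patching one byte of the word it read."""
    shift = byte_index * 8
    cleared = read_word & ~(0xFF << shift) & 0xFFFFFFFF
    return cleared | ((value & 0xFF) << shift)

def simulate_race():
    """Both threads read the word before either one writes it back."""
    memory = 0x00000000
    seen_by_a = memory
    seen_by_b = memory
    memory = write_byte_rmw(seen_by_a, 0, 0xAA)  # thread A stores byte 0
    memory = write_byte_rmw(seen_by_b, 1, 0xBB)  # thread B stores byte 1 from its stale copy
    return memory
```

In this interleaving thread A's byte is lost, even though the two threads never touched the same byte. A storage texel buffer with an 8-bit format lets each thread store only its own element, so the problem doesn't arise.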
Exploit performance differences between memory types.
I was interested in characterizing this, so I wrote a little benchmark to see what the performance difference was in Metal.
The benchmark is straightforward:
Here are the performance results on an M2 MacBook Air:
Surprisingly, this shows that reading from the raw buffer is faster than reading from a texture buffer. If I were to guess, I'd bet the "Random Read Texture" and "Random Read Buffer" bars are different because memory accesses are fast enough that even the small amount of ALU to reverse the bits of the index isn't being hidden. Though this couldn't explain the entire difference, because the amount of ALU on both of the "Random Read" bars is the same, but the perf difference between them and their "Sequential Read" counterparts is not the same.
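For readers wondering what "reverse the bits of the index" looks like, a minimal sketch in Python follows. This is only an illustration of the technique, not the benchmark's actual shader code, and `reverse_bits` is a name chosen for this example.

```python
def reverse_bits(index, bit_count):
    """Reverse the low bit_count bits of index."""
    result = 0
    for _ in range(bit_count):
        result = (result << 1) | (index & 1)  # move the lowest bit of index to the end of result
        index >>= 1
    return result
```

Bit reversal is a permutation of the index range, so every element is still read exactly once, but neighbouring threads end up touching addresses that are far apart.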
Just for fun, here are the results on an AMD Radeon PRO W6800X:
Here, when reading sequentially, the buffer is faster than the texture buffer, but when reading randomly, they're about the same. For the random reads, this is what I expected - I'd expect random accesses to defeat any difference in caching.
Anyhow, despite these disappointing results, I still think there are valid use cases beyond simply read performance in shaders, so adding texture buffers to WebGPU would still make sense.
Another interesting tidbit: in Vulkan, not every pixel format can be used as a uniform/storage texel buffer. And, the spec doesn't list which formats can be used (and which cannot be used). Instead, the spec says:
If Vulkan 1.3 is supported or the VK_KHR_format_feature_flags2 extension is supported, then the buffer view’s set of format features is the value of VkFormatProperties3::bufferFeatures found by calling vkGetPhysicalDeviceFormatProperties2 on the same format as VkBufferViewCreateInfo::format.
So, you have to ask at runtime whether or not the format is compatible with uniform/storage texel buffers, and different devices can return arbitrarily different answers for any given format.
(Aside: I have no idea how Vulkan native app developers could possibly use such an API effectively. What are you supposed to do if the device you're on just happens to not support the codepath you wrote? Write a codepath for literally every format? Fall back to buffers, and hope the device has StorageBuffer8BitAccess?)
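A sketch of what that runtime decision tends to look like, in Python for brevity. The three constants are the Vulkan format feature bit values; `buffer_features` stands in for the bufferFeatures mask returned by the format properties query, and the path names returned are made up for this example.

```python
VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT = 0x00000008
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT = 0x00000010
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_ATOMIC_BIT = 0x00000020

def pick_path(buffer_features, needs_write):
    """Choose a code path from a format's bufferFeatures mask (path names are hypothetical)."""
    if buffer_features & VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT:
        return "storage_texel_buffer"
    if not needs_write and buffer_features & VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT:
        return "uniform_texel_buffer"
    # Neither view type is usable for this format on this device.
    return "plain_buffer_fallback"
```

The fallback branch is the painful one: it is exactly the manual unpacking path discussed earlier in the thread, and it has to exist for every format the app cares about.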
In DirectX, read-only buffer (Buffer
DirectX read-write buffers RWBuffer
UAV is used for read-write textures and buffers. The same buffer byte storage can support views of multiple buffer types, including different typed UAVs, SRVs and ByteAddressBuffers. In DX11, structured buffer views are not compatible with other buffer views, since hardware is allowed to implement an AoSoA swizzle internally for them, so their memory layout might not match the linear buffer layout of other buffer types. Structured buffer functionality appears closest to Vulkan SSBOs, but structured buffers are also bound with an opaque descriptor in DX11. In Vulkan, SSBO bindings are just a 64-bit pointer to raw memory. 1D textures are not the same as texel buffers in DirectX. They could have a different memory layout, and you can't create a 1D texture view and a typed buffer view of the same data.
Typed UAV Load (extended formats) support can be found in the following table: https://en.wikipedia.org/wiki/Feature_levels_in_Direct3D
Nvidia Fermi (GTX 500) and Kepler (GTX 600 and 700) don't support it, and Intel Gen 7.5 (Haswell) and Gen 8 (Broadwell) don't support it either. All other DX12-capable hardware supports it.
According to latest Steam HW survey Fermi GPUs and Broadwell GPUs are no longer used.
Kepler (GTX 700 series) total usage = 0.71% Intel Haswell usage = 0.40%
Total 1.11% Steam users with GPU that doesn't support Typed UAV Load (extended formats).
https://store.steampowered.com/hwsurvey/Steam-Hardware-Software-Survey-Welcome-to-Steam
That it is an optional feature in D3D12 and Vulkan means that this will need to be an extension. To find which formats support that feature in Vulkan (and even D3D12), we can volunteer to add some stat gathering in Chromium to see what several options would give in terms of reach in the wild.
Uniform texel buffers permit loads on a buffer view, with value-conversion similar to image reads. See Vulkan's 13.1.5 Uniform Texel Buffer
Storage texel buffers permit loads and stores on a buffer view, with value-conversion similar to image reads and writes. See Vulkan's 13.1.6 Storage Texel Buffer
I don't have an opinion on whether these should be supported by WebGPU, and I didn't see an investigation related to this.