Is your feature request related to a problem? Please describe.
Performance on desktops for complex scenes is worse than it should be when targeting Vulkan.
There are performance problems reported by multiple users, even for simple scenes, but these may be unrelated.
A sizable fraction of the main-thread CPU time is going into re-creating bind groups on every frame. This is pure overhead - most bind groups are the same from frame to frame.
Describe the solution you'd like
The popular game dev solution for this in Vulkan land now seems to be bindless operation.
Moving WGPU toward bindless operation was looked at back in 2019-2020. At that time, it seemed that bindless under Vulkan wasn't ready.
It is now. Vulkan implementations have caught up. It's an extension (VK_EXT_descriptor_indexing, promoted to core in Vulkan 1.2 as optional features), but apparently the major platforms now all have it. Even Android has update-after-bind now.
The basic concept of bindless operation is that there's one big array that indexes all the textures, and the subscript into that array identifies the texture. The headache is managing that array. There are several approaches. The highest-performance one seems to be a single big array, with entries added and removed on the fly. This has to be interlocked so that no entry the GPU is looking at is ever altered, which requires a slot allocator that does interlocking at the array element level.
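To make the slot allocator idea concrete, here's a minimal CPU-side sketch in Rust. It is not WGPU or Vulkan API; `Slot`, `SlotAllocator`, and the frame counter are names I made up for illustration. It assumes the renderer can tell the allocator which frame the GPU has finished (for example from a fence), and that `release` is called with non-decreasing frame numbers. A released slot is only reused once every frame that might still read it has completed.

```rust
use std::collections::VecDeque;

/// Index into the (hypothetical) global texture array.
/// Deliberately not `Copy`, so releasing a slot consumes it.
#[derive(Debug, PartialEq, Eq)]
pub struct Slot(u32);

impl Slot {
    /// The subscript a shader would use to find the texture.
    pub fn index(&self) -> u32 {
        self.0
    }
}

/// Hands out array slots and delays reuse until the GPU is done with them.
pub struct SlotAllocator {
    capacity: u32,
    next_unused: u32,              // slots at or above this were never handed out
    free: Vec<u32>,                // slots that are safe to reuse right now
    retired: VecDeque<(u64, u32)>, // (frame in which the slot was released, slot)
}

impl SlotAllocator {
    pub fn new(capacity: u32) -> Self {
        Self {
            capacity,
            next_unused: 0,
            free: Vec::new(),
            retired: VecDeque::new(),
        }
    }

    /// Returns a slot, or `None` if every slot is in use or still in flight.
    pub fn alloc(&mut self) -> Option<Slot> {
        if let Some(index) = self.free.pop() {
            return Some(Slot(index));
        }
        if self.next_unused < self.capacity {
            let index = self.next_unused;
            self.next_unused += 1;
            return Some(Slot(index));
        }
        None
    }

    /// Marks a slot as no longer wanted. It is NOT reusable yet, because
    /// frames up to `current_frame` may still reference it on the GPU.
    /// Assumption: callers pass non-decreasing frame numbers.
    pub fn release(&mut self, slot: Slot, current_frame: u64) {
        self.retired.push_back((current_frame, slot.0));
    }

    /// Call once the GPU has finished `completed_frame`. Slots released in
    /// that frame or earlier move to the free list.
    pub fn reclaim(&mut self, completed_frame: u64) {
        while let Some(&(frame, index)) = self.retired.front() {
            if frame > completed_frame {
                break;
            }
            self.retired.pop_front();
            self.free.push(index);
        }
    }
}
```

This only covers the bookkeeping. Writing the descriptor into the array, and making the allocator usable from several threads, are left out.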
An implication is that bindless index management has to live at the same level that enforces safety. This complicates the problem, because WGPU is supposed to be safe, while not managing memory in a fine-grained way. I'm not sure how this can be fitted into WGPU's API. If the application manages that array itself, the interface will become unsafe.
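One possible shape for a safe interface, sketched in Rust, is to never give the application a raw index it can free. Instead it gets an owning handle, and dropping the handle tells the library the entry can be retired. Everything here (`BindlessTable`, `TextureIndex`, `register`, `take_dropped`) is hypothetical and not part of WGPU. I'm not claiming this is how it should be done, only that the index bookkeeping can sit behind a safe API.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Owning handle to one entry of the (hypothetical) bindless array.
/// The application can read the index but cannot free or reuse it by hand.
pub struct TextureIndex {
    index: u32,
    retire: Sender<u32>,
}

impl TextureIndex {
    /// The subscript to pass to shaders, e.g. in a push constant.
    pub fn index(&self) -> u32 {
        self.index
    }
}

impl Drop for TextureIndex {
    fn drop(&mut self) {
        // Report the index back to the table. If the table is already
        // gone, the send fails and there is nothing left to clean up.
        let _ = self.retire.send(self.index);
    }
}

/// Library-side owner of the array. Only this type decides when an
/// entry is really overwritten.
pub struct BindlessTable {
    next: u32,
    retire_tx: Sender<u32>,
    retire_rx: Receiver<u32>,
}

impl BindlessTable {
    pub fn new() -> Self {
        let (retire_tx, retire_rx) = channel();
        Self { next: 0, retire_tx, retire_rx }
    }

    /// Hands out a fresh index. A real version would reuse retired
    /// entries and write the texture descriptor; this sketch only counts up.
    pub fn register(&mut self) -> TextureIndex {
        let index = self.next;
        self.next += 1;
        TextureIndex { index, retire: self.retire_tx.clone() }
    }

    /// Indices whose handles were dropped since the last call. The library
    /// would hold these back until the GPU has finished the frames using them.
    pub fn take_dropped(&mut self) -> Vec<u32> {
        self.retire_rx.try_iter().collect()
    }
}
```

The handle can be dropped on any thread, and the table decides when the entry is actually overwritten, so the application never gets a chance to alter an entry the GPU is still reading.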
Vulkano faces the same problem. How do they do it? I'll try to find out.
This probably has to be done eventually for WGPU to remain competitive.
Describe alternatives you've considered
Switch from WGPU to Vulkano or Ash.
Try to get the existing binding code in WGPU sped up. It used to be faster back in 0.20, but apparently there were safety problems.
There's a discussion on r/vulkan with links to C++ code that implements the one-big-array approach with a slot allocator.