bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org
Apache License 2.0

Add a bindless mode to `AsBindGroup`. #16368

Open pcwalton opened 1 week ago

pcwalton commented 1 week ago

This patch adds the infrastructure necessary for Bevy to support bindless resources, by adding a new #[bindless] attribute to AsBindGroup.

Classically, only a single texture (or sampler, or buffer) can be attached to each shader binding. This means that switching materials requires breaking a batch and issuing a new draw call, even if the mesh is otherwise identical. This adds significant overhead not only in the driver but also in wgpu, as each bind group switch adds validation work that wgpu must do.

Bindless resources are the typical solution to this problem. Instead of switching bindings between each texture, the renderer supplies a large array of all textures in the scene up front, and the material contains an index into that array. This pattern is repeated for buffers and samplers as well. The renderer then no longer needs to switch binding descriptor sets while drawing the scene.

Unfortunately, as things currently stand, this approach won't quite work for Bevy. Two aspects of wgpu conspire to make this ideal approach unacceptably slow:

  1. In the DX12 backend, all binding arrays (bindless resources) must have a constant size declared in the shader, and all textures in an array must be bound to actual textures. Changing the size requires a recompile.

  2. Changing even one texture incurs revalidation of all textures, a process that takes time that's linear in the total size of the binding array.

This means that declaring a large array of textures big enough to encompass the entire scene is presently unacceptably slow. For example, if you declare 4096 textures, then wgpu will have to revalidate all 4096 textures if even a single one changes. This process can take multiple frames.

To work around this problem, this PR groups bindless resources into small slabs and maintains a free list for each. The size of each slab for the bindless arrays associated with a material is specified via the #[bindless(N)] attribute. For instance, consider the following declaration:

#[derive(AsBindGroup)]
#[bindless(16)]
struct MyMaterial {
    #[buffer(0)]
    color: Vec4,
    #[texture(1)]
    #[sampler(2)]
    diffuse: Handle<Image>,
}

The #[bindless(N)] attribute specifies that, if bindless arrays are supported on the current platform, each resource becomes a binding array of N instances of that resource. So, for MyMaterial above, the color field is exposed to the shader as binding_array<vec4<f32>, 16>, the diffuse texture is exposed to the shader as binding_array<texture_2d<f32>, 16>, and the diffuse sampler is exposed to the shader as binding_array<sampler, 16>. Inside the material's vertex and fragment shaders, the index of the current material within these arrays is available via the material_bind_group_slot field of the Mesh structure. So, for instance, you can access the current color like so:

// `uniform` binding arrays are a non-sequitur, so `uniform` is automatically promoted
// to `storage` in bindless mode.
// The array length matches the N in `#[bindless(N)]` (16 for `MyMaterial` above).
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 16>;
...
@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
    let color = material_color[mesh[in.instance_index].material_bind_group_slot];
    ...
}
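
As a rough illustration of the type mapping described above, the sketch below formats the shader-side type of a resource with and without bindless support. The helper name `shader_binding_type` and its signature are hypothetical and are not part of Bevy or this PR; it only mirrors the rule that each resource becomes a binding array of N instances when bindless is available.

```rust
/// Hypothetical helper (not a Bevy API): returns the WGSL type a resource
/// would have in the shader. `bindless_slab_size` is `Some(N)` when the
/// material uses `#[bindless(N)]` and the platform supports binding arrays,
/// and `None` otherwise.
fn shader_binding_type(wgsl_type: &str, bindless_slab_size: Option<u32>) -> String {
    match bindless_slab_size {
        // Bindless: the resource is wrapped in a fixed-size binding array.
        Some(n) => format!("binding_array<{}, {}>", wgsl_type, n),
        // Non-bindless fallback: the resource keeps its plain type.
        None => wgsl_type.to_string(),
    }
}
```

For the `MyMaterial` declaration with `#[bindless(16)]`, calling this with `"texture_2d<f32>"` and `Some(16)` would yield the `binding_array<texture_2d<f32>, 16>` type mentioned in the text.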

Note that portable shader code can't assume that the current platform supports bindless resources. Indeed, bindless mode is only available on Vulkan and DX12. The BINDLESS shader definition lets you determine whether you're on a bindless platform or not. Thus a portable version of the shader above would look like:

#ifdef BINDLESS
@group(2) @binding(0) var<storage> material_color: binding_array<Color, 16>;
#else // BINDLESS
@group(2) @binding(0) var<uniform> material_color: Color;
#endif // BINDLESS
...
@fragment
fn fragment(in: VertexOutput) -> @location(0) vec4<f32> {
#ifdef BINDLESS
    let color = material_color[mesh[in.instance_index].material_bind_group_slot];
#else // BINDLESS
    let color = material_color;
#endif // BINDLESS
    ...
}
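
To make the slab scheme described earlier more concrete, here is a minimal sketch of a slab allocator with one free list per slab. It is an illustration under stated assumptions, not the code in this PR: the names `BindlessSlot`, `Slab`, and `SlabAllocator` are hypothetical, each slab stands in for one bind group holding N entries, and `slab_capacity` plays the role of the N in #[bindless(N)].

```rust
/// Where one material's resources live: which slab, and which slot inside it.
/// The `slot` is what a shader would use as its index into the binding arrays.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BindlessSlot {
    slab: usize,
    slot: u32,
}

/// One fixed-size slab (hypothetical); conceptually one small bind group.
struct Slab {
    /// Slot indices in this slab that are currently unused.
    free_list: Vec<u32>,
}

impl Slab {
    fn new(capacity: u32) -> Self {
        // Stored in reverse so that `pop` hands out slot 0 first.
        Slab { free_list: (0..capacity).rev().collect() }
    }
}

/// Hypothetical allocator that grows by whole slabs of `slab_capacity` slots.
struct SlabAllocator {
    slab_capacity: u32,
    slabs: Vec<Slab>,
}

impl SlabAllocator {
    fn new(slab_capacity: u32) -> Self {
        assert!(slab_capacity > 0, "slab capacity must be nonzero");
        SlabAllocator { slab_capacity, slabs: Vec::new() }
    }

    /// Takes a free slot from the first slab that has one,
    /// adding a new slab only when every existing slab is full.
    fn allocate(&mut self) -> BindlessSlot {
        for (slab, s) in self.slabs.iter_mut().enumerate() {
            if let Some(slot) = s.free_list.pop() {
                return BindlessSlot { slab, slot };
            }
        }
        let mut new_slab = Slab::new(self.slab_capacity);
        let slot = new_slab.free_list.pop().expect("capacity is nonzero");
        self.slabs.push(new_slab);
        BindlessSlot { slab: self.slabs.len() - 1, slot }
    }

    /// Returns a slot to its slab's free list so a later material can reuse it.
    fn free(&mut self, location: BindlessSlot) {
        self.slabs[location.slab].free_list.push(location.slot);
    }
}
```

The point of this design is that a change to one material touches only the slab it lives in, so the work that is linear in array size is bounded by N rather than by the number of materials in the scene.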

Importantly, this PR doesn't update StandardMaterial to be bindless. So, for example, scene_viewer will currently not run any faster. I intend to update StandardMaterial to use bindless mode in a follow-up patch.

A new example, shaders/shader_material_bindless, has been added to demonstrate how to use this new feature.

Here's a Tracy profile of submit_graph_commands with this patch plus an additional patch (not submitted yet) that makes StandardMaterial use bindless. Red is those patches; yellow is main. The scene was Bistro Exterior with a hack that forces all textures to opaque. You can see a 1.47x mean speedup.

[Image: Screenshot 2024-11-12 161713]

Migration Guide

github-actions[bot] commented 1 week ago

The generated examples/README.md is out of sync with the example metadata in Cargo.toml or the example readme template. Please run cargo run -p build-templated-pages -- update examples to update it, and commit the file change.