gfx-rs / wgpu

A cross-platform, safe, pure-Rust graphics API.
https://wgpu.rs
Apache License 2.0
12.31k stars 906 forks source link

Validation of QuerySet and CommandEncoder::resolve_query_set() Has Holes #3612

Open simonask opened 1 year ago

simonask commented 1 year ago

I was trying to benchmark some compute shaders (Vulkan) and ran into a deadlock in Bevy.

The deadlock turned out to be due to a call to vkWaitForFences never returning while holding the exclusive device lock.

The Vulkan validation layer indicated some buffer mapping errors that weren't very obviously related (but are).

The root cause of the issue turned out to be my combination of write_timestamp and resolve_query_set. You see, the query_range parameter to resolve_query_set really matters. Providing a range that covers parts of the query set that weren't actually written to by write_timestamp (and I assue other types of queries as well) implies that a memory fence is created for each location, which then never resolves. In other word, the hardware is waiting for an event that never happens.

All of this makes sense and is not a bug in wgpu. I would even say that it is expected once you see how it works. I would just have saved a lot of time if there had been a comment in the documentation for CommandEncoder::resolve_query_set() mentioning that you really need to keep track of the indices in the query set that are being written to, and provide the appropriate range(s) to resolve_query_set, and nothing more.

Pseudo-repro:

let query_set = device.create_query_set(wgpu::QuerySetDescriptor {
    ty: wgpu::QueryType::Timestamp,
    count: 16,
    ...
});

let timestamps_buffer = device.create_buffer(wgpu::BufferDescriptor {
    usage: wgpu::BufferUsages::COPY_DST | wgpu::BufferUsages::MAP_READ,
    size: 16 * size_of::<u64>(),
    ...
});

let mut encoder = device.create_command_encoder(...);
encoder.write_timestamp(&query_set, 0);
// Do some work
encoder.write_timestamp(&query_set, 1);

encoder.resolve_query_set(
    &query_set,
    0..16, // XXX: Only wrote 2 timestamps, this will hang! See below.
    &timestamps_buffer,
    0,
);

// Note: It doesn't hang here.
queue.submit(Some(encoder.finish()));

In my case, the hang was observed when Bevy tried to acquire a surface texture for rendering. Something in that codepath has a call to vkWaitForFences(), which makes sense. But it just took me a while to narrow it down to the call to resolve_query_set.

cwfitzgerald commented 1 year ago

Thank you for filling! We really need to cover this with more tests.

All of this makes sense and is not a bug in wgpu. I

Disagree, validation should catch this and cause an error cpu side instead of just deadlocking. While deadlocking is allowed it is stupid hard to debug, and we should improve this experience as much as possible.