cwfitzgerald closed 3 months ago
I ran the poll tests locally and this is what I'm seeing:
DX12 on Windows 11 with Intel iGPU, Nvidia dGPU, and WARP errors with:
[2022-11-10T11:04:34Z ERROR wgpu_hal::auxil::dxgi::exception] ID3D12Resource2::<final-release>: CORRUPTION: An ID3D12Resource object (0x00000204A3036F50:'Unnamed Object') is referenced by GPU operations in-flight on Command Queue (0x000002049A790500:'Unnamed ID3D12CommandQueue Object'). It is not safe to final-release objects that may have GPU operations pending. This can result in application instability. [ EXECUTION ERROR #921: OBJECT_DELETED_WHILE_STILL_IN_USE]
Vulkan on Windows 11 with Intel iGPU and Nvidia dGPU passes
Vulkan on Linux with llvmpipe and swiftshader segfaults
OpenGL on Linux with llvmpipe and Intel iGPU passes
In #3174 the CI seems to have the same issues (segfaults in linux/vulkan and errors with OBJECT_DELETED_WHILE_STILL_IN_USE on windows/dx12). So, finding and fixing the underlying bug might also fix that test case.
It would be nice to see if this still occurs after arcanization.
I tested it and it does.
https://github.com/gfx-rs/wgpu/pull/3873 seems to have removed all the .skip(FailureCase::always()) calls from the poll tests.
I tried running the poll tests on the parent commit and they failed.
I can't tell why they are passing with https://github.com/gfx-rs/wgpu/pull/3873; I don't see any changes in wgpu-core.
@cwfitzgerald do you know why they are working now?
Ah, I see what happened: https://github.com/gfx-rs/wgpu/pull/3873 added a DummyWorkData struct that contains the CommandBuffer resources so that they are not dropped before the CommandBuffer.
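To illustrate the keep-alive pattern described above, here is a minimal, std-only sketch. It is not the code from the PR: `Resource`, `CommandBuffer`, `WorkData`, and `resource_alive_while_work_held` are hypothetical stand-ins, and the real test code works with actual wgpu types. The idea is just that bundling the resources with the command buffer keeps them alive for as long as the bundle exists.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a GPU resource; it records when it is destroyed.
struct Resource {
    dropped: Arc<AtomicBool>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical stand-in for a recorded command buffer that refers to
/// resources without owning them.
struct CommandBuffer;

/// Hypothetical bundle: the command buffer plus the resources it uses.
/// Rust drops fields in declaration order, so the command buffer is
/// dropped before the resources.
struct WorkData {
    _cmd_buf: CommandBuffer,
    _resources: Vec<Arc<Resource>>,
}

/// Returns (was the resource dropped while the bundle was held,
///          was it dropped after the bundle was released).
fn resource_alive_while_work_held() -> (bool, bool) {
    let flag = Arc::new(AtomicBool::new(false));
    let res = Arc::new(Resource { dropped: Arc::clone(&flag) });

    let work = WorkData {
        _cmd_buf: CommandBuffer,
        _resources: vec![Arc::clone(&res)],
    };

    // The caller lets go of its own handle; the bundle still holds one.
    drop(res);
    let dropped_while_held = flag.load(Ordering::SeqCst);

    // Releasing the bundle releases the last handle to the resource.
    drop(work);
    let dropped_after = flag.load(Ordering::SeqCst);

    (dropped_while_held, dropped_after)
}
```

In this sketch the resource survives the caller dropping its handle and is only destroyed once the bundle goes away, which is the behaviour that would mask the underlying tracking bug in the tests.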
I bisected this being fixed by https://github.com/gfx-rs/wgpu/commit/aade481bdf7f8f9ae18423bf9f0dc1279844f37e (https://github.com/gfx-rs/wgpu/pull/4894). More specifically by this being removed:
//Releasing safely unused resources to decrement refcount
bind_group.used_buffer_ranges.write().clear();
bind_group.used_texture_ranges.write().clear();
bind_group.dynamic_binding_info.write().clear();
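As a rough sketch of why clearing those lists matters, assume (as the comment in the snippet suggests) that they hold reference-counted handles to resources. The `Buffer` and `BindGroup` types below are hypothetical and use `std::sync::RwLock` rather than whatever lock wgpu-core actually uses; the point is only that clearing the list drops the bind group's references.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a buffer resource.
struct Buffer;

/// Hypothetical bind group that keeps the buffers it uses alive
/// through reference-counted handles.
struct BindGroup {
    used_buffers: RwLock<Vec<Arc<Buffer>>>,
}

/// Returns the buffer's strong count before and after the bind group
/// clears its list of used buffers.
fn strong_counts_around_clear() -> (usize, usize) {
    let buffer = Arc::new(Buffer);
    let bind_group = BindGroup {
        used_buffers: RwLock::new(vec![Arc::clone(&buffer)]),
    };

    // One handle held here, one held by the bind group.
    let before = Arc::strong_count(&buffer);

    // Clearing the list releases the bind group's handle.
    bind_group.used_buffers.write().unwrap().clear();
    let after = Arc::strong_count(&buffer);

    (before, after)
}
```

If the handle held by the caller is also released while GPU work is still in flight, nothing keeps the buffer alive any more, which would be consistent with the OBJECT_DELETED_WHILE_STILL_IN_USE error reported earlier in the thread.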
Poll tests (https://github.com/gfx-rs/wgpu/blob/master/wgpu/tests/poll.rs) are currently completely disabled due to bugs in resource tracking. I believe the problem stems from the resources being completely dropped by the time the device is maintained.