gfx-rs / wgpu

A cross-platform, safe, pure-Rust graphics API.
https://wgpu.rs
Apache License 2.0

[core] deadlock between `adapter_request_device` and `device_poll` #5807

Open sagudev opened 3 weeks ago

sagudev commented 3 weeks ago

Running `webgpu:api,validation,state,device_lost,destroy:queue,writeTexture,2d,uncompressed_format:*` on https://github.com/sagudev/servo/commit/bb841f16edee6d92baf7638155b5d02d1806160a deadlocks. Backtraces of the two threads involved:

adapter_request_device thread:

thread backtrace
  thread #90, name = 'WGPU'
    frame #0: 0x000071f5b652725d libc.so.6`syscall at syscall.S:38
    frame #1: 0x00005eb1f06178a9 servo`parking_lot::raw_rwlock::RawRwLock::wait_for_readers at linux.rs:112:13
    frame #2: 0x00005eb1f061788c servo`parking_lot::raw_rwlock::RawRwLock::wait_for_readers [inlined] <parking_lot_core::thread_parker::imp::ThreadParker as parking_lot_core::thread_parker::ThreadParkerT>::park at linux.rs:66:13
    frame #3: 0x00005eb1f061786a servo`parking_lot::raw_rwlock::RawRwLock::wait_for_readers at parking_lot.rs:635:36
    frame #4: 0x00005eb1f0617831 servo`parking_lot::raw_rwlock::RawRwLock::wait_for_readers at parking_lot.rs:207:5
    frame #5: 0x00005eb1f0617831 servo`parking_lot::raw_rwlock::RawRwLock::wait_for_readers at parking_lot.rs:600:5
    frame #6: 0x00005eb1f0617831 servo`parking_lot::raw_rwlock::RawRwLock::wait_for_readers(self=0x000071f598463138, timeout=Instant>{...}, prev_value=0) at raw_rwlock.rs:1017:17
    frame #7: 0x00005eb1f06152f1 servo`parking_lot::raw_rwlock::RawRwLock::lock_exclusive_slow(self=0x000071f598463138, timeout=Instant>{...}) at raw_rwlock.rs:647:9
    frame #8: 0x00005eb1ef6e7270 servo`wgpu_core::registry::FutureId<T>::assign [inlined] <parking_lot::raw_rwlock::RawRwLock as lock_api::rwlock::RawRwLock>::lock_exclusive(self=0x000071f598463138) at raw_rwlock.rs:73:26
    frame #9: 0x00005eb1ef6e7259 servo`wgpu_core::registry::FutureId<T>::assign at rwlock.rs:500:9
    frame #10: 0x00005eb1ef6e7259 servo`wgpu_core::registry::FutureId<T>::assign at vanilla.rs:85:33
    frame #11: 0x00005eb1ef6e7259 servo`wgpu_core::registry::FutureId<T>::assign(self=wgpu_core::registry::FutureId<wgpu_core::device::resource::Device<wgpu_hal::vulkan::Api>> @ 0x00005c1e398c5170, value=<unavailable>) at registry.rs:94:34
    frame #12: 0x00005eb1ef7d3271 servo`wgpu_core::instance::<impl wgpu_core::global::Global>::adapter_request_device(self=0x000071f598463010, adapter_id=<unavailable>, desc=<unavailable>, trace_path=Path>{...}, device_id_in=Some({...}), queue_id_in=Some({...})) at instance.rs:1112:34

device_poll thread:

thread backtrace
  thread #91, name = 'WGPU poller'
    frame #0: 0x000071f5b6524ded libc.so.6`__GI___ioctl(fd=14, request=3224397002) at ioctl.c:36:7
    frame #1: 0x000071f5b61bcb00 libdrm.so.2`drmIoctl + 48
    frame #2: 0x000071f5b61c137c libdrm.so.2`drmSyncobjTimelineWait + 76
    frame #3: 0x000071f515de989f libvulkan_radeon.so`___lldb_unnamed_symbol8568 + 671
    frame #4: 0x000071f515ddf165 libvulkan_radeon.so`___lldb_unnamed_symbol8493 + 85
    frame #5: 0x000071f515ddeb37 libvulkan_radeon.so`___lldb_unnamed_symbol8483 + 263
    frame #6: 0x00005eb1ef8ddf69 servo`wgpu_hal::vulkan::device::<impl wgpu_hal::Device for wgpu_hal::vulkan::Device>::wait at device.rs:670:9
    frame #7: 0x00005eb1ef8ddf54 servo`wgpu_hal::vulkan::device::<impl wgpu_hal::Device for wgpu_hal::vulkan::Device>::wait(self=<unavailable>, fence=<unavailable>, wait_value=<unavailable>, timeout_ms=<unavailable>) at device.rs:2047:41
    frame #8: 0x00005eb1ef7fc2ca servo`wgpu_core::device::resource::Device<A>::maintain(self=0x000071f310471010, fence_guard=wgpu_core::lock::vanilla::RwLockReadGuard<core::option::Option<wgpu_hal::vulkan::Fence>> @ r15, maintain=<unavailable>, snatch_guard=<unavailable>) at resource.rs:417:17
    frame #9: 0x00005eb1ef8b6f42 servo`wgpu_core::device::global::<impl wgpu_core::global::Global>::poll_all_devices at global.rs:2148:39
    frame #10: 0x00005eb1ef8b6f22 servo`wgpu_core::device::global::<impl wgpu_core::global::Global>::poll_all_devices at global.rs:2188:21
    frame #11: 0x00005eb1ef8b6d94 servo`wgpu_core::device::global::<impl wgpu_core::global::Global>::poll_all_devices(self=0x000071f598463010, force_wait=<unavailable>) at global.rs:2213:17
    frame #12: 0x00005eb1ef5eb783 servo`webgpu::poll_thread::poll_all_devices(global=<unavailable>, more_work=0x000071f50abfa04f, force_wait=<unavailable>, lock=()) at poll_thread.rs:57:11

sagudev commented 3 weeks ago

I think this is a problem with `self.shared.raw.wait_semaphores`, so it might already be fixed in the latest wgpu versions.
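
For illustration, here is a minimal sketch of the lock interaction the two backtraces suggest. It is not wgpu code: `Registry`, `lock_contention_demo`, and the channel-based "fence wait" are hypothetical stand-ins, and it uses `std::sync::RwLock` with `try_write` (instead of parking_lot and a blocking write) so that it terminates. It assumes the poller holds a read guard on the registry while it waits on the fence, which is an inference from the `poll_all_devices` frames rather than something confirmed above.

```rust
use std::collections::HashMap;
use std::sync::{mpsc, Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a device registry guarded by a read-write lock.
type Registry = RwLock<HashMap<u32, &'static str>>;

/// Returns `(acquired_during_wait, acquired_after_wait)`: whether a writer
/// could take the registry lock while the "poller" was blocked in its wait,
/// and whether it could once the wait had finished.
fn lock_contention_demo() -> (bool, bool) {
    let registry: Arc<Registry> = Arc::new(RwLock::new(HashMap::new()));
    let (holding_tx, holding_rx) = mpsc::channel::<()>();
    let (fence_tx, fence_rx) = mpsc::channel::<()>();

    // "Poller" thread: takes a read guard, then blocks in a simulated fence
    // wait while still holding it (assumed to mirror the device_poll side).
    let poller = {
        let registry = Arc::clone(&registry);
        thread::spawn(move || {
            let _devices = registry.read().unwrap();
            holding_tx.send(()).unwrap();
            // If this wait never returned, the read guard would never drop.
            let _ = fence_rx.recv();
        })
    };

    // Wait until the poller definitely holds the read guard.
    holding_rx.recv().unwrap();

    // "Registering" thread: needs exclusive access to insert a new device.
    // A blocking `write()` would park here, as in the first backtrace.
    let acquired_during_wait = registry.try_write().is_ok();

    // Signal the simulated fence so the poller releases its guard.
    fence_tx.send(()).unwrap();
    poller.join().unwrap();

    let acquired_after_wait = match registry.try_write() {
        Ok(mut devices) => {
            devices.insert(0, "device");
            true
        }
        Err(_) => false,
    };

    (acquired_during_wait, acquired_after_wait)
}
```

Calling `lock_contention_demo()` shows the writer being refused while the reader is stuck in its wait and succeeding once the wait completes. If the wait never completes, a blocking writer never makes progress, which would match the `adapter_request_device` backtrace.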