gfx-rs / wgpu

A cross-platform, safe, pure-Rust graphics API.
https://wgpu.rs
Apache License 2.0

Debug build of example Shadow leaks memory #6027

Open pakanek opened 1 month ago

pakanek commented 1 month ago

Description

The wgpu Shadow example leaks memory in a debug build.

Repro steps

$ cargo run --bin wgpu-examples shadow
# or
$ valgrind ./target/debug/wgpu-examples shadow

Expected vs observed behavior

Debug build leaks ~1 MB/s on my system.

==14489== Warning: invalid file descriptor -1 in syscall close()
==14489==
==14489== HEAP SUMMARY:
==14489==     in use at exit: 537,263 bytes in 11,115 blocks
==14489==   total heap usage: 297,687 allocs, 286,572 frees, 209,850,754 bytes allocated
==14489==
==14489== LEAK SUMMARY:
==14489==    definitely lost: 264,260 bytes in 8,760 blocks
==14489==    indirectly lost: 0 bytes in 0 blocks
==14489==      possibly lost: 84 bytes in 1 blocks
==14489==    still reachable: 272,887 bytes in 2,353 blocks
==14489==         suppressed: 32 bytes in 1 blocks
==14489== Rerun with --leak-check=full to see details of leaked memory
==14489==
==14489== Use --track-origins=yes to see where uninitialised values come from
==14489== For lists of detected and suppressed errors, rerun with: -s
==14489== ERROR SUMMARY: 15578 errors from 13 contexts (suppressed: 0 from 0)

Extra materials

$ valgrind ./target/release/wgpu-examples shadow 2>log-debug.txt

(repeated lines from log replaced with "...")

log-debug.txt log-release.txt

Platform

Linux Fedora 40, kernel 6.9.9, x86_64
AMD Ryzen 5 7600
AMD Radeon RX 6500 XT
wgpu: commit 6d7975eb3b443f6ecc9c81495abdd555eaf18eec

pakanek commented 1 month ago

Other examples that also leak: boids, cube, srgb_blend. The rest of the examples seem to be fine.

teoxoy commented 1 month ago

Can you pinpoint the leak to some area of the code?

pakanek commented 1 month ago

Would the output of $ valgrind --leak-check=full target/debug/wgpu-examples shadow help?

==41140== Warning: invalid file descriptor -1 in syscall close()
==41140== 
==41140== HEAP SUMMARY:
==41140==     in use at exit: 956,097 bytes in 24,999 blocks
==41140==   total heap usage: 661,838 allocs, 636,839 frees, 277,890,198 bytes allocated
==41140== 
==41140== 84 bytes in 1 blocks are possibly lost in loss record 2,209 of 2,246
==41140==    at 0x4843866: malloc (vg_replace_malloc.c:446)
==41140==    by 0x195CEBF: alloc (alloc.rs:98)
==41140==    by 0x195CEBF: alloc::alloc::Global::alloc_impl (alloc.rs:181)
==41140==    by 0x195D748: <alloc::alloc::Global as core::alloc::Allocator>::allocate (alloc.rs:241)
==41140==    by 0x195D847: hashbrown::raw::alloc::inner::do_alloc (alloc.rs:15)
==41140==    by 0x1953E72: hashbrown::raw::RawTableInner::new_uninitialized (mod.rs:1752)
==41140==    by 0x195429E: hashbrown::raw::RawTableInner::fallible_with_capacity (mod.rs:1790)
==41140==    by 0x1952D33: hashbrown::raw::RawTableInner::prepare_resize (mod.rs:2869)
==41140==    by 0x15571FF: resize_inner<alloc::alloc::Global> (mod.rs:3065)
==41140==    by 0x15571FF: reserve_rehash_inner<alloc::alloc::Global> (mod.rs:2955)
==41140==    by 0x15571FF: hashbrown::raw::RawTable<T,A>::reserve_rehash (mod.rs:1233)
==41140==    by 0x155DFEE: hashbrown::raw::RawTable<T,A>::reserve (mod.rs:1181)
==41140==    by 0x1561261: reserve<usize, usize, std::hash::random::RandomState, alloc::alloc::Global> (map.rs:1106)
==41140==    by 0x1561261: hashbrown::rustc_entry::<impl hashbrown::map::HashMap<K,V,S,A>>::rustc_entry (rustc_entry.rs:46)
==41140==    by 0x139F37B: std::collections::hash::map::HashMap<K,V,S>::entry (map.rs:853)
==41140==    by 0x1408954: wgpu_hal::gles::egl::initialize_display (egl.rs:446)
==41140== 
==41140== 52,836 bytes in 3,774 blocks are definitely lost in loss record 2,242 of 2,246
==41140==    at 0x4843866: malloc (vg_replace_malloc.c:446)
==41140==    by 0x118682B0: ???
==41140==    by 0x11868B2E: ???
==41140==    by 0x15734CD: ash::extensions::ext::debug_utils::<impl ash::extensions_generated::ext::debug_utils::Device>::cmd_begin_debug_utils_label (debug_utils.rs:34)
==41140==    by 0x141FB9A: wgpu_hal::vulkan::command::<impl wgpu_hal::CommandEncoder for wgpu_hal::vulkan::CommandEncoder>::begin_debug_marker (command.rs:852)
==41140==    by 0xFCCC6E: wgpu_core::command::<impl wgpu_core::global::Global>::command_encoder_push_debug_group (mod.rs:681)
==41140==    by 0x11E4695: <wgpu::backend::wgpu_core::ContextWgpuCore as wgpu::context::Context>::command_encoder_push_debug_group (wgpu_core.rs:2071)
==41140==    by 0xEE4A10: <T as wgpu::context::DynContext>::command_encoder_push_debug_group (context.rs:2883)
==41140==    by 0x1129765: wgpu::api::command_encoder::CommandEncoder::push_debug_group (command_encoder.rs:293)
==41140==    by 0x34DA47: <wgpu_examples::shadow::Example as wgpu_examples::framework::Example>::render (mod.rs:749)
==41140==    by 0x57F46B: wgpu_examples::framework::start::{{closure}}::{{closure}} (framework.rs:467)
==41140==    by 0x559EAF: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut (function.rs:294)
==41140== 
==41140== 86,802 bytes in 3,774 blocks are definitely lost in loss record 2,244 of 2,246
==41140==    at 0x4843866: malloc (vg_replace_malloc.c:446)
==41140==    by 0x118682B0: ???
==41140==    by 0x11868B2E: ???
==41140==    by 0x15734CD: ash::extensions::ext::debug_utils::<impl ash::extensions_generated::ext::debug_utils::Device>::cmd_begin_debug_utils_label (debug_utils.rs:34)
==41140==    by 0x141FB9A: wgpu_hal::vulkan::command::<impl wgpu_hal::CommandEncoder for wgpu_hal::vulkan::CommandEncoder>::begin_debug_marker (command.rs:852)
==41140==    by 0xFCCC6E: wgpu_core::command::<impl wgpu_core::global::Global>::command_encoder_push_debug_group (mod.rs:681)
==41140==    by 0x11E4695: <wgpu::backend::wgpu_core::ContextWgpuCore as wgpu::context::Context>::command_encoder_push_debug_group (wgpu_core.rs:2071)
==41140==    by 0xEE4A10: <T as wgpu::context::DynContext>::command_encoder_push_debug_group (context.rs:2883)
==41140==    by 0x1129765: wgpu::api::command_encoder::CommandEncoder::push_debug_group (command_encoder.rs:293)
==41140==    by 0x34DD5E: <wgpu_examples::shadow::Example as wgpu_examples::framework::Example>::render (mod.rs:798)
==41140==    by 0x57F46B: wgpu_examples::framework::start::{{closure}}::{{closure}} (framework.rs:467)
==41140==    by 0x559EAF: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut (function.rs:294)
==41140== 
==41140== 120,768 bytes in 7,548 blocks are definitely lost in loss record 2,245 of 2,246
==41140==    at 0x4843866: malloc (vg_replace_malloc.c:446)
==41140==    by 0x118682B0: ???
==41140==    by 0x11868C45: ???
==41140==    by 0x157353D: ash::extensions::ext::debug_utils::<impl ash::extensions_generated::ext::debug_utils::Device>::cmd_insert_debug_utils_label (debug_utils.rs:50)
==41140==    by 0x141FA9A: wgpu_hal::vulkan::command::<impl wgpu_hal::CommandEncoder for wgpu_hal::vulkan::CommandEncoder>::insert_debug_marker (command.rs:845)
==41140==    by 0xFCD6E3: wgpu_core::command::<impl wgpu_core::global::Global>::command_encoder_insert_debug_marker (mod.rs:718)
==41140==    by 0x11E41E5: <wgpu::backend::wgpu_core::ContextWgpuCore as wgpu::context::Context>::command_encoder_insert_debug_marker (wgpu_core.rs:2054)
==41140==    by 0xEE4950: <T as wgpu::context::DynContext>::command_encoder_insert_debug_marker (context.rs:2872)
==41140==    by 0x11296C5: wgpu::api::command_encoder::CommandEncoder::insert_debug_marker (command_encoder.rs:282)
==41140==    by 0x34E673: <wgpu_examples::shadow::Example as wgpu_examples::framework::Example>::render (mod.rs:766)
==41140==    by 0x57F46B: wgpu_examples::framework::start::{{closure}}::{{closure}} (framework.rs:467)
==41140==    by 0x559EAF: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut (function.rs:294)
==41140== 
==41140== 422,688 bytes in 7,548 blocks are definitely lost in loss record 2,246 of 2,246
==41140==    at 0x4843866: malloc (vg_replace_malloc.c:446)
==41140==    by 0x118682B0: ???
==41140==    by 0x11868B2E: ???
==41140==    by 0x15734CD: ash::extensions::ext::debug_utils::<impl ash::extensions_generated::ext::debug_utils::Device>::cmd_begin_debug_utils_label (debug_utils.rs:34)
==41140==    by 0x141FB9A: wgpu_hal::vulkan::command::<impl wgpu_hal::CommandEncoder for wgpu_hal::vulkan::CommandEncoder>::begin_debug_marker (command.rs:852)
==41140==    by 0xFCCC6E: wgpu_core::command::<impl wgpu_core::global::Global>::command_encoder_push_debug_group (mod.rs:681)
==41140==    by 0x11E4695: <wgpu::backend::wgpu_core::ContextWgpuCore as wgpu::context::Context>::command_encoder_push_debug_group (wgpu_core.rs:2071)
==41140==    by 0xEE4A10: <T as wgpu::context::DynContext>::command_encoder_push_debug_group (context.rs:2883)
==41140==    by 0x1129765: wgpu::api::command_encoder::CommandEncoder::push_debug_group (command_encoder.rs:293)
==41140==    by 0x34E5B1: <wgpu_examples::shadow::Example as wgpu_examples::framework::Example>::render (mod.rs:751)
==41140==    by 0x57F46B: wgpu_examples::framework::start::{{closure}}::{{closure}} (framework.rs:467)
==41140==    by 0x559EAF: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut (function.rs:294)
==41140== 
==41140== LEAK SUMMARY:
==41140==    definitely lost: 683,094 bytes in 22,644 blocks
==41140==    indirectly lost: 0 bytes in 0 blocks
==41140==      possibly lost: 84 bytes in 1 blocks
==41140==    still reachable: 272,887 bytes in 2,353 blocks
==41140==         suppressed: 32 bytes in 1 blocks
==41140== Reachable blocks (those to which a pointer was found) are not shown.
==41140== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==41140== 
==41140== Use --track-origins=yes to see where uninitialised values come from
==41140== For lists of detected and suppressed errors, rerun with: -s
==41140== ERROR SUMMARY: 27231 errors from 13 contexts (suppressed: 0 from 0)

Running the other examples gives similar output: the culprits seem to be wgpu_hal::CommandEncoder::begin_debug_marker and wgpu_hal::CommandEncoder::insert_debug_marker, reached via wgpu::api::command_encoder::CommandEncoder::push_debug_group() and wgpu::api::command_encoder::CommandEncoder::insert_debug_marker(). In the Shadow example, these calls are made in the implementation of the render method.

I can post the valgrind output for the other 3 examples, but I don't want to spam the thread needlessly.
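To check which calls the "definitely lost" records point at, the valgrind log can be tallied mechanically. The following is a minimal sketch, assuming the log has the same layout as the output above; the function names are hypothetical helpers written for illustration and are not part of wgpu or valgrind.

```rust
use std::collections::BTreeMap;

/// Read the byte count from a loss-record header such as
/// "==41140== 52,836 bytes in 3,774 blocks are definitely lost in loss record ...".
/// Returns None for any other line, including "possibly lost" records.
fn definitely_lost_bytes(line: &str) -> Option<u64> {
    if !line.contains("are definitely lost") {
        return None;
    }
    // Token 0 is the "==PID==" prefix, token 1 is the byte count with commas.
    let count = line.split_whitespace().nth(1)?;
    count.replace(',', "").parse().ok()
}

/// Extract the last path segment of a stack frame's symbol,
/// e.g. "begin_debug_marker" from "...::begin_debug_marker (command.rs:852)".
fn method_name(frame: &str) -> String {
    // Drop the trailing " (file:line)" part if present.
    let symbol = frame.rsplit_once(" (").map_or(frame, |(sym, _)| sym);
    symbol.rsplit("::").next().unwrap_or(symbol).trim().to_owned()
}

/// Sum "definitely lost" bytes per wgpu_hal method, attributing each
/// loss record to the first `wgpu_hal::` frame in its stack.
pub fn lost_bytes_by_hal_method(log: &str) -> BTreeMap<String, u64> {
    let mut totals = BTreeMap::new();
    let mut pending: Option<u64> = None; // bytes of the record being read
    for line in log.lines() {
        if line.contains("in loss record") {
            // A new record starts; only "definitely lost" ones are counted.
            pending = definitely_lost_bytes(line);
        } else if line.contains("wgpu_hal::") {
            if let Some(bytes) = pending.take() {
                *totals.entry(method_name(line)).or_insert(0) += bytes;
            }
        }
    }
    totals
}
```

Summing the four "definitely lost" records in the log above by hand gives 562,326 bytes under begin_debug_marker (52,836 + 86,802 + 422,688) and 120,768 bytes under insert_debug_marker, which together match the 683,094 bytes in the leak summary.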

teoxoy commented 1 month ago

Examples that are built with debug_assertions turn on InstanceFlags::DEBUG, which enables the VK_EXT_debug_utils extension. From what I can see, the leaks are happening in Mesa's VK_EXT_debug_utils implementation.
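The mechanism described here, a debug flag that follows the build profile and gates the label calls, can be modelled in a few lines. This is a simplified sketch only: DebugFlags, from_build_profile and push_debug_group below are hypothetical stand-ins and not the real wgpu types or signatures, and the assumption that labels are skipped entirely when the flag is off is mine.

```rust
/// Hypothetical stand-in for wgpu's InstanceFlags; not the real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DebugFlags {
    /// When true, debug labels are forwarded to the driver
    /// (on Vulkan this would go through VK_EXT_debug_utils).
    pub debug: bool,
}

impl DebugFlags {
    /// Derive the default from the build profile: `cfg!(debug_assertions)`
    /// is true in a default `cargo build` and false with `--release`.
    pub fn from_build_profile() -> Self {
        Self { debug: cfg!(debug_assertions) }
    }
}

/// Record a label only when the debug flag is on (assumed behavior).
/// Returns whether the label was recorded.
pub fn push_debug_group(flags: DebugFlags, labels: &mut Vec<String>, label: &str) -> bool {
    if flags.debug {
        labels.push(label.to_owned());
    }
    flags.debug
}
```

Under this model a release build never reaches the driver's label code at all, which would be consistent with the leak showing up only in debug builds.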

teoxoy commented 1 month ago

This MR looks relevant: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16655. However, it seems they do free the labels.