baldurk / renderdoc

RenderDoc is a stand-alone graphics debugging tool.
https://renderdoc.org
MIT License

Assertion failure inside `vk_common_CmdBeginRenderPass2` when closing Renderdoc after capture #2883

Closed Aaron1011 closed 1 year ago

Aaron1011 commented 1 year ago

Description

I'm attempting to use RenderDoc to debug the Ruffle Flash emulator (https://github.com/ruffle-rs/ruffle). I can press F12 to capture a frame. However, after I close the application (Ruffle) and the 'Loading capture' progress bar completes, RenderDoc segfaults about 5 seconds later.

I can reproduce this segfault both under my native NVIDIA Linux driver and under Mesa lavapipe. Using debug builds of both RenderDoc and Mesa, I captured the following backtrace from gdb:

gef➤  bt
#0  vk_common_CmdBeginRenderPass2 (commandBuffer=0x7fec84447e10, pRenderPassBeginInfo=0x7fedde7fa590, pSubpassBeginInfo=<optimized out>) at ../mesa-23.0.0/src/vulkan/runtime/vk_render_pass.c:2195
#1  0x00007feddd8d1dbd in vk_common_CmdBeginRenderPass (commandBuffer=<optimized out>, pRenderPassBegin=<optimized out>, contents=<optimized out>) at ../mesa-23.0.0/src/vulkan/runtime/vk_render_pass.c:257
#2  0x00007fee7afbc637 in VulkanDebugManager::FillWithDiscardPattern (this=0x7fed98c1fe50, cmd=0x7fecd20f1390, type=DiscardType::RenderPassStore, image=0x55b6f15794b0, curLayout=VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL, 
    discardRange=..., discardRect=...) at /home/aaron/repos/renderdoc/renderdoc/driver/vulkan/vk_debug.cpp:2066
#3  0x00007fee7b54f58c in WrappedVulkan::Serialise_vkCmdEndRenderPass<ReadSerialiser> (this=0x7fed980090e0, ser=..., commandBuffer=0x7fecd20f1390) at /home/aaron/repos/renderdoc/renderdoc/driver/vulkan/wrappers/vk_cmd_funcs.cpp:2299
#4  0x00007fee7af5ddb9 in WrappedVulkan::ProcessChunk (this=0x7fed980090e0, ser=..., chunk=VulkanChunk::vkCmdEndRenderPass) at /home/aaron/repos/renderdoc/renderdoc/driver/vulkan/vk_core.cpp:3392
#5  0x00007fee7af5d41a in WrappedVulkan::ContextProcessChunk (this=0x7fed980090e0, ser=..., chunk=VulkanChunk::vkCmdEndRenderPass) at /home/aaron/repos/renderdoc/renderdoc/driver/vulkan/vk_core.cpp:3257
#6  0x00007fee7af5be1d in WrappedVulkan::ContextReplayLog (this=0x7fed980090e0, readType=CaptureState::ActiveReplaying, startEventID=0x1, endEventID=0x2db, partial=0x0)
    at /home/aaron/repos/renderdoc/renderdoc/driver/vulkan/vk_core.cpp:2968
#7  0x00007fee7af609cb in WrappedVulkan::ReplayLog (this=0x7fed980090e0, startEventID=0x1, endEventID=0x2dc, replayType=eReplay_WithoutDraw) at /home/aaron/repos/renderdoc/renderdoc/driver/vulkan/vk_core.cpp:3959
#8  0x00007fee7b0f68a0 in VulkanReplay::ReplayLog (this=0x7fed9800cf40, endEventID=0x2dc, replayType=eReplay_WithoutDraw) at /home/aaron/repos/renderdoc/renderdoc/driver/vulkan/vk_replay.cpp:210
#9  0x00007fee7bda91b6 in ReplayController::SetFrameEvent (this=0x7fed98002a70, eventId=0x2dc, force=0x1) at /home/aaron/repos/renderdoc/renderdoc/replay/replay_controller.cpp:78
#10 0x000055b6eff1d69b in operator() (__closure=0x55b6f2369a40, r=0x7fed98002a70) at ../../qrenderdoc/Code/CaptureContext.cpp:1608
#11 0x000055b6eff316fc in std::__invoke_impl<void, CaptureContext::SetEventID(const rdcarray<ICaptureViewer*>&, uint32_t, uint32_t, bool)::<lambda(IReplayController*)>&, IReplayController*>(std::__invoke_other, struct {...} &) (__f=...)
    at /usr/include/c++/12.2.1/bits/invoke.h:61
#12 0x000055b6eff2e26b in std::__invoke_r<void, CaptureContext::SetEventID(const rdcarray<ICaptureViewer*>&, uint32_t, uint32_t, bool)::<lambda(IReplayController*)>&, IReplayController*>(struct {...} &) (__fn=...)
    at /usr/include/c++/12.2.1/bits/invoke.h:154
#13 0x000055b6eff2a24e in std::_Function_handler<void(IReplayController*), CaptureContext::SetEventID(const rdcarray<ICaptureViewer*>&, uint32_t, uint32_t, bool)::<lambda(IReplayController*)> >::_M_invoke(const std::_Any_data &, IReplayController *&&) (__functor=..., __args#0=@0x7fedde7fb970: 0x7fed98002a70) at /usr/include/c++/12.2.1/bits/std_function.h:290
#14 0x000055b6eff04d9b in std::function<void (IReplayController*)>::operator()(IReplayController*) const (this=0x55b6f26ad598, __args#0=0x7fed98002a70) at /usr/include/c++/12.2.1/bits/std_function.h:591
#15 0x000055b6efeffc81 in ReplayManager::run(int, QString const&, ReplayOptions const&, std::function<void (float)>) (this=0x7ffc60eed880, proxyRenderer=0xffffffff, capturefile=..., opts=..., progress=...)
    at ../../qrenderdoc/Code/ReplayManager.cpp:495
#16 0x000055b6efefcc58 in operator() (__closure=0x7fee08001530) at ../../qrenderdoc/Code/ReplayManager.cpp:58
#17 0x000055b6eff02eac in std::__invoke_impl<void, ReplayManager::OpenCapture(const QString&, const ReplayOptions&, RENDERDOC_ProgressCallback)::<lambda()>&>(std::__invoke_other, struct {...} &) (__f=...)
    at /usr/include/c++/12.2.1/bits/invoke.h:61
#18 0x000055b6eff018de in std::__invoke_r<void, ReplayManager::OpenCapture(const QString&, const ReplayOptions&, RENDERDOC_ProgressCallback)::<lambda()>&>(struct {...} &) (__fn=...) at /usr/include/c++/12.2.1/bits/invoke.h:154
#19 0x000055b6eff00a4f in std::_Function_handler<void(), ReplayManager::OpenCapture(const QString&, const ReplayOptions&, RENDERDOC_ProgressCallback)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
    at /usr/include/c++/12.2.1/bits/std_function.h:290
#20 0x000055b6efdeffcc in std::function<void ()>::operator()() const (this=0x7fee08004590) at /usr/include/c++/12.2.1/bits/std_function.h:591
#21 0x000055b6eff041d5 in LambdaThread::process (this=0x7fee08004580) at ../../qrenderdoc/Code/QRDUtils.h:510
#22 0x000055b6eff059f4 in QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, void (LambdaThread::*)()>::call(void (LambdaThread::*)(), LambdaThread*, void**) (
    f=(void (LambdaThread::*)(LambdaThread * const)) 0x55b6eff04196 <LambdaThread::process()>, o=0x7fee08004580, arg=0x7fedde7fbd60) at /usr/include/qt/QtCore/qobjectdefs_impl.h:152
#23 0x000055b6eff05711 in QtPrivate::FunctionPointer<void (LambdaThread::*)()>::call<QtPrivate::List<>, void>(void (LambdaThread::*)(), LambdaThread*, void**) (
    f=(void (LambdaThread::*)(LambdaThread * const)) 0x55b6eff04196 <LambdaThread::process()>, o=0x7fee08004580, arg=0x7fedde7fbd60) at /usr/include/qt/QtCore/qobjectdefs_impl.h:185
#24 0x000055b6eff052cb in QtPrivate::QSlotObject<void (LambdaThread::*)(), QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase*, QObject*, void**, bool*) (which=0x1, this_=0x7fee08006b60, r=0x7fee08004580, a=0x7fedde7fbd60, 
    ret=0x0) at /usr/include/qt/QtCore/qobjectdefs_impl.h:418
#25 0x00007fee796bea71 in ?? () from /usr/lib/libQt5Core.so.5
#26 0x00007fee794e243f in QThread::started(QThread::QPrivateSignal) () from /usr/lib/libQt5Core.so.5
#27 0x00007fee794e4311 in ?? () from /usr/lib/libQt5Core.so.5
#28 0x00007fee78e9ebb5 in ?? () from /usr/lib/libc.so.6
#29 0x00007fee78f20d90 in ?? () from /usr/lib/libc.so.6

The relevant line from Mesa is: https://gitlab.freedesktop.org/mesa/mesa/-/blob/23.0/src/vulkan/runtime/vk_render_pass.c#L2195

assert(pass->attachment_count == framebuffer->attachment_count);
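For readers unfamiliar with this check: the assertion requires the framebuffer to supply exactly as many attachments as the render pass declares. A minimal sketch of that invariant follows. The structs and the `attachmentCountsMatch` helper are hypothetical stand-ins for illustration, not Mesa's actual types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the driver's internal objects. The real Mesa
// structures carry far more state than the single field shown here.
struct RenderPassInfo
{
  uint32_t attachment_count;
};

struct FramebufferInfo
{
  uint32_t attachment_count;
};

// Returns true when the framebuffer provides one attachment per attachment
// declared by the render pass, which is the condition the assert enforces.
bool attachmentCountsMatch(const RenderPassInfo &pass, const FramebufferInfo &fb)
{
  return pass.attachment_count == fb.attachment_count;
}
```

A render pass that declares two attachments paired with a framebuffer that holds only one would fail this check, which is the kind of mismatch the backtrace points to.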

Steps to reproduce

The capture file can be found at: https://drive.google.com/file/d/1ym48yNSEZx2Ge-GraQXlAIq-t0OCUFCC/view?usp=share_link

Unfortunately, reproducing this is a somewhat involved process:

  1. Clone https://github.com/Aaron1011/ruffle and checkout branch fpa-remix (commit ece1ceceb667702b908de6aaba807b2f2f181760)
  2. Follow the 'Building from Source' guide in Ruffle: https://github.com/ruffle-rs/ruffle#building-from-source
  3. In the root of the clone of ruffle, run cargo build --release -p ruffle_desktop
  4. Download fpa-world-1-remix_temp.swf from https://drive.google.com/file/d/1qcfscEPAZV49fBj1drsH-ZX-fBcdzYtw/view?usp=share_link
  5. Launch RenderDoc with VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json, or the equivalent path to the lavapipe ICD. This is not strictly necessary (I was able to observe a crash with my native NVIDIA driver), but it should let you reproduce my exact crash with debuginfo. Start a capture with the following settings:
    • Executable Path: /path-to-ruffle-clone/target/release/ruffle_desktop
    • Command-line Arguments: <path to 'fpa-world-1-remix_temp.swf'>
      1. Click 'Launch', and then click 'Ok' on the 'Ruffle - Unsupported Content' dialog box that pops up
      2. A Ruffle window will open and should progress through a couple of loading screens. Wait until you see a screen that looks like this: [screenshot: ruffle_fpa_main]
  6. Press F12 to save a capture
  7. Save the capture, as it can be used to reliably crash RenderDoc
  8. Close Ruffle, and observe the RenderDoc segfault after the capture finishes loading

Environment

Please let me know if you have any questions - I'm happy to make modifications to Ruffle as needed to help debug this.

baldurk commented 1 year ago

I'm able to reproduce with just the capture (fortunately, as from experience Rust can be a pain to set up). It looks like this is an issue with a multisampled stencil-only texture attachment, which isn't something I've encountered before, so it looks like a couple of bits of code aren't handling stencil-only formats as opposed to normal depth-stencil ones.
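To illustrate the distinction between stencil-only and combined depth-stencil formats, here is a minimal sketch of choosing image aspects from a format. This is not the actual RenderDoc fix. The `Format` and `Aspect` enums and the `aspectsForFormat` helper are local stand-ins that only mirror a few Vulkan format names, so the example compiles without the Vulkan headers.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins mirroring a handful of Vulkan format names (assumption:
// illustrative subset only, not the real VkFormat enum).
enum class Format
{
  R8G8B8A8_UNORM,
  D16_UNORM,
  D32_SFLOAT,
  S8_UINT,
  D24_UNORM_S8_UINT,
  D32_SFLOAT_S8_UINT,
};

// Stand-in aspect bits, one bit per aspect so they can be combined.
enum Aspect : uint32_t
{
  AspectColor = 1,
  AspectDepth = 2,
  AspectStencil = 4,
};

// Maps a format to the aspects it contains. Code that assumes every
// depth/stencil attachment has a depth aspect mishandles the S8_UINT case.
uint32_t aspectsForFormat(Format f)
{
  switch(f)
  {
    case Format::S8_UINT: return AspectStencil;
    case Format::D16_UNORM:
    case Format::D32_SFLOAT: return AspectDepth;
    case Format::D24_UNORM_S8_UINT:
    case Format::D32_SFLOAT_S8_UINT: return AspectDepth | AspectStencil;
    default: return AspectColor;
  }
}
```

With a helper like this, a stencil-only attachment yields only the stencil aspect, so any path that unconditionally touches the depth aspect can be guarded by testing the returned mask.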

baldurk commented 1 year ago

I believe that commit should fix it. I had to hack around a driver bug to get the capture to load, but it seemed to load and work fine afterwards. If that doesn't work for you and you can share a capture from your NVIDIA GPU, I can look into it further.

Aaron1011 commented 1 year ago

Thank you for the quick fix!