baldurk / renderdoc

RenderDoc is a stand-alone graphics debugging tool.
https://renderdoc.org
MIT License

Crash after closing two captures on OpenGL #2906

Closed: w-pearson closed this issue 1 year ago

w-pearson commented 1 year ago

Description

RenderDoc crashes if you create and then close two OpenGL captures. Here's the stack trace:

Exception thrown at 0x00007FFE2DF9345F (igxelpicd64.dll) in qrenderdoc.exe: 0xC0000005: Access violation reading location 0x0000029BC809DCB0.
    igxelpicd64.dll!00007ffe2dd0635e()  Unknown
    igxelpicd64.dll!00007ffe2df9346f()  Unknown
    igxelpicd64.dll!00007ffe2df9a9f9()  Unknown
    igxelpicd64.dll!00007ffe2df9b15f()  Unknown
    igxelpicd64.dll!00007ffe2e76f31f()  Unknown
    igxelpicd64.dll!00007ffe2dd2f095()  Unknown
    igxelpicd64.dll!00007ffe2df9b52c()  Unknown
    igxelpicd64.dll!00007ffe2e02fb71()  Unknown
    igxelpicd64.dll!00007ffe2dd6baec()  Unknown
    opengl32.dll!00007ffe8c9d1984() Unknown
>   renderdoc.dll!WGLPlatform::DeleteClonedContext(GLWindowingData context) Line 145    C++
    renderdoc.dll!ContextShareGroup::~ContextShareGroup() Line 95   C++
    renderdoc.dll!ContextShareGroup::`scalar deleting destructor'(unsigned int) C++
    renderdoc.dll!WrappedOpenGL::UnregisterReplayContext(GLWindowingData windata) Line 1306 C++
    renderdoc.dll!GLReplay::CloseReplayContext() Line 3827  C++
    renderdoc.dll!GLReplay::Shutdown() Line 86  C++
    renderdoc.dll!ReplayController::Shutdown() Line 1901    C++
    qrenderdoc.exe!ReplayManager::run(int proxyRenderer, const QString & capturefile, const ReplayOptions & opts, std::function<void __cdecl(float)> progress) Line 556 C++
    qrenderdoc.exe!ReplayManager::OpenCapture::__l2::<lambda>() Line 59 C++
    qrenderdoc.exe!std::_Invoker_functor::_Call<void <lambda>(void) & __ptr64>(ReplayManager::OpenCapture::__l2::void <lambda>(void) & _Obj) Line 1377  C++
    qrenderdoc.exe!std::invoke<void <lambda>(void) & __ptr64>(ReplayManager::OpenCapture::__l2::void <lambda>(void) & _Obj) Line 1445   C++
    qrenderdoc.exe!std::_Invoke_ret<void,void <lambda>(void) & __ptr64>(std::_Forced<void,1> __formal, ReplayManager::OpenCapture::__l2::void <lambda>(void) & <_Vals_0>) Line 1462 C++
    qrenderdoc.exe!std::_Func_impl<void <lambda>(void),std::allocator<int>,void>::_Do_call() Line 214   C++
    qrenderdoc.exe!std::_Func_class<void>::operator()() Line 280    C++
    qrenderdoc.exe!LambdaThread::process() Line 511 C++
    qrenderdoc.exe!QtPrivate::FunctorCall<QtPrivate::IndexesList<>,QtPrivate::List<>,void,void (__cdecl LambdaThread::*)(void) __ptr64>::call(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 136   C++
    qrenderdoc.exe!QtPrivate::FunctionPointer<void (__cdecl LambdaThread::*)(void) __ptr64>::call<QtPrivate::List<>,void>(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 170   C++
    qrenderdoc.exe!QtPrivate::QSlotObject<void (__cdecl LambdaThread::*)(void) __ptr64,QtPrivate::List<>,void>::impl(int which, QtPrivate::QSlotObjectBase * this_, QObject * r, void * * a, bool * ret) Line 121   C++
    Qt5Core.dll!000000006d7050e1()  Unknown
    Qt5Core.dll!000000006d551f2f()  Unknown
    Qt5Core.dll!000000006d556cd1()  Unknown
    kernel32.dll!00007ffef1777614() Unknown
    ntdll.dll!00007ffef35026a1()    Unknown

The actual invalid pointer varies, but it seems that the crash is 100% consistent (at least on my machine with an Intel GPU).

Steps to reproduce

I ran into this using the RenderDoc demos. I'm not sure if it affects other applications or if something about how the demos work triggers it.

  1. Launch demos
  2. Select OGL_Simple_Triangle
  3. Capture
  4. Close demos (the capture will automatically open)
  5. Launch demos again
  6. Select no when prompted to save the capture
  7. Select OGL_Simple_Triangle
  8. Capture
  9. Close demos (the capture will automatically open)
  10. Launch demos again
  11. Select no when prompted to save the capture
  12. RenderDoc crashes

This is OGL-specific. If, at step 7, you instead launch VK_Simple_Triangle, RenderDoc will not crash; however, if you later capture OGL_Simple_Triangle again, RenderDoc will crash. Also, OGL_Simple_Triangle isn't required; OGL_Draw_Zoo also causes it.

Alternatively:

  1. Launch demos
  2. Select OGL_Simple_Triangle
  3. Capture
  4. Capture again
  5. Close demos
  6. Open first capture
  7. Open the second capture
  8. Open the first capture again
  9. RenderDoc crashes

A third method that confirms that it's the closing that does it:

  1. Launch demos
  2. Select OGL_Simple_Triangle
  3. Capture
  4. Close demos (the capture will automatically open)
  5. Select File → Close Capture
  6. Select no when prompted to save the capture
  7. Launch demos again
  8. Select OGL_Simple_Triangle
  9. Capture
  10. Close demos (the capture will automatically open)
  11. Select File → Close Capture
  12. Select no when prompted to save the capture
  13. RenderDoc crashes

Environment

baldurk commented 1 year ago

I can reproduce this as well, and all indications point to this being a driver bug. The crash is inside the driver, and everything RenderDoc is doing looks fine to me. It's crashing on a random context deletion, but only on the second iteration, which doesn't make much sense, especially as GL should not be crashing even for invalid calls.

The context management seems fine to me and it works on other implementations - one of the auto tests loads captures repeatedly to check for issues like this or memory leaks, so if it were a general bug I would expect to hit it elsewhere.
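
To illustrate the kind of check a repeated load/close test performs, here is a minimal sketch in C++. It is not RenderDoc's test code: `ContextHandle`, `ContextTracker` and `RunOpenCloseLoop` are hypothetical names, and the assumption that each iteration creates one replay context plus one cloned context is a simplification based on the stack trace above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a GL context handle; not a RenderDoc type.
using ContextHandle = std::uint64_t;

// Tracks which handles are currently alive so that leaks and double
// deletions show up as failed checks rather than as driver crashes.
class ContextTracker
{
public:
  ContextHandle Create()
  {
    ContextHandle h = m_Next++;
    m_Live.insert(h);
    return h;
  }

  // Returns false if the handle was not live (unknown or already deleted).
  bool Destroy(ContextHandle h) { return m_Live.erase(h) == 1; }

  std::size_t LiveCount() const { return m_Live.size(); }

private:
  ContextHandle m_Next = 1;
  std::set<ContextHandle> m_Live;
};

// Simulates opening and closing a capture `iterations` times. Each iteration
// creates a replay context plus a cloned context (an assumption), then
// deletes both. Returns true if every deletion matched a live handle and
// nothing was leaked.
bool RunOpenCloseLoop(ContextTracker &tracker, int iterations)
{
  for(int i = 0; i < iterations; i++)
  {
    ContextHandle replay = tracker.Create();
    ContextHandle clone = tracker.Create();

    if(!tracker.Destroy(clone) || !tracker.Destroy(replay))
      return false;
  }
  return tracker.LiveCount() == 0;
}
```

If bookkeeping like this stays balanced across iterations while the driver still faults inside the deletion call, that points away from a mismatched create/delete pair in the application.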

w-pearson commented 1 year ago

That seems reasonable to me. I did notice one oddity, though, if I set a breakpoint at DeleteClonedContext:

  1. Launch demos
  2. Select OGL_Simple_Triangle
  3. Capture
  4. Close demos (the capture will automatically open)
  5. The breakpoint will be hit (from DoVendorChecks)
  6. Select File → Close Capture
  7. Select no when prompted to save the capture
  8. The breakpoint will be hit (from ~ContextShareGroup)
  9. Launch demos again
  10. Select OGL_Simple_Triangle
  11. Capture
  12. Close demos (the capture will automatically open)
  13. This time, the breakpoint will not be hit
  14. Select File → Close Capture
  15. Select no when prompted to save the capture
  16. The breakpoint will be hit (from ~ContextShareGroup)
  17. RenderDoc crashes

I'm not sure whether it's expected that DoVendorChecks is only called once or if it should be called before each capture is loaded. But it looks like there are 3 context deletions overall.

Have you reported this to Intel already, or should I do it?

baldurk commented 1 year ago

Yes, it's deliberate: DoVendorChecks only needs to run once to detect GL driver quirks/bugs. Once that's saved, it's not needed again.
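
The breakpoint sequence reported above is consistent with that. As a rough model only (the class `ReplaySessionModel` and its members are hypothetical, and the event strings simply reuse the function names from the stack trace), the three deletions can be sketched as:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the observed behaviour; not RenderDoc's actual code.
class ReplaySessionModel
{
public:
  // Loading a capture: the vendor checks run only on the first load and,
  // as observed at the breakpoint, delete one cloned context when they run.
  void LoadCapture()
  {
    if(!m_VendorChecksDone)
    {
      m_Events.push_back("DeleteClonedContext from DoVendorChecks");
      m_VendorChecksDone = true;
    }
    m_CaptureOpen = true;
  }

  // Closing a capture destroys the share group, which deletes its
  // cloned context.
  void CloseCapture()
  {
    if(!m_CaptureOpen)
      return;
    m_Events.push_back("DeleteClonedContext from ~ContextShareGroup");
    m_CaptureOpen = false;
  }

  const std::vector<std::string> &Events() const { return m_Events; }

private:
  bool m_VendorChecksDone = false;
  bool m_CaptureOpen = false;
  std::vector<std::string> m_Events;
};
```

Calling `LoadCapture()` and `CloseCapture()` twice in a row records three events in this model: one from the vendor checks and one per capture close. The last of these corresponds to the deletion where the crash was reported.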

I don't have any contact at Intel for GL bugs so I haven't reported it.

w-pearson commented 1 year ago

I submitted a report at IGCIT/Intel-GPU-Community-Issue-Tracker-IGCIT#287.