canonical / mir

The Mir compositor
GNU General Public License v2.0

regression: crash in `mir::renderer::gl::Renderer::draw` when screenshotting on hybrid with Nvidia and `--display-config=single` #3431

Open Saviq opened 1 week ago

Saviq commented 1 week ago

This regressed in:

On a hybrid Nvidia / Intel system with two displays connected (one per GPU):

$ mir_demo_server --add-wayland-extensions=all --display-config=single &
$ glmark2-es2-wayland &
$ grim
#0  mir::renderer::gl::Renderer::draw (this=0x555555629ba0, renderable=...) at /usr/src/mir-2.17.0+dev113-g601aa2ba5c-0ubuntu24.04/src/renderers/gl/renderer.cpp:413
        texture = std::shared_ptr<mir::graphics::gl::Texture> (empty) = {get() = 0x0}
        clip_area = std::optional [no contained value]
        prog = <optimized out>
        rect = <optimized out>
        centrex = <optimized out>
        centrey = <optimized out>
        transform = {value = {{{x = 3.00976562, r = 3.00976562, s = 3.00976562}, {y = 0, g = 0, t = 0}, {z = 2.2958874e-41, b = 2.2958874e-41, p = 2.2958874e-41}, {w = 0, a = 0, q = 0}}, {{x = 9.96890384e+26, r = 9.96890384e+26, s = 9.96890384e+26}, {y = 4.59163468e-41, g = 4.59163468e-41, t = 4.59163468e-41}, {z = -2.53689518e+28, b = -2.53689518e+28, p = -2.53689518e+28}, {w = 4.59163468e-41, a = 4.59163468e-41, q = 4.59163468e-41}}, {{x = 1.55663708e+13, r = 1.55663708e+13, s = 1.55663708e+13}, {y = 3.0611365e-41, g = 3.0611365e-41, t = 3.0611365e-41}, {z = -2.53694217e+28, b = -2.53694217e+28, p = -2.53694217e+28}, {w = 4.59163468e-41, a = 4.59163468e-41, q = 4.59163468e-41}}, {{x = 2.2958874e-41, r = 2.2958874e-41, s = 2.2958874e-41}, {y = 0, g = 0, t = 0}, {z = -3.47259381e+23, b = -3.47259381e+23, p = -3.47259381e+23}, {w = 4.59163468e-41, a = 4.59163468e-41, q = 4.59163468e-41}}}}
#1  0x00007ffff7dfc001 in mir::renderer::gl::Renderer::render (this=0x555555629ba0, renderables=...) at /usr/src/mir-2.17.0+dev113-g601aa2ba5c-0ubuntu24.04/src/renderers/gl/renderer.cpp:366
        r = std::shared_ptr<mir::graphics::Renderable> (use count 1, weak count 0) = {get() = 0x7fff6c000d00}
        __for_range = <optimized out>
        __for_begin = <optimized out>
        __for_end = <optimized out>
        output = std::unique_ptr<mir::graphics::Framebuffer> = {get() = 0x0}
#2  0x00007ffff7d2e78b in mir::compositor::BasicScreenShooter::Self::render (area=..., buffer=warning: RTTI symbol not found for class 'std::_Sp_counted_deleter<mir::renderer::software::RWMappableBuffer*, mir::frontend::WlrScreencopyFrameV1::prepare_target(wl_resource*)::{lambda(auto:1*)#1}, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_deleter<mir::renderer::software::RWMappableBuffer*, mir::frontend::WlrScreencopyFrameV1::prepare_target(wl_resource*)::{lambda(auto:1*)#1}, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr<mir::renderer::software::WriteMappableBuffer> (use count 3, weak count 0) = {...}, this=0x55555614ecf0) at /usr/src/mir-2.17.0+dev113-g601aa2ba5c-0ubuntu24.04/src/server/compositor/basic_screen_shooter.cpp:189
        scene_elements = std::vector of length 0, capacity 1
        renderable_list = std::vector of length 1, capacity 1 = {std::shared_ptr<mir::graphics::Renderable> (use count 1, weak count 0) = {get() = 0x7fff6c000d00}}
        lock = <optimized out>
        captured_time = <optimized out>
        renderer = @0x555555629ba0: {_vptr.Renderer = 0x7ffff7ebd9f0 <vtable for mir::renderer::gl::Renderer+16>}
        lock = <optimized out>
        scene_elements = <optimized out>
        captured_time = <optimized out>
        renderable_list = <optimized out>
        renderer = <optimized out>
        element = <optimized out>
        __for_range = <optimized out>
        __for_begin = <optimized out>
        __for_end = <optimized out>

gdb.txt

mattkae commented 1 week ago

The trace suggests that we didn't have a texture at the time of render. Perhaps the client died after establishing an initial buffer?

Weirdly, when I tested on my end, I expected glmark2-es2-wayland to allocate its buffer on my Intel graphics card, but it seems to have chosen Nvidia for whatever reason, which leads to it not rendering. It complains that:

Client requested unsupported format/modifier combination DRM_FORMAT_ARGB8888/NVIDIA:BLOCK_LINEAR_2D,HEIGHT=4,KIND=6,GEN=2,SECTO
....

Afterwards, glmark2 crashes. It would make sense if the same thing is happening here: the Surface might be in the RenderableList, but the texture for that surface might not have been populated yet.
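To illustrate the hypothesis, here is a minimal sketch of a render loop that skips any renderable whose texture was never populated instead of dereferencing it. `Texture`, `Renderable` and `render_all` are hypothetical stand-ins invented for this example, not Mir's real classes or functions, and this is not a proposed fix for `Renderer::draw`.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Mir's types.
struct Texture
{
    int id;
};

struct Renderable
{
    // Empty when no client buffer was ever successfully imported.
    std::shared_ptr<Texture> texture;
};

// Walks the list and "draws" only renderables that have a texture.
// Returns how many renderables were drawn.
int render_all(std::vector<std::shared_ptr<Renderable>> const& renderables)
{
    int drawn = 0;
    for (auto const& r : renderables)
    {
        if (!r || !r->texture)
        {
            // A surface can be in the list before (or without) a usable
            // buffer; skip it rather than dereference a null pointer.
            continue;
        }
        ++drawn;  // A real renderer would bind r->texture and draw here.
    }
    return drawn;
}
```

With a list holding one textured renderable and one without a texture, `render_all` would draw only the first. Whether skipping is the right behaviour, as opposed to never adding such a surface to the list, is a separate design question.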

@RAOF: Would you know better what that error above means? I assume it means that I allocated on my Nvidia card unexpectedly.

Saviq commented 5 days ago

I wonder if the error mode depends on which output it selects… --display-platform-libs could probably help checking that.

Saviq commented 4 days ago

> I wonder if the error mode depends on which output it selects…

As one could expect, it does: the crash happens when it's rendering on Intel but outputting on Nvidia.