Closed akien-mga closed 4 years ago
I confirmed that the crash is related to the use of optimizations. With target=debug CCFLAGS="-O2" (same optimization level as target=release_debug), the crash is reproducible.
Got a couple more frames with -O1, which still crashes:
(gdb) bt
#0 0x00007ffff78a6ce4 in __memmove_avx_unaligned_erms () from /lib64/libc.so.6
#1 0x00007ffff54c97ef in memcpy (__len=<optimized out>, __src=0x66f78fc, __dest=<optimized out>) at /usr/include/bits/string_fortified.h:34
#2 radv_flush_constants (cmd_buffer=cmd_buffer@entry=0x66f7110, stages=<optimized out>, stages@entry=31) at ../src/amd/vulkan/radv_cmd_buffer.c:2410
#3 0x00007ffff54ca615 in radv_upload_graphics_shader_descriptors (cmd_buffer=cmd_buffer@entry=0x66f7110, pipeline_is_dirty=pipeline_is_dirty@entry=true) at ../src/amd/vulkan/radv_cmd_buffer.c:2651
#4 0x00007ffff54cd6e0 in radv_draw (cmd_buffer=0x66f7110, info=info@entry=0x7fffffffc0d0) at ../src/amd/vulkan/radv_cmd_buffer.c:4837
#5 0x00007ffff54cf69e in radv_CmdDrawIndexed (commandBuffer=<optimized out>, indexCount=<optimized out>, instanceCount=<optimized out>, firstIndex=<optimized out>, vertexOffset=<optimized out>,
firstInstance=<optimized out>) at ../src/amd/vulkan/radv_cmd_buffer.c:4901
#6 0x0000000001a69bf1 in vkCmdDrawIndexed (commandBuffer=<optimized out>, indexCount=indexCount@entry=6, instanceCount=instanceCount@entry=1, firstIndex=<optimized out>, vertexOffset=vertexOffset@entry=0,
firstInstance=firstInstance@entry=0) at thirdparty/vulkan/loader/trampoline.c:1698
#7 0x0000000001a1beb7 in RenderingDeviceVulkan::draw_list_draw (this=0x62ac330, p_list=<optimized out>, p_use_indices=<optimized out>, p_instances=1, p_procedural_vertices=0)
at drivers/vulkan/rendering_device_vulkan.cpp:6228
#8 0x0000000003362a46 in RasterizerCanvasRD::_render_item (this=this@entry=0x7fffe8427020, p_draw_list=p_draw_list@entry=2, p_item=p_item@entry=0x8d31ca0, p_framebuffer_format=p_framebuffer_format@entry=2,
p_canvas_transform_inverse=..., current_clip=@0x7fffffffc758: 0x0, p_lights=<optimized out>, p_pipeline_variants=<optimized out>) at servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:788
#9 0x00000000033648c9 in RasterizerCanvasRD::_render_items (this=this@entry=0x7fffe8427020, p_to_render_target=..., p_to_render_target@entry=..., p_item_count=<optimized out>, p_canvas_transform_inverse=...,
p_lights=p_lights@entry=0x0, p_screen_uniform_set=..., p_screen_uniform_set@entry=...) at servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:1362
#10 0x000000000336552f in RasterizerCanvasRD::canvas_render_items (this=0x7fffe8427020, p_to_render_target=..., p_item_list=<optimized out>, p_modulate=..., p_light_list=0x0, p_canvas_transform=...)
at servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:1524
#11 0x0000000003354900 in RenderingServerCanvas::_render_canvas_item_tree (this=this@entry=0x6654f90, p_to_render_target=..., p_to_render_target@entry=..., p_child_items=<optimized out>,
p_child_item_count=p_child_item_count@entry=1, p_canvas_item=p_canvas_item@entry=0x0, p_transform=..., p_clip_rect=..., p_modulate=..., p_lights=0x0) at servers/rendering/rendering_server_canvas.cpp:71
#12 0x0000000003354a63 in RenderingServerCanvas::render_canvas (this=0x6654f90, p_render_target=..., p_canvas=p_canvas@entry=0x7735320, p_transform=..., p_lights=0x0, p_masked_lights=p_masked_lights@entry=0x0,
p_clip_rect=...) at servers/rendering/rendering_server_canvas.cpp:265
#13 0x00000000031bd038 in RenderingServerViewport::_draw_viewport (this=this@entry=0x677ad90, p_viewport=p_viewport@entry=0x77342d0, p_eye=p_eye@entry=XRInterface::EYE_MONO)
at servers/rendering/rendering_server_viewport.cpp:257
#14 0x00000000031be22a in RenderingServerViewport::draw_viewports (this=0x677ad90) at servers/rendering/rendering_server_viewport.cpp:413
#15 0x000000000319aefe in RenderingServerRaster::draw (this=0x6076660, p_swap_buffers=<optimized out>, frame_step=0.36177700757980347) at servers/rendering/rendering_server_raster.cpp:112
#16 0x00000000031c3680 in RenderingServerWrapMT::draw (this=0x6caea70, p_swap_buffers=<optimized out>, frame_step=0.36177700757980347) at servers/rendering/rendering_server_wrap_mt.cpp:91
#17 0x0000000000e3dc06 in Main::iteration () at main/main.cpp:2204
--Type <RET> for more, q to quit, c to continue without paging--
#18 0x0000000000e1e256 in OS_LinuxBSD::run (this=this@entry=0x7fffffffd1c0) at platform/linuxbsd/os_linuxbsd.cpp:238
#19 0x0000000000e1d1a7 in main (argc=1, argv=0x7fffffffd6b8) at platform/linuxbsd/godot_linuxbsd.cpp:55
Tested with Clang 10.0.0 with target=debug CCFLAGS="-O2", and it's not crashing. So it might be a GCC optimization bug more than a Mesa driver bug.
Mesa bug report: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3309
As pointed out on that report, it's of course Godot's own optimization level that triggers the issue, and not Mesa's (system package built with -O2), so it's in Godot's Vulkan backend that GCC seems to optimize something wrongly, leading to a radv crash.
@marxin Do you have any suggestion on how to debug this further to make a useful bug report for GCC (if it's confirmed as a compiler optimization bug)?
Hey. The bug reminds me of something I saw a month ago: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2977
Can you please check that the Mesa driver built as part of the Godot build uses -flifetime-dse=1?
I'm using the Mesa version packaged by Mageia (currently 20.1.1; 20.1.4 is checked in to the testing repos, but it's in the middle of switching to libglvnd so I haven't tried it yet).
I just rebuilt 20.1.1-11.mga8 locally to confirm, and it does seem to use -flifetime-dse=1:
Compiler for C++ supports arguments -flifetime-dse=1: YES
...
[91/2576] c++ -Isrc/compiler/libcompiler.a.p -Isrc/compiler -I../src/compiler -Isrc/mapi -I../src/mapi -Isrc/mesa -I../src/mesa -Iinclude -I../include -Isrc -I../src -I../src/gallium/include -Isrc/gallium/auxiliary -I../src/gallium/auxiliary -I/usr/include/valgrind -fdiagnostics-color=always -DNDEBUG -pipe -D_FILE_OFFSET_BITS=64 -std=gnu++14 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS '-DPACKAGE_VERSION="20.1.1"' '-DPACKAGE_BUGREPORT="https://gitlab.freedesktop.org/mesa/mesa/-/issues"' -DUSE_ELF_TLS -DHAVE_ST_VDPAU -DENABLE_ST_OMX_BELLAGIO=1 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DHAVE_SURFACELESS_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -DHAVE_UINT128 -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_DLFCN_H -DHAVE_EXECINFO_H -DHAVE_SYS_SHM_H -DHAVE_CET_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_RANDOM_R -DHAVE_FLOCK -DHAVE_STRTOK_R -DHAVE_PROGRAM_INVOCATION_NAME -DHAVE_POSIX_MEMALIGN -DHAVE_DIRENT_D_TYPE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_ZSTD -DHAVE_PTHREAD -DHAVE_PTHREAD_SETAFFINITY -DHAVE_LIBDRM -DLLVM_AVAILABLE '-DMESA_LLVM_VERSION_STRING="10.0.0"' 
-DHAVE_VALGRIND -DHAVE_LIBUNWIND -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -DHAVE_GALLIUM_EXTRA_HUD=1 -DHAVE_LIBSENSORS=1 -Werror=return-type -Werror=empty-body -Wno-non-virtual-dtor -Wno-missing-field-initializers -Wno-format-truncation -fno-math-errno -fno-trapping-math -flifetime-dse=1 -Werror=format -Wformat-security -O2 -g -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector --param=ssp-buffer-size=4 -fasynchronous-unwind-tables -fPIC -fvisibility=hidden -Werror=pointer-arith -Werror=vla -MD -MQ src/compiler/libcompiler.a.p/nir_types.cpp.o -MF src/compiler/libcompiler.a.p/nir_types.cpp.o.d -o src/compiler/libcompiler.a.p/nir_types.cpp.o -c ../src/compiler/nir_types.cpp
Ok then.
Can you please experiment with ASAN/UBSAN for a locally built Mesa driver? Or you can also try valgrind, it may tell something. Am I right that any debugging is problematic as one needs an AMD graphics that uses the problematic driver?
I'll give it a try. Do you mean that I need to rebuild Mesa with ASAN/UBSAN locally (and then Godot with ASAN/UBSAN too), or do I only need to build Godot with those?
Am I right that any debugging is problematic as one needs an AMD graphics that uses the problematic driver?
I have yet to get confirmations from other users, but in theory if you have an AMD GPU that uses Mesa/radv, you might be able to trigger the crash. It's unknown yet whether all AMD GPUs and/or all recent Mesa versions are affected, or if it's specific to a given series of GPUs or distro packaging quirks.
I would first start with Godot and then you can repeat the same step for Mesa. Anyway, you may be lucky with just valgrind.
To be honest, I would be quite surprised if a GCC miscompilation happened at the -O1 optimization level.
valgrind does seem to find some invalid reads:
$ valgrind ./bin/godot.linuxbsd.tools.64
==441176== Memcheck, a memory error detector
==441176== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==441176== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==441176== Command: ./bin/godot.linuxbsd.tools.64
==441176==
Godot Engine v4.0.dev.custom_build.5700429e4 - https://godotengine.org
DisplayServer::_create_window 0 want rect: 448, 240, 1024, 600 got rect 0, 0, 1024, 600
DisplayServer::_window_changed: 0 rect: 896, 29, 1024, 600
==441176== Conditional jump or move depends on uninitialised value(s)
==441176== at 0xE33B68: DisplayServerX11::process_events() (display_server_x11.cpp:2490)
==441176== by 0xE240F6: OS_LinuxBSD::run() (os_linuxbsd.cpp:234)
==441176== by 0xE23064: main (godot_linuxbsd.cpp:58)
==441176==
==441176== Invalid read of size 2
==441176== at 0x65FA2EF: memmove (vg_replace_strmem.c:1270)
==441176== by 0x9ED48FE: UnknownInlinedFun (string_fortified.h:34)
==441176== by 0x9ED48FE: radv_flush_constants (radv_cmd_buffer.c:2433)
==441176== by 0x9ED5724: radv_upload_graphics_shader_descriptors (radv_cmd_buffer.c:2674)
==441176== by 0x9ED88AF: radv_draw (radv_cmd_buffer.c:4860)
==441176== by 0x9EDA84D: radv_CmdDrawIndexed (radv_cmd_buffer.c:4924)
==441176== by 0x1A496A6: vkCmdDrawIndexed (trampoline.c:1698)
==441176== by 0x19FB870: RenderingDeviceVulkan::draw_list_draw(long, bool, unsigned int, unsigned int) (rendering_device_vulkan.cpp:6228)
==441176== by 0x334FE9F: RasterizerCanvasRD::_render_item(long, RasterizerCanvas::Item const*, long, Transform2D const&, RasterizerCanvas::Item*&, RasterizerCanvas::Light*, RasterizerCanvasRD::PipelineVariants*) (rasterizer_canvas_rd.cpp:788)
==441176== by 0x3351D22: RasterizerCanvasRD::_render_items(RID, int, Transform2D const&, RasterizerCanvas::Light*, RID) (rasterizer_canvas_rd.cpp:1362)
==441176== by 0x3352988: RasterizerCanvasRD::canvas_render_items(RID, RasterizerCanvas::Item*, Color const&, RasterizerCanvas::Light*, Transform2D const&) (rasterizer_canvas_rd.cpp:1524)
==441176== by 0x3341D59: RenderingServerCanvas::_render_canvas_item_tree(RID, RenderingServerCanvas::Canvas::ChildItem*, int, RenderingServerCanvas::Item*, Transform2D const&, Rect2 const&, Color const&, RasterizerCanvas::Light*) (rendering_server_canvas.cpp:71)
==441176== by 0x3341EBC: RenderingServerCanvas::render_canvas(RID, RenderingServerCanvas::Canvas*, Transform2D const&, RasterizerCanvas::Light*, RasterizerCanvas::Light*, Rect2 const&) (rendering_server_canvas.cpp:265)
==441176== Address 0x92a3cc0 is 0 bytes after a block of size 3,744 alloc'd
==441176== at 0x65F3751: malloc (vg_replace_malloc.c:307)
==441176== by 0x9ED8E09: UnknownInlinedFun (vk_alloc.h:36)
==441176== by 0x9ED8E09: UnknownInlinedFun (vk_alloc.h:44)
==441176== by 0x9ED8E09: radv_create_cmd_buffer (radv_cmd_buffer.c:275)
==441176== by 0x9ED8E09: radv_AllocateCommandBuffers (radv_cmd_buffer.c:3349)
==441176== by 0x1A49517: vkAllocateCommandBuffers (trampoline.c:1516)
==441176== by 0x1A0261C: RenderingDeviceVulkan::initialize(VulkanContext*, bool) (rendering_device_vulkan.cpp:7186)
==441176== by 0xE38F40: DisplayServerX11::DisplayServerX11(String const&, DisplayServer::WindowMode, unsigned int, Vector2i const&, Error&) (display_server_x11.cpp:3662)
==441176== by 0xE39441: DisplayServerX11::create_func(String const&, DisplayServer::WindowMode, unsigned int, Vector2i const&, Error&) (display_server_x11.cpp:3188)
==441176== by 0x2F930C8: DisplayServer::create(int, String const&, DisplayServer::WindowMode, unsigned int, Vector2i const&, Error&) (display_server.cpp:599)
==441176== by 0xE44E20: Main::setup2(unsigned long) (main.cpp:1409)
==441176== by 0xE4F349: Main::setup(char const*, int, char**, bool) (main.cpp:1331)
==441176== by 0xE2300D: main (godot_linuxbsd.cpp:51)
==441176==
==441176== Invalid read of size 2
==441176== at 0x65FA2E0: memmove (vg_replace_strmem.c:1270)
==441176== by 0x9ED48FE: UnknownInlinedFun (string_fortified.h:34)
==441176== by 0x9ED48FE: radv_flush_constants (radv_cmd_buffer.c:2433)
==441176== by 0x9ED5724: radv_upload_graphics_shader_descriptors (radv_cmd_buffer.c:2674)
==441176== by 0x9ED88AF: radv_draw (radv_cmd_buffer.c:4860)
==441176== by 0x9EDA84D: radv_CmdDrawIndexed (radv_cmd_buffer.c:4924)
==441176== by 0x1A496A6: vkCmdDrawIndexed (trampoline.c:1698)
==441176== by 0x19FB870: RenderingDeviceVulkan::draw_list_draw(long, bool, unsigned int, unsigned int) (rendering_device_vulkan.cpp:6228)
==441176== by 0x334FE9F: RasterizerCanvasRD::_render_item(long, RasterizerCanvas::Item const*, long, Transform2D const&, RasterizerCanvas::Item*&, RasterizerCanvas::Light*, RasterizerCanvasRD::PipelineVariants*) (rasterizer_canvas_rd.cpp:788)
==441176== by 0x3351D22: RasterizerCanvasRD::_render_items(RID, int, Transform2D const&, RasterizerCanvas::Light*, RID) (rasterizer_canvas_rd.cpp:1362)
==441176== by 0x3352988: RasterizerCanvasRD::canvas_render_items(RID, RasterizerCanvas::Item*, Color const&, RasterizerCanvas::Light*, Transform2D const&) (rasterizer_canvas_rd.cpp:1524)
==441176== by 0x3341D59: RenderingServerCanvas::_render_canvas_item_tree(RID, RenderingServerCanvas::Canvas::ChildItem*, int, RenderingServerCanvas::Item*, Transform2D const&, Rect2 const&, Color const&, RasterizerCanvas::Light*) (rendering_server_canvas.cpp:71)
==441176== by 0x3341EBC: RenderingServerCanvas::render_canvas(RID, RenderingServerCanvas::Canvas*, Transform2D const&, RasterizerCanvas::Light*, RasterizerCanvas::Light*, Rect2 const&) (rendering_server_canvas.cpp:265)
==441176== Address 0x92a3cc6 is 6 bytes after a block of size 3,744 alloc'd
==441176== at 0x65F3751: malloc (vg_replace_malloc.c:307)
==441176== by 0x9ED8E09: UnknownInlinedFun (vk_alloc.h:36)
==441176== by 0x9ED8E09: UnknownInlinedFun (vk_alloc.h:44)
==441176== by 0x9ED8E09: radv_create_cmd_buffer (radv_cmd_buffer.c:275)
==441176== by 0x9ED8E09: radv_AllocateCommandBuffers (radv_cmd_buffer.c:3349)
==441176== by 0x1A49517: vkAllocateCommandBuffers (trampoline.c:1516)
==441176== by 0x1A0261C: RenderingDeviceVulkan::initialize(VulkanContext*, bool) (rendering_device_vulkan.cpp:7186)
==441176== by 0xE38F40: DisplayServerX11::DisplayServerX11(String const&, DisplayServer::WindowMode, unsigned int, Vector2i const&, Error&) (display_server_x11.cpp:3662)
==441176== by 0xE39441: DisplayServerX11::create_func(String const&, DisplayServer::WindowMode, unsigned int, Vector2i const&, Error&) (display_server_x11.cpp:3188)
==441176== by 0x2F930C8: DisplayServer::create(int, String const&, DisplayServer::WindowMode, unsigned int, Vector2i const&, Error&) (display_server.cpp:599)
==441176== by 0xE44E20: Main::setup2(unsigned long) (main.cpp:1409)
==441176== by 0xE4F349: Main::setup(char const*, int, char**, bool) (main.cpp:1331)
==441176== by 0xE2300D: main (godot_linuxbsd.cpp:51)
==441176==
handle_crash: Program crashed with signal 11
Dumping the backtrace. Please include this when reporting the bug on https://github.com/godotengine/godot/issues
[1] /lib64/libc.so.6(+0x3a3f0) [0x6d163f0] (??:0)
[2] /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so(_vgr20181ZZ_libcZdsoZa_memmove+0x15f) [0x65fa2ef] (??:0)
[3] /usr/lib64/libvulkan_radeon.so(+0xba8ff) [0x9ed48ff] (??:0)
[4] /usr/lib64/libvulkan_radeon.so(+0xbb725) [0x9ed5725] (??:0)
[5] /usr/lib64/libvulkan_radeon.so(+0xbe8b0) [0x9ed88b0] (??:0)
[6] /usr/lib64/libvulkan_radeon.so(+0xc084e) [0x9eda84e] (??:0)
[7] ./bin/godot.linuxbsd.tools.64(vkCmdDrawIndexed+0xd) [0x1a496a7] (/home/akien/Projects/godot/godot.git/thirdparty/vulkan/loader/trampoline.c:1699)
[8] RenderingDeviceVulkan::draw_list_draw(long, bool, unsigned int, unsigned int) (/home/akien/Projects/godot/godot.git/drivers/vulkan/rendering_device_vulkan.cpp:6228 (discriminator 2))
[9] RasterizerCanvasRD::_render_item(long, RasterizerCanvas::Item const*, long, Transform2D const&, RasterizerCanvas::Item*&, RasterizerCanvas::Light*, RasterizerCanvasRD::PipelineVariants*) (/home/akien/Projects/godot/godot.git/servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:1289)
[10] RasterizerCanvasRD::_render_items(RID, int, Transform2D const&, RasterizerCanvas::Light*, RID) (/home/akien/Projects/godot/godot.git/servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:1362)
[11] RasterizerCanvasRD::canvas_render_items(RID, RasterizerCanvas::Item*, Color const&, RasterizerCanvas::Light*, Transform2D const&) (/home/akien/Projects/godot/godot.git/servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:1529)
[12] RenderingServerCanvas::_render_canvas_item_tree(RID, RenderingServerCanvas::Canvas::ChildItem*, int, RenderingServerCanvas::Item*, Transform2D const&, Rect2 const&, Color const&, RasterizerCanvas::Light*) (/home/akien/Projects/godot/godot.git/servers/rendering/rendering_server_canvas.cpp:72 (discriminator 5))
[13] RenderingServerCanvas::render_canvas(RID, RenderingServerCanvas::Canvas*, Transform2D const&, RasterizerCanvas::Light*, RasterizerCanvas::Light*, Rect2 const&) (/home/akien/Projects/godot/godot.git/servers/rendering/rendering_server_canvas.cpp:265)
[14] RenderingServerViewport::_draw_viewport(RenderingServerViewport::Viewport*, XRInterface::Eyes) (/home/akien/Projects/godot/godot.git/servers/rendering/rendering_server_viewport.cpp:257)
[15] RenderingServerViewport::draw_viewports() (/home/akien/Projects/godot/godot.git/servers/rendering/rendering_server_viewport.cpp:415)
[16] RenderingServerRaster::draw(bool, double) (/home/akien/Projects/godot/godot.git/servers/rendering/rendering_server_raster.cpp:113)
[17] RenderingServerWrapMT::draw(bool, double) (/home/akien/Projects/godot/godot.git/servers/rendering/rendering_server_wrap_mt.cpp:93)
[18] Main::iteration() (/home/akien/Projects/godot/godot.git/main/main.cpp:2377)
[19] OS_LinuxBSD::run() (/home/akien/Projects/godot/godot.git/platform/linuxbsd/os_linuxbsd.cpp:238)
[20] ./bin/godot.linuxbsd.tools.64(main+0xf3) [0xe23065] (/home/akien/Projects/godot/godot.git/platform/linuxbsd/godot_linuxbsd.cpp:60)
[21] /lib64/libc.so.6(__libc_start_main+0xea) [0x6d02cda] (??:0)
[22] ./bin/godot.linuxbsd.tools.64(_start+0x2a) [0xe22eca] (??:?)
-- END OF BACKTRACE --
==441176==
==441176== Process terminating with default action of signal 6 (SIGABRT): dumping core
==441176== at 0x6D16380: raise (in /usr/lib64/libc-2.31.so)
==441176== by 0x6D01526: abort (in /usr/lib64/libc-2.31.so)
==441176== by 0xE23E34: handle_crash(int) (crash_handler_linuxbsd.cpp:120)
==441176== by 0x6D163EF: ??? (in /usr/lib64/libc-2.31.so)
==441176== by 0x65FA2EE: memmove (vg_replace_strmem.c:1270)
==441176== by 0x9ED48FE: UnknownInlinedFun (string_fortified.h:34)
==441176== by 0x9ED48FE: radv_flush_constants (radv_cmd_buffer.c:2433)
==441176== by 0x9ED5724: radv_upload_graphics_shader_descriptors (radv_cmd_buffer.c:2674)
==441176== by 0x9ED88AF: radv_draw (radv_cmd_buffer.c:4860)
==441176== by 0x9EDA84D: radv_CmdDrawIndexed (radv_cmd_buffer.c:4924)
==441176== by 0x1A496A6: vkCmdDrawIndexed (trampoline.c:1698)
==441176== by 0x19FB870: RenderingDeviceVulkan::draw_list_draw(long, bool, unsigned int, unsigned int) (rendering_device_vulkan.cpp:6228)
==441176== by 0x334FE9F: RasterizerCanvasRD::_render_item(long, RasterizerCanvas::Item const*, long, Transform2D const&, RasterizerCanvas::Item*&, RasterizerCanvas::Light*, RasterizerCanvasRD::PipelineVariants*) (rasterizer_canvas_rd.cpp:788)
==441176==
==441176== HEAP SUMMARY:
==441176== in use at exit: 103,100,524 bytes in 216,800 blocks
==441176== total heap usage: 4,079,899 allocs, 3,863,099 frees, 2,057,016,341 bytes allocated
==441176==
==441176== LEAK SUMMARY:
==441176== definitely lost: 7,752 bytes in 63 blocks
==441176== indirectly lost: 10,392 bytes in 63 blocks
==441176== possibly lost: 78,787,894 bytes in 200,823 blocks
==441176== still reachable: 24,294,438 bytes in 15,849 blocks
==441176== of which reachable via heuristic:
==441176== newarray : 262,808 bytes in 20 blocks
==441176== multipleinheritance: 3,744 bytes in 6 blocks
==441176== suppressed: 48 bytes in 2 blocks
==441176== Rerun with --leak-check=full to see details of leaked memory
==441176==
==441176== Use --track-origins=yes to see where uninitialised values come from
==441176== For lists of detected and suppressed errors, rerun with: -s
==441176== ERROR SUMMARY: 318642 errors from 3 contexts (suppressed: 0 from 0)
Aborted (core dumped)
Godot 9856c8fda, release-debug tools, built with GCC 10.0.1 (10-20200411-0ubuntu1), running on Mac Pro 2013.
System: Host: vmatest-MacPro Kernel: 5.4.0-42-generic x86_64 bits: 64 compiler: gcc v: 9.3.0 Desktop: Gnome 3.36.3
wm: gnome-shell dm: GDM3 Distro: Ubuntu 20.04.1 LTS (Focal Fossa)
CPU: Topology: Quad Core model: Intel Xeon E5-1620 v2 bits: 64 type: MT MCP arch: Ivy Bridge rev: 4 L2 cache: 10.0 MiB
flags: avx lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 59201
Speed: 1201 MHz min/max: 1200/3900 MHz Core speeds (MHz): 1: 1201 2: 1200 3: 1200 4: 1202 5: 1200 6: 1199 7: 1200
8: 1202
Graphics: Device-1: AMD Curacao XT / Trinidad XT [Radeon R7 370 / R9 270X/370X] vendor: Apple FirePro D300 driver: amdgpu
v: kernel bus ID: 02:00.0 chip ID: 1002:6810
Device-2: AMD Curacao XT / Trinidad XT [Radeon R7 370 / R9 270X/370X] vendor: Apple FirePro D300 driver: amdgpu
v: kernel bus ID: 06:00.0 chip ID: 1002:6810
Display: x11 server: X.Org 1.20.8 driver: amdgpu compositor: gnome-shell resolution: 3840x2160~30Hz, 2560x1440~60Hz
OpenGL: renderer: AMD Radeon HD 8800 Series (PITCAIRN DRM 3.35.0 5.4.0-42-generic LLVM 10.0.1)
v: 4.6 Mesa 20.2.0-devel (git-e2e89fb 2020-07-25 focal-oibaf-ppa) direct render: Yes
Godot is not crashing and not printing any errors, and the window is responding to events, but nothing except the editor UI background color is displayed (after resizing, the window becomes black).
Given the reported valgrind errors, have you identified a root cause of the crash?
I'm not very literate in valgrind, but to me it seems to point to an invalid read in Mesa, which is about the same as what we already know from the segfault.
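For readers unfamiliar with the wording in the valgrind output above: "Address ... is 0 bytes after a block of size 3,744 alloc'd" means the read started exactly at the end of a heap allocation, and "6 bytes after" means it was still running past it. A minimal sketch of that arithmetic follows. The helper names `overrun_bytes` and `checked_read_guard` are hypothetical and exist only for illustration; they are not Mesa or Godot functions, and this is not a claim about where the bad length comes from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: number of bytes by which a
 * read of `len` bytes starting at `offset` would run past the end of a
 * block of `block_size` bytes. Returns 0 when the read stays in bounds. */
size_t overrun_bytes(size_t block_size, size_t offset, size_t len)
{
    if (offset >= block_size)
        return len;                 /* the read starts at or after the end */
    size_t available = block_size - offset;
    return len > available ? len - available : 0;
}

/* Hypothetical debug-build guard: abort before an out-of-bounds read
 * instead of letting memcpy walk off the end of the block. */
void checked_read_guard(size_t block_size, size_t offset, size_t len)
{
    assert(overrun_bytes(block_size, offset, len) == 0);
}
```

With the block size from the report, `overrun_bytes(3744, 3744, 2)` is 2: a 2-byte read that begins at the very end of the block is entirely out of bounds, which is the situation valgrind describes as "0 bytes after".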
Here you create a command buffer:
maybe the buffer is in a bad state and Mesa then reads more than it ought to? I would dump its content at the place it crashes.
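One way to do such a dump without adding another out-of-bounds read is a bounded hex formatter. The sketch below assumes a hypothetical helper `hex_dump` (not part of Mesa or Godot) that is handed the size the caller believes the data has; which buffer and which size to pass at the crash site is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debugging helper: format `len` bytes from `data` as
 * space-separated hex into `out`. It never reads past `len` and never
 * writes past `out_size`. Returns the number of bytes formatted. */
size_t hex_dump(char *out, size_t out_size, const void *data, size_t len)
{
    const unsigned char *bytes = data;
    size_t used = 0, i;

    assert(out != NULL);
    if (out_size == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        /* Two hex digits per byte, plus a separator after the first. */
        size_t need = i ? 3 : 2;
        if (used + need + 1 > out_size)
            break;                  /* no room left for this byte + NUL */
        used += (size_t)snprintf(out + used, out_size - used,
                                 i ? " %02x" : "%02x", bytes[i]);
    }
    return i;
}
```

The return value tells the caller whether the output was truncated, so a short dump is not mistaken for a short buffer.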
Thanks, I'll try that.
BTW, I just tried on a different computer with a Radeon RX Vega 56 (VEGA10) and it crashes in the same place. It has the exact same distro installed though so it doesn't exclude a distro-specific issue, though I think @RevoluPowered yesterday was experiencing the same crash with AMD on Linux with a different distro.
Yeah I received the same crash, definitely
@akien-mga I wonder if we can write some test for this logic?
I'm using a Radeon RX 5700 on Arch Linux (Godot version via AUR: 4.0.dev.custom_build.de465c41d, GCC version: 10.2.0 and Mesa version: 20.1.7). I get these black windows too, and if I click on them, the system reacts like the GPU hangs I used to get more often when the Mesa drivers weren't stable for Navi. Everything freezes until the X server restarts, and after that the whole system doesn't respond, except that switching to another TTY to restart still works. If I wait even longer, switching TTYs stops working too.
Despite the reaction from trying to interact with the misrendered windows, I don't get any error output from Godot itself (even with verbose output).
@TheJackiMonster Try starting Godot with the --single-window command line argument. Make sure to open the editor directly when using it, as the project manager won't pass it on to the editor when opening a project. You can do this using godot --single-window /path/to/project/project.godot (assuming godot is in your PATH).
This workaround is normally used for an NVIDIA driver regression, but we never know…
@Calinou Using the parameter gives me very similar results. It opens a project but the whole editor is black as well. The terminal which I used to run godot shows me a lot of error output (mostly: ERROR: Condition "O == nullptr" is true. Continuing. at: build (core/math/quick_hull.cpp:382)).
Interacting with the window results in the GPU reset like before. I actually looked up the output from dmesg and found this:
[ 222.800406] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 222.800513] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 227.931677] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=3773, emitted seq=3776
[ 227.931773] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process godot pid 2267 thread godot pid 2267
[ 227.931779] amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
[ 231.932675] amdgpu 0000:0c:00.0: amdgpu: failed to suspend display audio
[ 231.932949] ------------[ cut here ]------------
[ 231.933010] WARNING: CPU: 4 PID: 271 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:3194 dcn20_validate_bandwidth_fp+0x7a/0xb0 [amdgpu]
[ 231.933010] Modules linked in: fuse rfcomm hwmon_vid cmac algif_hash algif_skcipher af_alg bnep btusb uvcvideo btrtl btbcm btintel videobuf2_vmalloc videobuf2_memops bluetooth videobuf2_v4l2 videobuf2_common ecdh_generic videodev ecc input_leds nls_iso8859_1 nls_cp437 vfat fat eeepc_wmi asus_wmi battery sparse_keymap wmi_bmof mxm_wmi hid_steam snd_hda_codec_realtek rtw88_8822be rtw88_8822b snd_hda_codec_generic rtw88_pci ledtrig_audio rtw88_core snd_hda_codec_hdmi edac_mce_amd kvm_amd snd_hda_intel mousedev snd_intel_dspcfg mac80211 kvm snd_hda_codec snd_usb_audio snd_hda_core snd_usbmidi_lib snd_rawmidi snd_seq_device mc irqbypass snd_hwdep crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cfg80211 hid_uclogic joydev snd_pcm aesni_intel blackmagic_io(POE) igb crypto_simd snd_timer cryptd ccp snd glue_helper rfkill sp5100_tco rapl pcspkr k10temp rng_core i2c_piix4 soundcore libarc4 dca evdev wmi mac_hid pinctrl_amd gpio_amdpt acpi_cpufreq vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE)
[ 231.933027] crypto_user ip_tables x_tables radeon hid_generic usbhid hid uas usb_storage ext4 crc32c_generic crc16 mbcache jbd2 crc32c_intel xhci_pci xhci_pci_renesas xhci_hcd amdgpu gpu_sched i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core ttm drm agpgart
[ 231.933034] CPU: 4 PID: 271 Comm: kworker/4:2 Tainted: P OE 5.8.8-arch1-1 #1
[ 231.933035] Hardware name: System manufacturer System Product Name/ROG CROSSHAIR VI EXTREME, BIOS 7501 09/23/2019
[ 231.933038] Workqueue: events drm_sched_job_timedout [gpu_sched]
[ 231.933093] RIP: 0010:dcn20_validate_bandwidth_fp+0x7a/0xb0 [amdgpu]
[ 231.933094] Code: 00 7b 35 22 85 a8 1e 00 00 75 2f 31 d2 f2 0f 11 85 d8 25 00 00 48 89 ee 4c 89 e7 e8 50 f6 ff ff 89 c2 22 95 a8 1e 00 00 75 2a <0f> 0b 48 89 9d d8 25 00 00 5b 5d 41 5c c3 75 c9 48 89 9d d8 25 00
[ 231.933095] RSP: 0018:ffffbbc000acbc20 EFLAGS: 00010246
[ 231.933096] RAX: 0000000000000001 RBX: 4079400000000000 RCX: 000000000061c404
[ 231.933096] RDX: 0000000000000000 RSI: 6dea001126d93bca RDI: 00000000000311a0
[ 231.933096] RBP: ffffa30080820000 R08: ffffa300d6fd6000 R09: ffffa300ec150000
[ 231.933097] R10: ffffa300d6fd6000 R11: 0000000100000001 R12: ffffa300ec150000
[ 231.933097] R13: ffffbbc000acbce0 R14: ffffa300efd98400 R15: ffffa30080820000
[ 231.933098] FS: 0000000000000000(0000) GS:ffffa300feb00000(0000) knlGS:0000000000000000
[ 231.933099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 231.933099] CR2: 000055d981e71088 CR3: 00000007a4e3a000 CR4: 00000000003406e0
[ 231.933099] Call Trace:
[ 231.933157] dcn20_validate_bandwidth+0x24/0x40 [amdgpu]
[ 231.933208] dc_validate_global_state+0x2f2/0x390 [amdgpu]
[ 231.933259] ? dc_rem_all_planes_for_stream+0xcb/0x110 [amdgpu]
[ 231.933314] amdgpu_dm_commit_zero_streams+0xfe/0x140 [amdgpu]
[ 231.933369] dm_suspend+0x9a/0xb0 [amdgpu]
[ 231.933407] amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
[ 231.933446] ? amdgpu_fence_process+0x4d/0x140 [amdgpu]
[ 231.933483] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
[ 231.933535] amdgpu_device_gpu_recover.cold+0x653/0xfd4 [amdgpu]
[ 231.933584] amdgpu_job_timedout+0x121/0x140 [amdgpu]
[ 231.933586] drm_sched_job_timedout+0x64/0xe0 [gpu_sched]
[ 231.933589] process_one_work+0x1da/0x3d0
[ 231.933590] worker_thread+0x4d/0x3d0
[ 231.933591] ? rescuer_thread+0x410/0x410
[ 231.933592] kthread+0x142/0x160
[ 231.933593] ? __kthread_bind_mask+0x60/0x60
[ 231.933595] ret_from_fork+0x22/0x30
[ 231.933597] ---[ end trace 7c3dc8ab84e57f3c ]---
So maybe this helps somehow to fix it. ^^'
Because someone wrote that it's a compiler-specific problem, I tried compiling everything with LLVM instead of GCC. Now I get a crash on startup:
[New Thread 0x7fffce0fc640 (LWP 6957)]
DisplayServer::_create_window 1 want rect: 709, 491, 501, 97 got rect 709, 491, 501, 97
[Thread 0x7fffe8be0640 (LWP 6938) exited]
[New Thread 0x7fffe8be0640 (LWP 6958)]
DisplayServer::_window_changed: 0 rect: 768, 457, 1024, 600
Thread 1 "godot.linuxbsd." received signal SIGSEGV, Segmentation fault.
0x00007ffff786dca8 in __memmove_avx_unaligned_erms () from /usr/lib/libc.so.6
(gdb) backtrace
#0 0x00007ffff786dca8 in __memmove_avx_unaligned_erms () from /usr/lib/libc.so.6
#1 0x00007ffff658f970 in ?? () from /usr/lib/libvulkan_radeon.so
#2 0x00007ffff658fad3 in ?? () from /usr/lib/libvulkan_radeon.so
#3 0x00007ffff659749c in ?? () from /usr/lib/libvulkan_radeon.so
#4 0x00007ffff659765e in ?? () from /usr/lib/libvulkan_radeon.so
#5 0x0000000001249b46 in draw_list_draw () at drivers/vulkan/rendering_device_vulkan.cpp:6228
#6 0x000000000318ab81 in _render_item () at servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:919
#7 0x000000000318bb17 in _render_items () at servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:1362
#8 0x000000000318c3b0 in canvas_render_items () at servers/rendering/rasterizer_rd/rasterizer_canvas_rd.cpp:1524
#9 0x00000000031738ea in _render_canvas_item_tree () at servers/rendering/rendering_server_canvas.cpp:71
#10 0x00000000031747f5 in render_canvas () at servers/rendering/rendering_server_canvas.cpp:265
#11 0x0000000002f94804 in _draw_viewport () at servers/rendering/rendering_server_viewport.cpp:257
#12 0x0000000002f95314 in draw_viewports () at servers/rendering/rendering_server_viewport.cpp:413
#13 0x0000000002f6b2d0 in draw () at servers/rendering/rendering_server_raster.cpp:112
#14 0x000000000049d130 in iteration () at main/main.cpp:2437
#15 0x0000000000466ec6 in run () at platform/linuxbsd/os_linuxbsd.cpp:240
#16 0x0000000000464c03 in main () at platform/linuxbsd/godot_linuxbsd.cpp:58
So I would assume the problem comes from the newer Vulkan device drivers used with Godot? Vulkan is also really susceptible to bugs specifically on AMD GPUs, because other drivers (for example NVIDIA's) don't report all kinds of behavior that is wrong according to the documentation.
I used scons platform=linuxbsd target=debug use_llvm=yes -j$(nproc) and scons platform=linuxbsd target=release_debug use_llvm=yes -j$(nproc) to compile. Both resulted in the same crash.
@TheJackiMonster Please open a dedicated bug report for this. This issue is specifically about a segfault that happens when using optimizations with GCC. Not with debug builds, and not with LLVM.
The root cause in Godot's code might be similar, but until this is established, they should be treated as separate bugs.
Godot version: master (0cd98ec7e13038d09a77cf821e930be79026f943)
OS/device including version: Mageia 8 (Linux)
Issue specific to the AMD GPU, using the radv driver from Mesa 20.1.1.
Compiling with:
Issue description: While I can run Godot fine with a tools=yes target=debug build (default parameters), an optimized tools=yes target=release_debug build crashes on the AMD Radeon RX Vega M GPU with Mesa's radv.
If I force the use of the Intel GPU, it doesn't crash.
Backtrace (I built with target=debug debug_symbols=full to have more debug info):
This is likely a Mesa driver bug, but opening an issue here to track it and see if it might be triggered by something specific we do.
The only difference between target=debug and target=release_debug is the optimization level (-O2) and the use of -rdynamic. I suspect that the optimization might be the problem, I'll do some tests to figure it out, and open a Mesa bug report.
Steps to reproduce:
scons p=linuxbsd tools=yes target=release_debug debug_symbols=full
Minimal reproduction project: Crashes even in the project manager.