godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Segfault in `template_release` build due to optimization level (GCC `-O2`) #85263

akien-mga closed 12 months ago

akien-mga commented 12 months ago

Godot version

4.2.rc2 (ad72de508363ca8d10c6b148be44a02cdf12be13)

System information

Mageia 9 - Vulkan (Forward+) - dedicated AMD Radeon RX Vega M GL Graphics (RADV VEGAM) () - Intel(R) Core(TM) i7-8705G CPU @ 3.10GHz (8 Threads)

Issue description

Spin-off from #70910, where the MRP from https://github.com/godotengine/godot/issues/70910#issuecomment-1758935677 seems to trigger a segfault (on Linux with GCC at least), but only with 4.2 builds, so it appears to be a regression of some sort.

The crash is reproduced when compiling Godot with `scons p=linuxbsd target=template_release`, to which `debug_symbols=yes` can be added to have a nice stacktrace.

Here's the log output and stacktrace:

ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:1033)
ERROR: Parameter "canvas_item" is null.
   at: canvas_item_clear (./servers/rendering/renderer_canvas_cull.cpp:1558)

Thread 1 "godot.linuxbsd." received signal SIGSEGV, Segmentation fault.
0x0000000003c79105 in Object::notification (this=0x6159910, p_notification=30, p_reversed=<optimized out>) at ./core/object/object.cpp:837
837                     _notificationv(p_notification, p_reversed);
(gdb) bt
#0  0x0000000003c79105 in Object::notification (this=0x6159910, p_notification=30, p_reversed=<optimized out>) at ./core/object/object.cpp:837
#1  0x0000000001fa3557 in CanvasItem::_redraw_callback (this=0x6159910) at ./scene/main/canvas_item.cpp:140
#2  CanvasItem::_redraw_callback (this=0x6159910) at ./scene/main/canvas_item.cpp:129
#3  0x0000000003c7e72d in CallQueue::_call_function (this=this@entry=0x44c85d0, p_callable=..., p_args=p_args@entry=0x5fdc238, p_argcount=0, p_show_error=<optimized out>) at ./core/object/message_queue.cpp:219
#4  0x0000000003c97425 in CallQueue::flush (this=0x44c85d0) at ./core/object/message_queue.cpp:324
#5  0x0000000002052afc in SceneTree::physics_process (this=0x5feb600, p_time=0.016666666666666666) at ./scene/main/scene_tree.cpp:471
#6  0x000000000108f2f4 in Main::iteration () at main/main.cpp:3598
#7  0x00000000010244c1 in OS_LinuxBSD::run (this=this@entry=0x7fffffffcf80) at platform/linuxbsd/os_linuxbsd.cpp:933
#8  0x0000000001023afb in main (argc=<optimized out>, argv=0x7fffffffd538) at platform/linuxbsd/godot_linuxbsd.cpp:74
(gdb) 
(gdb) frame 0
#0  0x0000000003c79105 in Object::notification (this=0x6159910, p_notification=30, p_reversed=<optimized out>) at ./core/object/object.cpp:837
837                     _notificationv(p_notification, p_reversed);
(gdb) print p_notification
$1 = 30
(gdb) print p_reversed
$2 = <optimized out>
(gdb) print this
$3 = (Object * const) 0x6159910
(gdb) print _notificationv
$4 = {void (Object * const, int, bool)} 0x1033be0 <Object::_notificationv(int, bool)>

The main problem seems to be the optimization level.

With `target=template_release`, the default optimization level is `optimize=speed_trace`, which for GCC sets `-O2`.

I tested a custom build with `-O1` (`optimize=debug CCFLAGS=-O1`), which also reproduces the bug. That gives slightly more info in the stacktrace:

0x0000000003833083 in Object::notification (this=this@entry=0x5c4dbe0, p_notification=p_notification@entry=30, p_reversed=p_reversed@entry=false) at ./core/object/object.cpp:837
837                     _notificationv(p_notification, p_reversed);
(gdb) bt
#0  0x0000000003833083 in Object::notification (this=this@entry=0x5c4dbe0, p_notification=p_notification@entry=30, p_reversed=p_reversed@entry=false) at ./core/object/object.cpp:837
#1  0x0000000001ca06c4 in CanvasItem::_redraw_callback (this=0x5c4dbe0) at ./scene/main/canvas_item.cpp:140
#2  0x0000000001cc2129 in call_with_variant_args_helper<CanvasItem>(CanvasItem*, void (CanvasItem::*)(), Variant const**, Callable::CallError&, IndexSequence<>) (r_error=..., p_args=<optimized out>, p_method=<optimized out>, 
    p_instance=<optimized out>) at ./core/variant/binder_common.h:305
#3  call_with_variant_args<CanvasItem> (r_error=..., p_argcount=<optimized out>, p_args=<optimized out>, p_method=<optimized out>, p_instance=<optimized out>) at ./core/variant/binder_common.h:417
#4  CallableCustomMethodPointer<CanvasItem>::call (this=<optimized out>, p_arguments=<optimized out>, p_argcount=<optimized out>, r_return_value=..., r_call_error=...) at ./core/object/callable_method_pointer.h:104
#5  0x000000000363e2c1 in Callable::callp (this=this@entry=0x5ac8240, p_arguments=p_arguments@entry=0x0, p_argcount=p_argcount@entry=0, r_return_value=..., r_call_error=...) at ./core/variant/callable.cpp:57
#6  0x000000000383a2a8 in CallQueue::_call_function (this=this@entry=0x3fb35e0, p_callable=..., p_args=p_args@entry=0x5ac8258, p_argcount=0, p_show_error=<optimized out>) at ./core/object/message_queue.cpp:219
#7  0x000000000384e773 in CallQueue::flush (this=0x3fb35e0) at ./core/object/message_queue.cpp:324
#8  0x0000000001d53f34 in SceneTree::physics_process (this=0x5ad7400, p_time=0.016666666666666666) at ./scene/main/scene_tree.cpp:471
#9  0x000000000101d67a in Main::iteration () at main/main.cpp:3598
#10 0x0000000000fbe6a4 in OS_LinuxBSD::run (this=this@entry=0x7fffffffcf80) at platform/linuxbsd/os_linuxbsd.cpp:933
#11 0x0000000000fbe4a5 in main (argc=<optimized out>, argv=0x7fffffffd538) at platform/linuxbsd/godot_linuxbsd.cpp:74

With `-O0` (`optimize=none`, or `optimize=custom CCFLAGS=-O0`), there's no crash, just endless error spam like in #70910.

Steps to reproduce

Minimal reproduction project

Freebug.zip

bruvzg commented 12 months ago

With `template_release` build on macOS, I'm getting a lot of `ERROR: Condition "slot >= slot_max" is true. Returning: nullptr` messages (but not endless spam) and no crash.

YuriSizov commented 12 months ago

> With `template_release` build on macOS, I'm getting a lot of `ERROR: Condition "slot >= slot_max" is true. Returning: nullptr` messages (but not endless spam) and no crash.

Same for me on Windows (MSVC if that's important again).

Edit: By the way, added a custom message so there is at least some information in release builds, and the overflow is pretty impressive:

ERROR: Cannot get instance from ObjectDB, slot index 8957152 (id: 1254273625312) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)
ERROR: Cannot get instance from ObjectDB, slot index 8957152 (id: 1254273625312) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)
ERROR: Cannot get instance from ObjectDB, slot index 15307384 (id: -8646787037943721352) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)
ERROR: Cannot get instance from ObjectDB, slot index 15307384 (id: -8646787037943721352) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)
ERROR: Cannot get instance from ObjectDB, slot index 9782608 (id: 1254274450768) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)
ERROR: Cannot get instance from ObjectDB, slot index 9782608 (id: 1254274450768) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)
ERROR: Cannot get instance from ObjectDB, slot index 8776596 (id: -8646601220619375724) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)
ERROR: Cannot get instance from ObjectDB, slot index 8776596 (id: -8646601220619375724) is exceeding max 2048
   at: (C:\Projects\godot-engine\master\core/object/object.h:1034)

Edit2: Actually, what the hell? How is id printed here, which is supposed to be a uint64 value, negative? Or is this just a bug/limitation with vformat?
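The slot indices in the messages above can be reproduced from the printed ids. This is a minimal sketch, not the engine's code: it assumes the slot is stored in the low 24 bits of the id, which is consistent with every id/slot pair in the log. The names `kSlotBits`, `slot_of`, and `slot_in_range` are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Assumption: the low 24 bits of an object id hold the slot index.
// This matches the id/slot pairs printed in the log above.
constexpr uint64_t kSlotBits = 24;
constexpr uint64_t kSlotMask = (uint64_t(1) << kSlotBits) - 1;

// Extract the slot index from an id.
constexpr uint64_t slot_of(uint64_t id) {
	return id & kSlotMask;
}

// Mirror of the range check that produces the error:
// a slot at or beyond slot_max cannot refer to a live object.
constexpr bool slot_in_range(uint64_t id, uint64_t slot_max) {
	return slot_of(id) < slot_max;
}

// Values taken from the log output above.
static_assert(slot_of(1254273625312ULL) == 8957152ULL, "first id from the log");
static_assert(slot_of(1254274450768ULL) == 9782608ULL, "third id from the log");
static_assert(!slot_in_range(1254273625312ULL, 2048), "slot exceeds max 2048");
```

Under that assumption, a slot in the millions against a maximum of 2048 means the id itself is not a value the object database ever handed out.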

bruvzg commented 12 months ago

> overflow is pretty impressive

It's likely just random garbage from corrupted memory; the issue should be in the CallQueue.

> Actually, what the hell? How is id printed here, which is supposed to be a uint64 value, negative?

GDScript's int (and therefore vformat) is always a signed 64-bit integer.
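To illustrate the point, here is a small sketch of how an unsigned 64-bit id turns negative once it passes through a signed 64-bit integer. The helper names `as_signed` and `top_bit_set` are hypothetical. The unsigned value is the two's-complement counterpart of the negative id from the log above. Note that this narrowing conversion is implementation-defined before C++20, although mainstream compilers wrap modulo 2^64.

```cpp
#include <cstdint>

// Reinterpret an unsigned 64-bit value as signed, as happens when a
// uint64 id is stored in a signed 64-bit integer before formatting.
constexpr int64_t as_signed(uint64_t value) {
	return static_cast<int64_t>(value);
}

// Any value with bit 63 set prints as negative once treated as signed.
constexpr bool top_bit_set(uint64_t value) {
	return (value >> 63) != 0;
}

// 2^64 - 8646787037943721352, i.e. the unsigned form of the logged id.
constexpr uint64_t kLoggedId = 9799957035765830264ULL;

static_assert(top_bit_set(kLoggedId), "bit 63 is set");
static_assert(as_signed(kLoggedId) == -8646787037943721352LL, "matches the log");
```

So the negative number is not a second bug. It is the same garbage id, shown through a signed type.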

bruvzg commented 12 months ago

The issue seems to be not in the CallQueue, but in CallableCustomMethodPointer: