gazebosim / gz-sim

Open source robotics simulator. The latest version of Gazebo.
https://gazebosim.org
Apache License 2.0

Gazebo (Fortress) GUI crashes at exit when `lto` is enabled #2301

Closed azeey closed 7 months ago

azeey commented 9 months ago

Environment

Description

Steps to reproduce

  1. Build gz-sim with `cmake -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON .`
  2. Run `ign gazebo -v4 shapes.sdf`
  3. Press Ctrl-C in the terminal

Output

Stack trace (most recent call last):
#30   Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
#29   Object "ign gazebo gui", at 0x5591240d31c4, in _start
#28   Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7fbad0029e3f, in __libc_start_main
#27   Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7fbad0029d8f, in
#26   Object "ign gazebo gui", at 0x5591240d317e, in
#25   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad04a8e19, in ruby_run_node
#24   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad04a5317, in
#23   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad063a30c, in rb_vm_exec
#22   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad0634c96, in
#21   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad0631fc5, in
#20   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad062fc34, in
#19   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad057ba1e, in
#18   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad04a69ac, in rb_protect
#17   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad063ec61, in rb_yield
#16   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad063a30c, in rb_vm_exec
#15   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad0634c96, in
#14   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad0631fc5, in
#13   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad062fc34, in
#12   Object "/usr/lib/x86_64-linux-gnu/ruby/3.0.0/fiddle.so", at 0x7fbad023f44b, in
#11   Object "/lib/x86_64-linux-gnu/libruby-3.0.so.3.0", at 0x7fbad05fd088, in rb_nogvl
#10   Object "/usr/lib/x86_64-linux-gnu/ruby/3.0.0/fiddle.so", at 0x7fbad023ed6b, in
#9    Object "/lib/x86_64-linux-gnu/libffi.so.8", at 0x7fbad0230492, in
#8    Object "/lib/x86_64-linux-gnu/libffi.so.8", at 0x7fbad0233e2d, in
#7    Object "~/ws/fortress/install_jammy/lib/libignition-gazebo6-ign.so.6.16.0", at 0x7fbacc971aac, in runGui
#6    Object "~/ws/fortress/install_jammy/lib/libignition-gazebo6-gui.so.6", at 0x7fbacba89bcb, in ignition::gazebo::v6::gui::runGui(int&, char**, char const*, char const*, int, char const*)
#5    Object "/lib/x86_64-linux-gnu/libignition-gui6.so.6", at 0x7fbacb99a18c, in ignition::gui::Application::~Application()
#4    Object "/lib/x86_64-linux-gnu/libQt5Core.so.5", at 0x7fbaca4ef923, in QObject::~QObject()
#3    Object "/lib/x86_64-linux-gnu/libQt5Core.so.5", at 0x7fbaca4e4a6d, in QObjectPrivate::deleteChildren()
#2    Object "~/ws/fortress/install_jammy/lib/libignition-gazebo6-gui.so.6", at 0x7fbacba95d6c, in ignition::gazebo::v6::GuiRunner::~GuiRunner()
#1    Object "~/ws/fortress/install_jammy/lib/libignition-gazebo6-gui.so.6", at 0x7fbacba95d51, in ignition::gazebo::v6::GuiRunner::~GuiRunner()
#0    Object "~/ws/fortress/install_jammy/lib/libignition-gazebo6-gui.so.6", at 0x7fbacba9d85b, in void ignition::utils::detail::DefaultDelete<ignition::gazebo::v6::GuiRunner::Implementation>(ignition::gazebo::v6::GuiRunner::Implementation*)
Segmentation fault (Address not mapped to object [0x7fba9818ed50])

azeey commented 9 months ago

The backtrace indicates that this is related to https://github.com/gazebosim/gz-sim/issues/1443. We solved that issue in Garden by not deleting plugins from memory when they were unloaded, but the fix was considered a behavior change and was not backported to Fortress.

Backtrace

```
Thread 1 "ruby" received signal SIGSEGV, Segmentation fault.
0x00007ffff2f01438 in std::default_delete::operator() (this=0x7ffe750f2a50, __ptr=0x7ffe74909a00) at /usr/include/c++/11/bits/unique_ptr.h:85
85 delete __ptr;
~/ws/fortress/src/gz-gui - std::default_delete::operator()
(gdb) bt
#0 0x00007ffff2f01438 in std::default_delete::operator()(ignition::common::Event*) const (this=0x7ffe750f2a50, __ptr=0x7ffe74909a00) at /usr/include/c++/11/bits/unique_ptr.h:85
#1 0x00007ffff2e76d76 in std::unique_ptr >::~unique_ptr() (this=0x7ffe750f2a50, this=) at /usr/include/c++/11/bits/unique_ptr.h:361
#2 0x00007ffff2e76ab2 in std::pair const, std::unique_ptr > >::~pair() (this=0x7ffe750f2a48, this=) at /usr/include/c++/11/bits/stl_pair.h:211
#3 0x00007ffff2efe4b0 in __gnu_cxx::new_allocator const, std::unique_ptr > >, true> >::destroy const, std::unique_ptr > > >(std::pair const, std::unique_ptr > >*) (this=0x555556ade9f8, __p=0x7ffe750f2a48) at /usr/include/c++/11/ext/new_allocator.h:168
#4 0x00007ffff2efa57f in std::allocator_traits const, std::unique_ptr > >, true> > >::destroy const, std::unique_ptr > > >(std::allocator const, std::unique_ptr > >, true> >&, std::pair const, std::unique_ptr > >*) (__a=..., __p=0x7ffe750f2a48) at /usr/include/c++/11/bits/alloc_traits.h:535
#5 0x00007ffff2ef742b in std::__detail::_Hashtable_alloc const, std::unique_ptr > >, true> > >::_M_deallocate_node(std::__detail::_Hash_node const, std::unique_ptr > >, true>*) (this=0x555556ade9f8, __n=0x7ffe750f2a40) at /usr/include/c++/11/bits/hashtable_policy.h:1894
#6 0x00007ffff2ef3e19 in std::__detail::_Hashtable_alloc const, std::unique_ptr > >, true> > >::_M_deallocate_nodes(std::__detail::_Hash_node const, std::unique_ptr > >, true>*) (this=0x555556ade9f8, __n=0x0) at /usr/include/c++/11/bits/hashtable_policy.h:1916
#7 0x00007ffff2eef958 in std::_Hashtable, std::pair const, std::unique_ptr > >, std::allocator const, std::unique_ptr > > >, std::__detail::_Select1st, ignition::gazebo::v6::EventManager::EqualTo, ignition::gazebo::v6::EventManager::Hasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits >::clear() (this=0x555556ade9f8) at /usr/include/c++/11/bits/hashtable.h:2320
#8 0x00007ffff2e7431a in std::_Hashtable, std::pair const, std::unique_ptr > >, std::allocator const, std::unique_ptr > > >, std::__detail::_Select1st, ignition::gazebo::v6::EventManager::EqualTo, ignition::gazebo::v6::EventManager::Hasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits >::~_Hashtable() (this=0x555556ade9f8, this=) at /usr/include/c++/11/bits/hashtable.h:1532
#9 0x00007ffff2e737a2 in std::unordered_map, std::unique_ptr >, ignition::gazebo::v6::EventManager::Hasher, ignition::gazebo::v6::EventManager::EqualTo, std::allocator const, std::unique_ptr > > > >::~unordered_map() (this=0x555556ade9f8, this=) at /usr/include/c++/11/bits/unordered_map.h:102
#10 0x00007ffff2e737da in ignition::gazebo::v6::EventManager::~EventManager() (this=0x555556ade9f8, this=) at ~/ws/fortress/src/gz-sim/include/gz/sim/EventManager.hh:60
#11 0x00007ffff2e74358 in ignition::gazebo::v6::GuiRunner::Implementation::~Implementation() (this=0x555556ade890, this=) at ~/ws/fortress/src/gz-sim/src/gui/GuiRunner.cc:49
#12 0x00007ffff2eeadea in ignition::utils::detail::DefaultDelete(ignition::gazebo::v6::GuiRunner::Implementation*) (_ptr=0x555556ade890) at ~/ws/fortress/install_jammy/include/ignition/utils1/gz/utils/detail/DefaultOps.hh:48
#13 0x00007ffff2e739cc in std::unique_ptr::~unique_ptr() (this=0x555556adbae0, this=) at /usr/include/c++/11/bits/unique_ptr.h:361
#14 0x00007ffff2e72d60 in ignition::gazebo::v6::GuiRunner::~GuiRunner() (this=0x555556adbad0, this=) at ~/ws/fortress/src/gz-sim/src/gui/GuiRunner.cc:162
#15 0x00007ffff2ec2854 in ignition::gazebo::v6::GuiRunner::~GuiRunner() (this=0x555556adbad0, this=) at ~/ws/fortress/src/gz-sim/src/gui/GuiRunner.cc:162
#16 0x00007ffff16e4a6e in QObjectPrivate::deleteChildren() () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#17 0x00007ffff16ef924 in QObject::~QObject() () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#18 0x00007ffff2d11b9d in ignition::gui::Application::~Application() (this=0x555555df5980, __in_chrg=) at ~/ws/fortress/src/gz-gui/src/Application.cc:175
#19 0x00007ffff2eafbb6 in std::default_delete::operator()(ignition::gui::Application*) const (this=0x7fffffffc718, __ptr=0x555555df5980) at /usr/include/c++/11/bits/unique_ptr.h:85
#20 0x00007ffff2e60e78 in std::unique_ptr >::~unique_ptr() (this=0x7fffffffc718, this=) at /usr/include/c++/11/bits/unique_ptr.h:361
#21 0x00007ffff2eaeabf in ignition::gazebo::v6::gui::runGui(int&, char**, char const*, char const*, int, char const*) (_argc=@0x7fffffffc79c: 1, _argv=0x7fffffffc7a0, _guiConfig=0x5555559a62e0 "", _sdfFile=0x5555559a63f8 "", _waitGui=0, _renderEngine=0x5555559a6290 "") at ~/ws/fortress/src/gz-sim/src/gui/Gui.cc:477
#22 0x00007ffff334dd03 in runGui(char const*, char const*, int, char const*) (_guiConfig=0x5555559a62e0 "", _file=0x5555559a63f8 "", _waitGui=0, _renderEngine=0x5555559a6290 "") at ~/ws/fortress/src/gz-sim/src/gz.cc:429
#23 0x00007ffff7a3de2e in () at /lib/x86_64-linux-gnu/libffi.so.8
#24 0x00007ffff7a3a493 in () at /lib/x86_64-linux-gnu/libffi.so.8
#25 0x00007ffff7a48d6c in () at /usr/lib/x86_64-linux-gnu/ruby/3.0.0/fiddle.so
#26 0x00007ffff7dfd089 in rb_nogvl () at /lib/x86_64-linux-gnu/libruby-3.0.so.3.0
#27 0x00007ffff7a4944c in () at /usr/lib/x86_64-linux-gnu/ruby/3.0.0/fiddle.so
#28 0x00007ffff7e2fc35 in () at /lib/x86_64-linux-gnu/libruby-3.0.so.3.0
#29 0x00007ffff7e31fc6 in () at /lib/x86_64-linux-gnu/libruby-3.0.so.3.0
#30 0x00007ffff7e34c97 in () at /lib/x86_64-linux-gnu/libruby-3.0.so.3.0
#31 0x00007ffff7e3a30d in rb_vm_exec () at /lib/x86_64-linux-gnu/libruby-3.0.so.3.0
#32 0x00007ffff7ca5318 in () at /lib/x86_64-linux-gnu/libruby-3.0.so.3.0
#33 0x00007ffff7ca8e1a in ruby_run_node () at /lib/x86_64-linux-gnu/libruby-3.0.so.3.0
#34 0x000055555555517f in ()
#35 0x00007ffff7829d90 in __libc_start_call_main (main=main@entry=0x555555555120, argc=argc@entry=4, argv=argv@entry=0x7fffffffd008) at ../sysdeps/nptl/libc_start_call_main.h:58
#36 0x00007ffff7829e40 in __libc_start_main_impl (main=0x555555555120, argc=4, argv=0x7fffffffd008, init=, fini=, rtld_fini=, stack_end=0x7fffffffcff8) at ../csu/libc-start.c:392
#37 0x00005555555551c5 in _start ()
~/ws/fortress/src/gz-gui - std::default_delete::operator()
```

So we have two solutions:

  1. Disable lto in our debs
  2. Backport https://github.com/gazebosim/gz-plugin/pull/102 and https://github.com/gazebosim/gz-gui/pull/469/ to Fortress.

I'm leaning toward (1) to avoid behavior changes. Note that lto was added to deb builds starting in dpkg 1.21.0 (see https://wiki.debian.org/ToolChain/LTO). Focal has 1.19.7, so we are building without lto there and the crash doesn't occur on Focal. Jammy has 1.21.1, which is where we're seeing this crash. Going with (1) might mean reduced performance, but we haven't done any testing with lto, so it's not clear how much of a performance gain we'd be giving up.