Open giafranchini opened 9 months ago
Can you please provide some example code that shows the problem? That would help a lot in debugging this.
It's difficult to provide a simple example since fuse is a quite articulated package and a lot of its subsytems are needed to reproduce. I can try to explain better:
There is a main node, the optimizer, which during construction:
During plugins initialization, a new executor and callback group are created with the same context as the main node, and a new thread is started to let the executor spin. The process can be found here: https://github.com/giafranchini/fuse/blob/25b4b9ec4a36886a726f661818cbff4abeebb7d4/fuse_core/src/async_sensor_model.cpp#L59
https://github.com/ros2/rclcpp/issues/2242 looks similar problem. Edit: probably not, sorry
I setted up a minimal example in which I am able to reproduce the error. Please follow these steps:
feature-mt-executor
branchros2 launch fuse_tutorials fuse_component_mt.launch.py
This will run a simple example using bags already available in the package. I also report here the core part of the launch file:
container_fuse = ComposableNodeContainer(
name='container_fuse',
namespace='',
package='rclcpp_components',
executable='component_container_mt',
composable_node_descriptions=[
ComposableNode(
package='fuse_optimizers',
plugin='fuse_optimizers::FixedLagSmootherComponent',
name='state_estimator',
parameters=[PathJoinSubstitution([
pkg_dir, 'config', 'fuse_simple_tutorial.yaml'
])],
extra_arguments=[{'use_intra_process_comms': True}],
),
],
output='both',
)
@giafranchini could you also provide a backtrace for the error?
Sure, here it is:
[component_container_mt-1] [INFO] [1707748366.188400210] [container_fuse]: Load Library: /home/giacomo/ros2_iron_ws/install/fuse_optimizers/lib/libfixed_lag_smoother_component.so
[component_container_mt-1] [INFO] [1707748366.258572968] [container_fuse]: Found class: rclcpp_components::NodeFactoryTemplate<fuse_optimizers::FixedLagSmootherComponent>
[component_container_mt-1] [INFO] [1707748366.258688353] [container_fuse]: Instantiate class: rclcpp_components::NodeFactoryTemplate<fuse_optimizers::FixedLagSmootherComponent>
[component_container_mt-1] [INFO] [1707748366.447123820] [state_estimator]: Received a set_pose request (stamp: 1703239942730964210, x: 0, y: 0, z: 0, roll: 0, pitch: 0, yaw: 0)
[component_container_mt-1] [WARN] [1707748366.468987638] [fuse]: Second argument to joinTopicName is absolute! Returning it.
[component_container_mt-1] [INFO] [1707748366.471217372] [state_estimator]: FixedLagSmootherComponent constructed
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/state_estimator' in container '/container_fuse'
[component_container_mt-1] terminate called after throwing an instance of 'std::runtime_error'
[component_container_mt-1] what(): 'data' is empty
[ERROR] [component_container_mt-1]: process has died [pid 76513, exit code -6, cmd '/home/giacomo/ros2_iron/install/rclcpp_components/lib/rclcpp_components/component_container_mt --ros-args -r __node:=container_fuse -r __ns:=/ -p use_sim_time:=True'].
I also noticed that the error does not show up immediately every time I launch the container: sometimes the node is able to run for some small time before dying with the same error.
@giafranchini that's not a backtrace, but just the console log. See here a tutorial on how to get a backtrace: https://navigation.ros.org/tutorials/docs/get_backtrace.html
Sorry, here it is:
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737104520768)
at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140737104520768)
at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140737104520768, signo=signo@entry=6)
at ./nptl/pthread_kill.c:89
#3 0x00007ffff6642476 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/posix/raise.c:26
#4 0x00007ffff66287f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff6aa2b9e in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007ffff6aae20c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007ffff6aae277 in std::terminate() ()
from /lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007ffff6aae4d8 in __cxa_throw ()
from /lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007fffbd88e75d in fuse_core::CallbackAdapter::execute (
this=<optimized out>, data=...)
at /home/giacomo/ros2_iron_ws/src/fuse/fuse_core/src/callback_wrapper.cpp:113
#10 0x00007ffff790b95c in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#11 0x00007ffff7936b91 in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#12 0x00007ffff7938c4d in void std::__invoke_impl<void, void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>(std::__invoke_memfun_deref, void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&) ()
from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#13 0x00007ffff7938b5f in std::__invoke_result<void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>::type std::__invoke<void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>(void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&) ()
from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#14 0x00007ffff7938a68 in void std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>)
() from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#15 0x00007ffff79389ab in void std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>::operator()<, void>() ()
from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#16 0x00007ffff7938952 in void std::__invoke_impl<void, std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, --Type <RET> for more, q to quit, c to continue without paging--
unsigned long))(unsigned long)>>(std::__invoke_other, std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>&&) ()
from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#17 0x00007ffff79388fb in std::__invoke_result<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>>::type std::__invoke<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>>(std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>&&)
() from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#18 0x00007ffff793889c in void std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > >::_M_invoke<0ul>(std::_Index_tuple<0ul>) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#19 0x00007ffff793886c in std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > >::operator()() ()
from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#20 0x00007ffff793884c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > > >::_M_run() ()
from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#21 0x00007ffff6adc253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#22 0x00007ffff6694ac3 in start_thread (arg=<optimized out>)
at ./nptl/pthread_create.c:442
#23 0x00007ffff6726850 in clone3 ()
at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
I'm seeing this also with Ubuntu 22.04, ROS Iron from ubuntu binaries, and the multi-threaded executor.
move_group-3] terminate called after throwing an instance of 'std::runtime_error'
[move_group-3] what(): 'data' is empty
[move_group-3] Stack trace (most recent call last) in thread 1495:
[move_group-3] #13 Object "/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1", at 0xffffffffffffffff, in
[move_group-3] #12 Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffb5fb5edb, in
[move_group-3] #11 Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffb5f4d5c7, in
[move_group-3] #10 Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0xffffb61831fb, in
[move_group-3] #9 Object "/opt/ros/iron/lib/librclcpp.so", at 0xffffb643b32b, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long)
[move_group-3] #8 Object "/opt/ros/iron/lib/librclcpp.so", at 0xffffb642b297, in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&)
[move_group-3] #7 Object "/opt/ros/iron/lib/librclcpp_action.so", at 0xffffb48b0b0b, in rclcpp_action::ServerBase::exec
@giafranchini, found a work-around other than not using the multi-threaded executor?
I have my own thread publishing JointStates -- is that problematic?
@dsandber I found out that in my application there is a class which inherits from rclcpp::Waitable
which is causing the error since sometimes with the multi-threaded executor the data argument in the class execute
method is nullptr
, but I am not sure why this is happening. Anyway, I was able to make it work adding the waitables to a dedicated Reentrant callback group. Hope this helps
My setup:
I am developing a state estimation application based on fuse package on my robot. I would like to have two separate callback groups, a dedicated one for a
rclpp::Timer
here, and the node default one for handling callbacks fromrclcpp::Subscriptions
. In order for the two to work in parallel, I would like to use a multithreaded executor with num_threads = 2. However, when I launch my node inside acomponent_container_mt
it shuts down with the following error:[component_container_mt-3] terminate called after throwing an instance of 'std::runtime_error' [component_container_mt-3] what(): 'data' is empty
I am running into this issue while deploying the package on my robot architecture, which is a VersaLogic Copperhead VL-EBX-41.