ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0
563 stars 426 forks source link

std::runtime_error what(): 'data' is empty with component container multithread #2423

Open giafranchini opened 9 months ago

giafranchini commented 9 months ago

My setup:

I am developing a state estimation application based on fuse package on my robot. I would like to have two separate callback groups, a dedicated one for a rclpp::Timer here, and the node default one for handling callbacks from rclcpp::Subscriptions. In order for the two to work in parallel, I would like to use a multithreaded executor with num_threads = 2. However, when I launch my node inside a component_container_mt it shuts down with the following error:

[component_container_mt-3] terminate called after throwing an instance of 'std::runtime_error' [component_container_mt-3] what(): 'data' is empty

I am running into this issue while deploying the package on my robot architecture, which is a VersaLogic Copperhead VL-EBX-41.

clalancette commented 9 months ago

Can you please provide some example code that shows the problem? That would help a lot in debugging this.

giafranchini commented 9 months ago

It's difficult to provide a simple example since fuse is a quite articulated package and a lot of its subsytems are needed to reproduce. I can try to explain better:

There is a main node, the optimizer, which during construction:

During plugins initialization, a new executor and callback group are created with the same context as the main node, and a new thread is started to let the executor spin. The process can be found here: https://github.com/giafranchini/fuse/blob/25b4b9ec4a36886a726f661818cbff4abeebb7d4/fuse_core/src/async_sensor_model.cpp#L59

fujitatomoya commented 9 months ago

https://github.com/ros2/rclcpp/issues/2242 looks similar problem. Edit: probably not, sorry

giafranchini commented 9 months ago

I setted up a minimal example in which I am able to reproduce the error. Please follow these steps:

This will run a simple example using bags already available in the package. I also report here the core part of the launch file:

container_fuse = ComposableNodeContainer(
        name='container_fuse',
        namespace='',
        package='rclcpp_components',
        executable='component_container_mt',
        composable_node_descriptions=[
            ComposableNode(
                package='fuse_optimizers',
                plugin='fuse_optimizers::FixedLagSmootherComponent',
                name='state_estimator',
                parameters=[PathJoinSubstitution([
                    pkg_dir, 'config', 'fuse_simple_tutorial.yaml'
                ])],
                extra_arguments=[{'use_intra_process_comms': True}],
            ),
        ],
        output='both',
)
alsora commented 9 months ago

@giafranchini could you also provide a backtrace for the error?

giafranchini commented 9 months ago

Sure, here it is:

[component_container_mt-1] [INFO] [1707748366.188400210] [container_fuse]: Load Library: /home/giacomo/ros2_iron_ws/install/fuse_optimizers/lib/libfixed_lag_smoother_component.so
[component_container_mt-1] [INFO] [1707748366.258572968] [container_fuse]: Found class: rclcpp_components::NodeFactoryTemplate<fuse_optimizers::FixedLagSmootherComponent>
[component_container_mt-1] [INFO] [1707748366.258688353] [container_fuse]: Instantiate class: rclcpp_components::NodeFactoryTemplate<fuse_optimizers::FixedLagSmootherComponent>
[component_container_mt-1] [INFO] [1707748366.447123820] [state_estimator]: Received a set_pose request (stamp: 1703239942730964210, x: 0, y: 0, z: 0, roll: 0, pitch: 0, yaw: 0)
[component_container_mt-1] [WARN] [1707748366.468987638] [fuse]: Second argument to joinTopicName is absolute! Returning it.
[component_container_mt-1] [INFO] [1707748366.471217372] [state_estimator]: FixedLagSmootherComponent constructed
[INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/state_estimator' in container '/container_fuse'
[component_container_mt-1] terminate called after throwing an instance of 'std::runtime_error'
[component_container_mt-1]   what():  'data' is empty
[ERROR] [component_container_mt-1]: process has died [pid 76513, exit code -6, cmd '/home/giacomo/ros2_iron/install/rclcpp_components/lib/rclcpp_components/component_container_mt --ros-args -r __node:=container_fuse -r __ns:=/ -p use_sim_time:=True'].

I also noticed that the error does not show up immediately every time I launch the container: sometimes the node is able to run for some small time before dying with the same error.

alsora commented 9 months ago

@giafranchini that's not a backtrace, but just the console log. See here a tutorial on how to get a backtrace: https://navigation.ros.org/tutorials/docs/get_backtrace.html

giafranchini commented 9 months ago

Sorry, here it is:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737104520768)
    at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737104520768)
    at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737104520768, signo=signo@entry=6)
    at ./nptl/pthread_kill.c:89
#3  0x00007ffff6642476 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/posix/raise.c:26
#4  0x00007ffff66287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff6aa2b9e in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff6aae20c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff6aae277 in std::terminate() ()
   from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007ffff6aae4d8 in __cxa_throw ()
   from /lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007fffbd88e75d in fuse_core::CallbackAdapter::execute (
    this=<optimized out>, data=...)
    at /home/giacomo/ros2_iron_ws/src/fuse/fuse_core/src/callback_wrapper.cpp:113
#10 0x00007ffff790b95c in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#11 0x00007ffff7936b91 in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#12 0x00007ffff7938c4d in void std::__invoke_impl<void, void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>(std::__invoke_memfun_deref, void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&) ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#13 0x00007ffff7938b5f in std::__invoke_result<void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>::type std::__invoke<void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&>(void (rclcpp::executors::MultiThreadedExecutor::*&)(unsigned long), rclcpp::executors::MultiThreadedExecutor*&, unsigned long&) ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#14 0x00007ffff7938a68 in void std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>)
    () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#15 0x00007ffff79389ab in void std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>::operator()<, void>() ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#16 0x00007ffff7938952 in void std::__invoke_impl<void, std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, --Type <RET> for more, q to quit, c to continue without paging--
unsigned long))(unsigned long)>>(std::__invoke_other, std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>&&) ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#17 0x00007ffff79388fb in std::__invoke_result<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>>::type std::__invoke<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>>(std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)>&&)
    () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#18 0x00007ffff793889c in void std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > >::_M_invoke<0ul>(std::_Index_tuple<0ul>) () from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#19 0x00007ffff793886c in std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > >::operator()() ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#20 0x00007ffff793884c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::_Bind<void (rclcpp::executors::MultiThreadedExecutor::*(rclcpp::executors::MultiThreadedExecutor*, unsigned long))(unsigned long)> > > >::_M_run() ()
   from /home/giacomo/ros2_iron/install/rclcpp/lib/librclcpp.so
#21 0x00007ffff6adc253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#22 0x00007ffff6694ac3 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:442
#23 0x00007ffff6726850 in clone3 ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
dsandber commented 8 months ago

I'm seeing this also with Ubuntu 22.04, ROS Iron from ubuntu binaries, and the multi-threaded executor.

move_group-3] terminate called after throwing an instance of 'std::runtime_error'
[move_group-3]   what():  'data' is empty
[move_group-3] Stack trace (most recent call last) in thread 1495:
[move_group-3] #13   Object "/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1", at 0xffffffffffffffff, in
[move_group-3] #12   Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffb5fb5edb, in
[move_group-3] #11   Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffb5f4d5c7, in
[move_group-3] #10   Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0xffffb61831fb, in
[move_group-3] #9    Object "/opt/ros/iron/lib/librclcpp.so", at 0xffffb643b32b, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long)
[move_group-3] #8    Object "/opt/ros/iron/lib/librclcpp.so", at 0xffffb642b297, in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&)
[move_group-3] #7    Object "/opt/ros/iron/lib/librclcpp_action.so", at 0xffffb48b0b0b, in rclcpp_action::ServerBase::exec

@giafranchini, found a work-around other than not using the multi-threaded executor?

I have my own thread publishing JointStates -- is that problematic?

giafranchini commented 8 months ago

@dsandber I found out that in my application there is a class which inherits from rclcpp::Waitable which is causing the error since sometimes with the multi-threaded executor the data argument in the class execute method is nullptr, but I am not sure why this is happening. Anyway, I was able to make it work adding the waitables to a dedicated Reentrant callback group. Hope this helps