ros-controls / ros2_controllers

Generic robotic controllers to accompany ros2_control
https://control.ros.org
Apache License 2.0
321 stars 290 forks source link

ros2_control_node crashes abruptly with error "what(): Taking data from action server but nothing is ready #981

Open watanabeyyy opened 5 months ago

watanabeyyy commented 5 months ago

I am building an application with ros2 iron using moveit and ros2control.

The application I created usually works without errors, but "sometimes" it crashes abruptly.

The errors at that time are as follows.

I would appreciate any insight you can give me.

[move_group-2] [INFO] [1705464328.148369918] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Controller '/A_joint_trajectory_controller' successfully finished 
[move_group-2] [INFO] [1705464328.148765774] [moveit.simple_controller_manager.follow_joint_trajectory_controller_handle]: Controller '/B_joint_trajectory_controller' successfully finished 
[ros2_control_node-3] terminate called after throwing an instance of 'std::runtime_error' 
[ros2_control_node-3] what(): Taking data from action server but nothing is ready 
[ros2_control_node-3] Stack trace (most recent call last): 
[ros2_control_node-3] #15 Object "", at 0xffffffffffffffff, in 
[ros2_control_node-3] #14 Object "/opt/ros/iron/lib/controller_manager/ros2_control_node", at 0x55d77ba19e04, in _start 
[ros2_control_node-3] #13 Source "../csu/libc-start.c", line 392, in __libc_start_main_impl [0x7f9969029e3f] 
[ros2_control_node-3] #12 Source "../sysdeps/nptl/libc_start_call_main.h", line 58, in__libc_start_call_main [0x7f9969029d8f] 
[ros2_control_node-3] #11 Object "/opt/ros/iron/lib/controller_manager/ros2_control_node", at 0x55d77ba199ef, in main 
[ros2_control_node-3] #10 Object "/opt/ros/iron/lib/librclcpp.so", at 0x7f996990ee14, in rclcpp::executors::MultiThreadedExecutor::spin() 
[ros2_control_node-3] #9 Object "/opt/ros/iron/lib/librclcpp.so", at 0x7f996990e9b1, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) 
[ros2_control_node-3] #8 Object "/opt/ros/iron/lib/librclcpp.so", at 0x7f99699059fe, in rclcpp::Executor::get_next_ready_executable_from_map(rclcpp::AnyExecutable&, std::map<std::weak_ptrrclcpp::CallbackGroup, std::weak_ptrrclcpp::node_interfaces::NodeBaseInterface, std::owner_less<std::weak_ptrrclcpp::CallbackGroup >, std::allocator<std::pair<std::weak_ptrrclcpp::CallbackGroup const, std::weak_ptrrclcpp::node_interfaces::NodeBaseInterface > > > const&) 
[ros2_control_node-3] #7 Object "/opt/ros/iron/lib/librclcpp_action.so", at 0x7f99345c4785, in 
[ros2_control_node-3] #6 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f99694ae4d7, in __cxa_throw 
[ros2_control_node-3] #5 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f99694ae276, in std::terminate() 
[ros2_control_node-3] #4 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f99694ae20b, in 
[ros2_control_node-3] #3 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7f99694a2b9d, in 
[ros2_control_node-3] #2 Source "./stdlib/abort.c", line 79, in abort [0x7f99690287f2] 
[ros2_control_node-3] #1 Source "../sysdeps/posix/raise.c", line 26, in raise [0x7f9969042475] 
[ros2_control_node-3] #0 | Source "./nptl/pthread_kill.c", line 89, in__pthread_kill_internal 
[ros2_control_node-3] | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation 
[ros2_control_node-3] Source "./nptl/pthread_kill.c", line 44, in__pthread_kill [0x7f99690969fc] 
[ros2_control_node-3] Aborted (Signal sent by tkill() 180811 1000) 
[ERROR] [ros2_control_node-3]: process has died [pid 180811, exit code -6, cmd '/opt/ros/iron/lib/controller_manager/ros2_control_node --ros-args --params-file /tmp/launch_params_44ibrmz5 --params-file /home/user/ros2_ws/***/ros2_control_config.yaml'].
christophfroehlich commented 5 months ago

@saikishor @destogl @bmagyar I yet don't know why, but if the action server of JTC throws this std::runtime_error, can this kill the controller manager and then ros2_control_node? How is this handled by pluginlib?

saikishor commented 5 months ago

@christophfroehlich Yes, this will happen. I remember us discussing this topic to make it fail-proof. I had a similar experience but with a different controller a few months back. This is the to-do list for the Jazzy and also to backport.

zacharyyamaoka commented 3 months ago

@watanabeyyy I am having the same problem as you...

Have you found any work arounds yet?

Thanks!

[ros2_control_node-1] terminate called after throwing an instance of 'std::runtime_error'
[ros2_control_node-1]   what():  Taking data from action server but nothing is ready
[ros2_control_node-1] Stack trace (most recent call last):
[ros2_control_node-1] #15   Object "", at 0xffffffffffffffff, in 
[ros2_control_node-1] #14   Object "/opt/ros/iron/lib/controller_manager/ros2_control_node", at 0x59eda8974e04, in _start
[ros2_control_node-1] #13   Source "../csu/libc-start.c", line 392, in __libc_start_main_impl [0x7a83afa29e3f]
[ros2_control_node-1] #12   Source "../sysdeps/nptl/libc_start_call_main.h", line 58, in __libc_start_call_main [0x7a83afa29d8f]
[ros2_control_node-1] #11   Object "/opt/ros/iron/lib/controller_manager/ros2_control_node", at 0x59eda89749ef, in main
[ros2_control_node-1] #10   Object "/opt/ros/iron/lib/librclcpp.so", at 0x7a83b030ea84, in rclcpp::executors::MultiThreadedExecutor::spin()
[ros2_control_node-1] #9    Object "/opt/ros/iron/lib/librclcpp.so", at 0x7a83b030e621, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long)
[ros2_control_node-1] #8    Object "/opt/ros/iron/lib/librclcpp.so", at 0x7a83b030585e, in rclcpp::Executor::get_next_ready_executable_from_map(rclcpp::AnyExecutable&, std::map<std::weak_ptr<rclcpp::CallbackGroup>, std::weak_ptr<rclcpp::node_interfaces::NodeBaseInterface>, std::owner_less<std::weak_ptr<rclcpp::CallbackGroup> >, std::allocator<std::pair<std::weak_ptr<rclcpp::CallbackGroup> const, std::weak_ptr<rclcpp::node_interfaces::NodeBaseInterface> > > > const&)
[ros2_control_node-1] #7    Object "/opt/ros/iron/lib/librclcpp_action.so", at 0x7a83a0259785, in 
[ros2_control_node-1] #6    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afeae4d7, in __cxa_throw
[ros2_control_node-1] #5    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afeae276, in std::terminate()
[ros2_control_node-1] #4    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afeae20b, in 
[ros2_control_node-1] #3    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afea2b9d, in 
[ros2_control_node-1] #2    Source "./stdlib/abort.c", line 79, in abort [0x7a83afa287f2]
[ros2_control_node-1] #1    Source "../sysdeps/posix/raise.c", line 26, in raise [0x7a83afa42475]
[ros2_control_node-1] #0  | Source "./nptl/pthread_kill.c", line 89, in __pthread_kill_internal
[ros2_control_node-1]     | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation
[ros2_control_node-1]       Source "./nptl/pthread_kill.c", line 44, in __pthread_kill [0x7a83afa969fc]
[ros2_control_node-1] Aborted (Signal sent by tkill() 15536 1000)
bmagyar commented 3 months ago

Can we reproduce this error with the demos repo?

On Tue, 19 Mar 2024, 18:15 Zachary Yamaoka, @.***> wrote:

@watanabeyyy https://github.com/watanabeyyy I am having the same problem as you...

Have you found any work arounds yet?

Thanks!

[ros2_control_node-1] terminate called after throwing an instance of 'std::runtime_error' [ros2_control_node-1] what(): Taking data from action server but nothing is ready [ros2_control_node-1] Stack trace (most recent call last): [ros2_control_node-1] #15 https://github.com/ros-controls/ros2_controllers/pull/15 Object "", at 0xffffffffffffffff, in [ros2_control_node-1] #14 https://github.com/ros-controls/ros2_controllers/pull/14 Object "/opt/ros/iron/lib/controller_manager/ros2_control_node", at 0x59eda8974e04, in _start [ros2_control_node-1] #13 https://github.com/ros-controls/ros2_controllers/pull/13 Source "../csu/libc-start.c", line 392, in libc_start_main_impl [0x7a83afa29e3f] [ros2_control_node-1] #12 https://github.com/ros-controls/ros2_controllers/issues/12 Source "../sysdeps/nptl/libc_start_call_main.h", line 58, in __libc_start_call_main [0x7a83afa29d8f] [ros2_control_node-1] #11 https://github.com/ros-controls/ros2_controllers/pull/11 Object "/opt/ros/iron/lib/controller_manager/ros2_control_node", at 0x59eda89749ef, in main [ros2_control_node-1] #10 https://github.com/ros-controls/ros2_controllers/issues/10 Object "/opt/ros/iron/lib/librclcpp.so", at 0x7a83b030ea84, in rclcpp::executors::MultiThreadedExecutor::spin() [ros2_control_node-1] #9 https://github.com/ros-controls/ros2_controllers/issues/9 Object "/opt/ros/iron/lib/librclcpp.so", at 0x7a83b030e621, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long) [ros2_control_node-1] #8 https://github.com/ros-controls/ros2_controllers/pull/8 Object "/opt/ros/iron/lib/librclcpp.so", at 0x7a83b030585e, in rclcpp::Executor::get_next_ready_executable_from_map(rclcpp::AnyExecutable&, std::map<std::weak_ptrrclcpp::CallbackGroup, std::weak_ptrrclcpp::node_interfaces::NodeBaseInterface, std::owner_less, std::allocator<std::pair<std::weak_ptrrclcpp::CallbackGroup const, std::weak_ptrrclcpp::node_interfaces::NodeBaseInterface > > > const&) [ros2_control_node-1] #7 https://github.com/ros-controls/ros2_controllers/pull/7 Object "/opt/ros/iron/lib/librclcpp_action.so", at 0x7a83a0259785, in [ros2_control_node-1] #6 https://github.com/ros-controls/ros2_controllers/pull/6 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afeae4d7, in cxa_throw [ros2_control_node-1] #5 https://github.com/ros-controls/ros2_controllers/pull/5 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afeae276, in std::terminate() [ros2_control_node-1] #4 https://github.com/ros-controls/ros2_controllers/issues/4 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afeae20b, in [ros2_control_node-1] #3 https://github.com/ros-controls/ros2_controllers/pull/3 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x7a83afea2b9d, in [ros2_control_node-1] #2 https://github.com/ros-controls/ros2_controllers/issues/2 Source "./stdlib/abort.c", line 79, in abort [0x7a83afa287f2] [ros2_control_node-1] #1 https://github.com/ros-controls/ros2_controllers/pull/1 Source "../sysdeps/posix/raise.c", line 26, in raise [0x7a83afa42475] [ros2_control_node-1] #0 | Source "./nptl/pthread_kill.c", line 89, in pthread_kill_internal [ros2_control_node-1] | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation [ros2_control_node-1] Source "./nptl/pthread_kill.c", line 44, in pthread_kill [0x7a83afa969fc] [ros2_control_node-1] Aborted (Signal sent by tkill() 15536 1000)

— Reply to this email directly, view it on GitHub https://github.com/ros-controls/ros2_controllers/issues/981#issuecomment-2007841610, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA24PYOQJN5MO4XSL3GRCYLYZB6EVAVCNFSM6AAAAABB7NFVSGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMBXHA2DCNRRGA . You are receiving this because you were mentioned.Message ID: @.***>

SyllogismRXS commented 2 months ago

I'm getting this error when running the latest joint_trajectory_controller in iron at random times. Haven't narrowed down the source of the problem yet.

bmagyar commented 2 months ago

Is it the same error with the same prints in the console or you are also getting a runtime error? Please copy logs if they are different

On Mon, 29 Apr 2024, 18:47 Kevin DeMarco, @.***> wrote:

I'm getting this error when running the latest joint_trajectory_controller in iron at random times. Haven't narrowed down the source of the problem yet.

— Reply to this email directly, view it on GitHub https://github.com/ros-controls/ros2_controllers/issues/981#issuecomment-2083308752, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA24PYIN2PO7XR3XRLMZSVDY72BUDAVCNFSM6AAAAABB7NFVSGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOBTGMYDQNZVGI . You are receiving this because you were mentioned.Message ID: @.***>

christophfroehlich commented 2 months ago

I've seen the Taking data from action server but nothing is ready exception already a couple of times in the CI, but this was some time ago. And we haven't changed anything regarding JTC or realtime_tools on iron lately afaik.

SyllogismRXS commented 2 months ago

I've been using humble until about 3 weeks ago and I've been seeing this random error since then quite a few times. I've added a few extra log outputs to try to narrow down the problem. This occurs when I'm using rviz2/ MoveIt2 to send desired poses. Here is the log output:


[ros2_control_node-1] [INFO] [1714416894.528266646] [joint_trajectory_controller]: Starting goal_accepted_callback!
[ros2_control_node-1] [INFO] [1714416894.528396324] [joint_trajectory_controller]: Completed goal_accepted_callback!
[ros2_control_node-1] terminate called after throwing an instance of 'std::runtime_error'
[ros2_control_node-1]   what():  Taking data from action server but nothing is ready
[ros2_control_node-1] Stack trace (most recent call last) in thread 134319:
[ros2_control_node-1] #13   Object "", at 0xffffffffffffffff, in
[ros2_control_node-1] #12   Source "../sysdeps/unix/sysv/linux/x86_64/clone3.S", line 81, in __clone3 [0x71f84caa784f]
[ros2_control_node-1] #11   Source "./nptl/pthread_create.c", line 442, in start_thread [0x71f84ca15ac2]
[ros2_control_node-1] #10   Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x71f84cca6252, in
[ros2_control_node-1] #9    Object "/opt/ros/iron/lib/librclcpp.so", at 0x71f84cf25571, in rclcpp::executors::MultiThreadedExecutor::run(unsigned long)
[ros2_control_node-1] #8    Object "/opt/ros/iron/lib/librclcpp.so", at 0x71f84cf1c7ae, in rclcpp::Executor::get_next_ready_executable_from_map(rclcpp::AnyExecutable&, std::map<std::weak_ptr<rclcpp::CallbackGroup>, std::weak_ptr<rclcpp::node_interfaces::NodeBaseInterface>, std::owner_less<std::weak_ptr<rclcpp::CallbackGroup> >, std::allocator<std::pair<std::weak_ptr<rclcpp::CallbackGroup> const, std::weak_ptr<rclcpp::node_interfaces::NodeBaseInterface> > > > const&)
[ros2_control_node-1] #7    Object "/opt/ros/iron/lib/librclcpp_action.so", at 0x71f81c047785, in
[ros2_control_node-1] #6    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x71f84cc784d7, in __cxa_throw
[ros2_control_node-1] #5    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x71f84cc78276, in std::terminate()
[ros2_control_node-1] #4    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x71f84cc7820b, in
[ros2_control_node-1] #3    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30", at 0x71f84cc6cb9d, in
[ros2_control_node-1] #2    Source "./stdlib/abort.c", line 79, in abort [0x71f84c9a97f2]
[ros2_control_node-1] #1    Source "../sysdeps/posix/raise.c", line 26, in raise [0x71f84c9c3475]
[ros2_control_node-1] #0  | Source "./nptl/pthread_kill.c", line 89, in __pthread_kill_internal
[ros2_control_node-1]     | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation
[ros2_control_node-1]       Source "./nptl/pthread_kill.c", line 44, in __pthread_kill [0x71f84ca179fc]
[ros2_control_node-1] Aborted (Signal sent by tkill() 134093 1001)

It fails once about every ten times that I send a desired pose. Do you have a recommended way of finding the bug? I tried building debug symbols and running in gdb / gdb server, but the output isn't very helpful. I'm thinking about just building the joint_trajectory_controller from humble and seeing I get the same error.

saikishor commented 2 months ago

Hello!

It seems to be happening from the core packages side. This recently merged PR : https://github.com/ros2/rclcpp/pull/2495 might fix it.

Best Regards, Sai

firesurfer commented 1 month ago

Please note that the patch will likely not be backported to Iron as it introduces an ABI change.
It is still discussed here: https://github.com/ros2/rclcpp/issues/2242

EDIT: I created a PR for an Iron backport https://github.com/ros2/rclcpp/pull/2530