jacknlliu / development-issues

Some issues from my development
MIT License
7 stars 1 forks source link

joint trajectory controller error cause gazebo segmentation fault #22

Open jacknlliu opened 6 years ago

jacknlliu commented 6 years ago

When using joint trajectory controller with gazebo, an error occur

[ERROR] [1517823475.098197028, 20202.025000000]: To transition to a cancelled state, the goal must be in a pending, recalling, active, or preempting state, it is currently in state: 2

Segmentation fault (core dumped)

[gazebo-1] process has died [pid 1072, exit code 139, cmd /opt/ros/kinetic/lib/gazebo_ros/gzserver -e ode /data/assembler/assembler-dynamic/src/assembler_description/world/myworkcell-with-base-fix-physics.world __name:=gazebo __log:=/home/ros/.ros/log/93541f82-0a23-11e8-aa83-06d51cc02108/gazebo-1.log].
log file: /home/ros/.ros/log/93541f82-0a23-11e8-aa83-06d51cc02108/gazebo-1*.log

Reference

bmagyar commented 6 years ago

Could you add some more details please?

Which operating system? Which ROS version? Does it happen with the latest packages? Which robot are you running? Is it reproducible? If yes, how? Could you try running Gazebo with debug mode? May give you more clues. Are you sure that print comes from the joint_trajectory_controller? Can try looking at rqt_console and that allows you to click on the log line and see the origin of it.

jacknlliu commented 6 years ago

@bmagyar I got more details via gazebo debug mode

[Wrn] [Publisher.cc:140] Queue limit reached for topic /gazebo/default/pose/local/info, deleting message. This warning is printed only once.
[ERROR] [1518053045.869658665, 830.998000000]: To transition to a cancelled state, the goal must be in a pending, recalling, active, or preempting state, it is currently in state: 2

Thread 37 "gzserver" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff39e4b700 (LWP 420)]
0x00007fff16780841 in joint_trajectory_controller::JointTrajectoryController<trajectory_interface::QuinticSplineSegment<double>, hardware_interface::PositionJointInterface>::update(ros::Time const&, ros::Duration const&) ()
   from /opt/ros/kinetic/lib//libjoint_trajectory_controller.so

Information about my system

Ubuntu 16.04
ROS: kinetic
ros-kinetic-joint-trajectory-controller: 0.13.1-0xenial-20171130-120616-0800
Gazebo: 7.0.0+dfsg-2
ros-kinetic-gazebo-ros: 2.5.13-0xenial-20171117-053720-0800
ros-kinetic-gazebo-ros-control: 2.5.13-0xenial-20171130-121344-0800
ros-kinetic-gazebo-ros-pkgs: 2.5.13-0xenial-20171117-091119-0800

I used UR robot. It's reproducible, just sending joint random trajectory to robot 1-10 hours.

jacknlliu commented 6 years ago

more details from debug mode

[ERROR] [1518053045.869658665, 830.998000000]: To transition to a cancelled state, the goal must be in a pending, recalling, active, or preempting state, it is currently in state: 2

Thread 37 "gzserver" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff39e4b700 (LWP 420)]
0x00007fff16780841 in joint_trajectory_controller::JointTrajectoryController<trajectory_interface::QuinticSplineSegment<double>, hardware_interface::PositionJointInterface>::update(ros::Time const&, ros::Duration const&) ()
   from /opt/ros/kinetic/lib//libjoint_trajectory_controller.so
(gdb) bt
#0  0x00007fff16780841 in joint_trajectory_controller::JointTrajectoryController<trajectory_interface::QuinticSplineSegment<double>, hardware_interface::PositionJointInterface>::update(ros::Time const&, ros::Duration const&) ()
   from /opt/ros/kinetic/lib//libjoint_trajectory_controller.so
#1  0x00007fff25537837 in controller_manager::ControllerManager::update(ros::Time const&, ros::Duration const&, bool) ()
   from /opt/ros/kinetic/lib/libcontroller_manager.so
#2  0x00007fff25797e15 in gazebo_ros_control::GazeboRosControlPlugin::Update()
    () from /opt/ros/kinetic/lib/libgazebo_ros_control.so
#3  0x00007ffff62ca367 in boost::function1<void, gazebo::common::UpdateInfo const&>::operator() (a0=..., this=<optimized out>)
    at /usr/include/boost/function/function_template.hpp:773
#4  gazebo::event::EventT<void (gazebo::common::UpdateInfo const&)>::Signal<gazebo::common::UpdateInfo>(gazebo::common::UpdateInfo const&) (_p=..., 
    this=<optimized out>)
    at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/common/Event.hh:380
#5  gazebo::event::EventT<void (gazebo::common::UpdateInfo const&)>::operator()<gazebo::common::UpdateInfo>(gazebo::common::UpdateInfo const&) (_p=..., 
    this=<optimized out>)
    at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/common/Event.hh:216
#6  gazebo::physics::World::Update (this=this@entry=0x14e0610)
    at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/physics/World.cc:740
#7  0x00007ffff62d89af in gazebo::physics::World::Step (
---Type <return> to continue, or q <return> to quit---
    this=this@entry=0x14e0610)
    at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/physics/World.cc:672
#8  0x00007ffff62d8e25 in gazebo::physics::World::RunLoop (this=0x14e0610)
    at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/physics/World.cc:481
#9  0x00007ffff40d85d5 in ?? ()
   from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
#10 0x00007ffff65ed6ba in start_thread ()
   from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007ffff6bef3dd in clone () from /lib/x86_64-linux-gnu/libc.so.6
jacknlliu commented 6 years ago

I think there are too much and too frequent(100HZ) joint trajectory replacement that caused joint trajectory controller failed.

[DEBUG] [1518068038.616189765, 9128.736000000]: Setting status to canceled on goal, id: /gym_ros_node-69741-9128.613, stamp: 9128.61
[DEBUG] [1518068038.616201253, 9128.736000000]: Publishing result for goal with id: /gym_ros_node-69741-9128.613 and stamp: 9128.61
[DEBUG] [1518068038.616248701, 9128.736000000]: Accepting goal, id: /gym_ros_node-69742-9128.735, stamp: 9128.74
[DEBUG] [1518068038.716752030, 9128.837000000]: Publishing feedback for goal, id: /gym_ros_node-69742-9128.735, stamp: 9128.74
[DEBUG] [1518068038.716778184, 9128.837000000]: Publishing feedback for goal with id: /gym_ros_node-69742-9128.735 and stamp: 9128.74
[DEBUG] [1518068038.735356354, 9128.855000000]: The action server has received a new goal request
[DEBUG] [1518068038.735439233, 9128.855000000]: Received new action goal
[DEBUG] [1518068038.735527017, 9128.855000000]: Figuring out new trajectory starting at time 9128.856
[DEBUG] [1518068038.735572017, 9128.855000000]: Using alternate time base. In it, the new trajectory starts at time 647949.864
[DEBUG] [1518068038.735627076, 9128.855000000]: Trajectory of joint elbow_jointhas 2 segments:
- 1 segment(s) will still be executed from previous trajectory.
- 1 segment added for transitioning between the current trajectory and first point of the input message.
[DEBUG] [1518068038.735654571, 9128.855000000]: Trajectory of joint shoulder_lift_jointhas 2 segments:
- 1 segment(s) will still be executed from previous trajectory.
- 1 segment added for transitioning between the current trajectory and first point of the input message.
[DEBUG] [1518068038.735674982, 9128.855000000]: Trajectory of joint shoulder_pan_jointhas 2 segments:
- 1 segment(s) will still be executed from previous trajectory.
- 1 segment added for transitioning between the current trajectory and first point of the input message.
[DEBUG] [1518068038.735693930, 9128.855000000]: Trajectory of joint wrist_1_jointhas 2 segments:
- 1 segment(s) will still be executed from previous trajectory.
- 1 segment added for transitioning between the current trajectory and first point of the input message.
[DEBUG] [1518068038.735712923, 9128.855000000]: Trajectory of joint wrist_2_jointhas 2 segments:
- 1 segment(s) will still be executed from previous trajectory.
- 1 segment added for transitioning between the current trajectory and first point of the input message.
[DEBUG] [1518068038.735731597, 9128.855000000]: Trajectory of joint wrist_3_jointhas 2 segments:
- 1 segment(s) will still be executed from previous trajectory.
- 1 segment added for transitioning between the current trajectory and first point of the input message.
[DEBUG] [1518068038.735761877, 9128.855000000]: Setting status to canceled on goal, id: /gym_ros_node-69742-9128.735, stamp: 9128.74
[DEBUG] [1518068038.735779706, 9128.855000000]: Publishing result for goal with id: /gym_ros_node-69742-9128.735 and stamp: 9128.74
[DEBUG] [1518068038.736060106, 9128.856000000]: Accepting goal, id: /gym_ros_node-69743-9128.855, stamp: 9128.85

Thread 37 "gzserver" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff39e4b700 (LWP 1008)]
0x00007fff1a4ad80b in joint_trajectory_controller::JointTrajectoryController<trajectory_interface::QuinticSplineSegment<double>, hardware_interface::PositionJointInterface>::update(ros::Time const&, ros::Duration const&) () from /opt/ros/kinetic/lib//libjoint_trajectory_controller.so
(gdb) bt
#0  0x00007fff1a4ad80b in joint_trajectory_controller::JointTrajectoryController<trajectory_interface::QuinticSplineSegment<double>, hardware_interface::PositionJointInterface>::update(ros::Time const&, ros::Duration const&) () from /opt/ros/kinetic/lib//libjoint_trajectory_controller.so
#1  0x00007fff1c8c1837 in controller_manager::ControllerManager::update(ros::Time const&, ros::Duration const&, bool) () from /opt/ros/kinetic/lib/libcontroller_manager.so
#2  0x00007fff1cb21e15 in gazebo_ros_control::GazeboRosControlPlugin::Update() () from /opt/ros/kinetic/lib/libgazebo_ros_control.so
#3  0x00007ffff62ca367 in boost::function1<void, gazebo::common::UpdateInfo const&>::operator() (a0=..., this=<optimized out>) at /usr/include/boost/function/function_template.hpp:773
#4  gazebo::event::EventT<void (gazebo::common::UpdateInfo const&)>::Signal<gazebo::common::UpdateInfo>(gazebo::common::UpdateInfo const&) (_p=..., this=<optimized out>)
    at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/common/Event.hh:380
#5  gazebo::event::EventT<void (gazebo::common::UpdateInfo const&)>::operator()<gazebo::common::UpdateInfo>(gazebo::common::UpdateInfo const&) (_p=..., this=<optimized out>)
    at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/common/Event.hh:216
#6  gazebo::physics::World::Update (this=this@entry=0x14e09c0) at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/physics/World.cc:740
#7  0x00007ffff62d89af in gazebo::physics::World::Step (this=this@entry=0x14e09c0) at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/physics/World.cc:672
#8  0x00007ffff62d8e25 in gazebo::physics::World::RunLoop (this=0x14e09c0) at /build/gazebo-nhSAPd/gazebo-7.0.0+dfsg/gazebo/physics/World.cc:481
#9  0x00007ffff40d85d5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
#10 0x00007ffff65ed6ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007ffff6bef3dd in clone () from /lib/x86_64-linux-gnu/libc.so.6

Some more details from rqt_console about /gazebo:

Node: /gazebo
Time: 00:10:13.593000000 (1970-01-01)
Severity: Debug
Published Topics: /3dcamera/depth/camera_info, /3dcamera/depth/image_raw, /3dcamera/depth/points, /3dcamera/parameter_descriptions, /3dcamera/parameter_updates, /3dcamera/rgb/camera_info, /3dcamera/rgb/image_raw, /3dcamera/rgb/image_raw/compressed, /3dcamera/rgb/image_raw/compressed/parameter_descriptions, /3dcamera/rgb/image_raw/compressed/parameter_updates, /3dcamera/rgb/image_raw/compressedDepth, /3dcamera/rgb/image_raw/compressedDepth/parameter_descriptions, /3dcamera/rgb/image_raw/compressedDepth/parameter_updates, /3dcamera/rgb/image_raw/theora, /3dcamera/rgb/image_raw/theora/parameter_descriptions, /3dcamera/rgb/image_raw/theora/parameter_updates, /arm_gazebo_controller/follow_joint_trajectory/feedback, /arm_gazebo_controller/follow_joint_trajectory/result, /arm_gazebo_controller/follow_joint_trajectory/status, /arm_gazebo_controller/state, /clock, /ft_sensor_topic, /gazebo/link_states, /gazebo/model_states, /gazebo/parameter_descriptions, /gazebo/parameter_updates, /joint_states, /rosout

Figuring out new trajectory starting at time 613.594

Location:
/tmp/binarydeb/ros-kinetic-joint-trajectory-controller-0.13.1/include/joint_trajectory_controller/init_joint_trajectory.h:initJointTrajectory:206

----------------------------------------------------------------------------------------------------

Node: /gazebo
Time: 00:10:13.593000000 (1970-01-01)
Severity: Debug
Published Topics: /3dcamera/depth/camera_info, /3dcamera/depth/image_raw, /3dcamera/depth/points, /3dcamera/parameter_descriptions, /3dcamera/parameter_updates, /3dcamera/rgb/camera_info, /3dcamera/rgb/image_raw, /3dcamera/rgb/image_raw/compressed, /3dcamera/rgb/image_raw/compressed/parameter_descriptions, /3dcamera/rgb/image_raw/compressed/parameter_updates, /3dcamera/rgb/image_raw/compressedDepth, /3dcamera/rgb/image_raw/compressedDepth/parameter_descriptions, /3dcamera/rgb/image_raw/compressedDepth/parameter_updates, /3dcamera/rgb/image_raw/theora, /3dcamera/rgb/image_raw/theora/parameter_descriptions, /3dcamera/rgb/image_raw/theora/parameter_updates, /arm_gazebo_controller/follow_joint_trajectory/feedback, /arm_gazebo_controller/follow_joint_trajectory/result, /arm_gazebo_controller/follow_joint_trajectory/status, /arm_gazebo_controller/state, /clock, /ft_sensor_topic, /gazebo/link_states, /gazebo/model_states, /gazebo/parameter_descriptions, /gazebo/parameter_updates, /joint_states, /rosout

Received new action goal

Location:
/tmp/binarydeb/ros-kinetic-joint-trajectory-controller-0.13.1/include/joint_trajectory_controller/joint_trajectory_controller_impl.h:JointTrajectoryController<SegmentImpl, HardwareInterface>::goalCB:572

----------------------------------------------------------------------------------------------------

Node: /gazebo
Time: 00:10:13.593000000 (1970-01-01)
Severity: Debug
Published Topics: /3dcamera/depth/camera_info, /3dcamera/depth/image_raw, /3dcamera/depth/points, /3dcamera/parameter_descriptions, /3dcamera/parameter_updates, /3dcamera/rgb/camera_info, /3dcamera/rgb/image_raw, /3dcamera/rgb/image_raw/compressed, /3dcamera/rgb/image_raw/compressed/parameter_descriptions, /3dcamera/rgb/image_raw/compressed/parameter_updates, /3dcamera/rgb/image_raw/compressedDepth, /3dcamera/rgb/image_raw/compressedDepth/parameter_descriptions, /3dcamera/rgb/image_raw/compressedDepth/parameter_updates, /3dcamera/rgb/image_raw/theora, /3dcamera/rgb/image_raw/theora/parameter_descriptions, /3dcamera/rgb/image_raw/theora/parameter_updates, /arm_gazebo_controller/follow_joint_trajectory/feedback, /arm_gazebo_controller/follow_joint_trajectory/result, /arm_gazebo_controller/follow_joint_trajectory/status, /arm_gazebo_controller/state, /clock, /ft_sensor_topic, /gazebo/link_states, /gazebo/model_states, /gazebo/parameter_descriptions, /gazebo/parameter_updates, /joint_states, /rosout

The action server has received a new goal request

Location:
/opt/ros/kinetic/include/actionlib/server/action_server_base.h:ActionServerBase<ActionSpec>::goalCallback:212
bmagyar commented 6 years ago

Thanks for the info. Why are you sending commands to the controller at 100Hz? If you want it to follow a trajectory of points, pass those in with properly set up time_from_start fields for nice timing.

Still, I don't remember ever seeing this issue before and there are so many robots running this controller out there...

jacknlliu commented 6 years ago

@bmagyar Thank you very much for your reply for this issue. Actually, I only send a trajectory point to robot every time, my program can't compute a whole trajectory points to do that with time_from_start. Do you have a better suggestion for this case?

jacknlliu commented 6 years ago

@bmagyar do you think that it's a bug from joint trajectory controller no matter how we use it?

bmagyar commented 6 years ago

I frankly have no clue on this one. There are many things one may suspect since the setups is quite complex. My guess is that it's something related to passing data around which causes the computation to fail.

You could try by compiling the ros_control in debug mode and running the experiment again. GDB should serve us with more information then.

jacknlliu commented 6 years ago

I will try it. I hope this will find more things to be improved. Thanks!