fzi-forschungszentrum-informatik / cartesian_controllers

A set of Cartesian controllers for the ROS1 and ROS2-control framework.
BSD 3-Clause "New" or "Revised" License

Safe Setup for UR10e Hardware with Cartesian Controllers? #210

Open zirogravity opened 2 months ago

zirogravity commented 2 months ago

Problem description: Erratic robot motion before immediate faulting when switching from default STC to Cartesian Controller

Software versions:
Commit: dc803377752bcbc175004a549485ef011ec4a952
OS: Linux 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64
ROS Version: ROS2 Humble

Reproduce: Please see this post: UR Forum Post

Expected behavior: The robot does not fault when switching controllers and is ready for testing the Cartesian controllers via RQT.

Update: I am not sure exactly what I did differently, but I managed to get "a reaction" from URSim. It is unclear what the messages mean, if anything. I will report back as I learn more.

Screenshot from 2024-09-24 10-36-40

Screenshot from 2024-09-24 10-37-20

stefanscherzinger commented 1 month ago

Thanks @zirogravity

Any new insights?

edtriccorp commented 1 month ago

Hi @stefanscherzinger

Perhaps a single new insight, although I do not fully understand it yet.

If I set the speed slider on the URSim/UR10e below 40%, I can get the controllers to work, but painfully slowly. This seems to be true no matter how small error_scale is set. I have the setup working with the teleoperation spacenav launch and the Cartesian force controller, but NumPy version issues break the Cartesian Pose launch. I was hoping to use the spacenav to log data with rosbag, but I am still sorting out which topics matter for investigating the problem.

zirogravity commented 1 month ago

Hi @stefanscherzinger

Perhaps this video will better describe the behavior I am seeing: https://youtu.be/P5t74CZsDwE

Also, if I am not mistaken, moving the robot via the Polyscope UI seems to cause the controller to lose track of the robot state. Reactivating external control works only if the robot has not been moved via the Polyscope UI. Otherwise, the Cartesian controllers immediately fault the robot, as seen in the video between a Polyscope move and a replay of the external control program.

  1. The video shows spacenav manual control via the Cartesian Compliance Controller.
  2. In the first part of the clip, the Cartesian Compliance Controller is launched and the spacenav controls the motion. This only works if the speed slider is below 40%.
  3. The next part of the clip shows the speed slider being raised above 40%, followed by the fault.
  4. Taking the speed slider back below 40% and replaying external control returns the robot to working spacenav control.
  5. If Polyscope is used to move the robot after the fault, re-engaging the Cartesian controller faults again every time, and only a full re-launch recovers.

zirogravity commented 4 weeks ago

Hi @stefanscherzinger

I am unsure whether this is a related clue or a separate issue. With an active compliance controller, this screenshot shows:

  1. ros2 topic echo /target_frame (CLI)
  2. /target_frame via RQT
  3. /target_frame via plotjuggler

The RQT values do not update, and plotjuggler shows additional Euler angles. The ros2 CLI output differs from plotjuggler in that ros2 topic echo does not show Euler angles, but the corresponding position and orientation values agree, and the CLI output matches RQT.

Update:

I noticed that "frame_id" is blank in the screenshot below. It seems the "base_link" part of the header gets dropped. I am not sure why or when, but on a fresh start the frame_id in the header is populated with base_link. When the frame ID is blank, the controller reports something like "Received command in wrong frame, expected base_link but received ____".

Screenshot from 2024-10-29 17-03-30

LAYERED-pierrechass commented 3 weeks ago

EDIT: My bad, you can probably ignore this. It was happening when using the Joint Based Controller from the UR ROS Driver, and restarting URSim solved the problem.

Hello, I observe the same error with my UR10e using the Compliance Controller with the ROS1 Noetic version. However, it doesn't seem to be linked to controller switching (I don't do any) but rather to the controller or the position of the robot.

It worked until a few minutes ago. It is hard to tell exactly what happened in between; I just jogged the robot around using URSim. I also observed similar behavior in the issue mentioned above using a pose-based Cartesian controller, but I don't know whether the Universal Robots ROS Driver uses your controller implementation or another one.

Let me know if you need more details or what I can do to help.

songwookim commented 1 week ago

@LAYERED-pierrechass Did you solve this problem? I get the same error. The scaled_joint_trajectory_controller in 'ur_ros2_controller' works, but the other controllers do not.

Should I use URSim?

# terminal 1
ros2 launch cartesian_controllers_universal_robots robot.launch.py  ur_type:=ur5e robot_ip:=192.168.1.1

# terminal 2
ros2 topic echo /geometry_msgs/msg/PoseStamped # get current_x,y,z and current_orientation

# terminal 3
ros2 control switch_controllers --activate cartesian_motion_controller --deactivate scaled_joint_trajectory_controller 

# current_x, current_y, current_z are placeholders: substitute the numeric values read in terminal 2
ros2 topic pub /target_frame geometry_msgs/msg/PoseStamped "{header: {frame_id: 'base_link'}, pose: {position: {x: current_x+0.1, y: current_y+0.1, z: current_z+0.1}, orientation: {x: 0.0, y: 0.0, z: 0.0, w: 1.0}}}"
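
The publish command above writes the target as `current_x+0.1`, which is a placeholder rather than a number, and an earlier comment in this thread reports a rejected command when `frame_id` is blank. The following is a minimal sketch, in Python, of building the message payload with numeric values and a non-empty `frame_id`. The helper name `pose_stamped_yaml` and the `current` pose values are hypothetical and only serve as an illustration; the frame name `base_link` and the identity orientation are taken from the command above.

```python
def pose_stamped_yaml(frame_id, position, orientation=(0.0, 0.0, 0.0, 1.0)):
    """Build the YAML payload string for a geometry_msgs/msg/PoseStamped message.

    Hypothetical helper for illustration; it only formats text and does not
    talk to ROS.
    """
    if not frame_id:
        # A blank frame_id was reported earlier in this thread as a cause of rejected commands.
        raise ValueError("frame_id must not be empty")
    x, y, z = (float(v) for v in position)
    qx, qy, qz, qw = (float(v) for v in orientation)
    return (
        f"{{header: {{frame_id: '{frame_id}'}}, "
        f"pose: {{position: {{x: {x}, y: {y}, z: {z}}}, "
        f"orientation: {{x: {qx}, y: {qy}, z: {qz}, w: {qw}}}}}}}"
    )


# Hypothetical current end-effector position in metres (read it from the robot in practice).
current = (0.4, 0.1, 0.5)
# Offset each axis by 0.1 m, as in the command above; round to avoid float noise in the text.
target = tuple(round(c + 0.1, 6) for c in current)

payload = pose_stamped_yaml("base_link", target)
print(payload)
```

The printed string can be pasted as the last argument of `ros2 topic pub /target_frame geometry_msgs/msg/PoseStamped "..."`. Whether a 0.1 m step on every axis is a safe target depends on the robot's current pose and should be checked in simulation first.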

My work environment with UR5e is below.

edtriccorp commented 1 week ago

Hello,

I do not know why I did not think of checking the full Polyscope logs earlier, but today I did just that. These logs are from URSim running Polyscope 5.17, the same version we have on the robot, and I have been seeing consistent behavior between URSim and the UR10e hardware.

******** Log start (2024-11-15 18:25:44) ********

3.5 :: 0000d00h00m00.000s :: 2024-11-15 18:25:48.844 :: -5 :: C0A0:0 :: null :: 1 :: 5.17.2 s/n: 20195299999 : UR10 ::  :: null
3.5 :: 0000d00h00m00.000s :: 2024-11-15 18:25:49.832 :: -5 :: C0A0:7 :: null :: 1 ::  :: Connected to Controller :: null
3.5 :: 0000d00h00m00.000s :: 2024-11-15 18:25:50.098 :: -2 :: C0A0:3 :: null :: 1 :: URControl 5.17.2.0 ::  :: null
3.5 :: 0000d00h00m00.000s :: 2024-11-15 18:25:50.098 :: -2 :: C0A0:12 :: null :: 1 :: 0.0.0: 0.0.0 ::  :: null
3.5 :: 0000d00h00m00.000s :: 2024-11-15 18:25:50.134 :: -5 :: C0A0:7 :: null :: 1 ::  :: Safety checksum changed to: 00C9B113 :: null
3.5 :: 0000d00h00m26.280s :: 2024-11-15 18:26:11.735 :: -3 :: C0A0:7 :: null :: 1 :: textmsg :: Program textmsg started :: null
3.5 :: 0000d00h00m26.284s :: 2024-11-15 18:26:11.735 :: -3 :: C0A0:0 :: null :: 1 :: urscript_interface connected ::  :: null
3.5 :: 0000d00h00m26.288s :: 2024-11-15 18:26:11.735 :: -3 :: C0A0:7 :: null :: 1 :: textmsg :: Program textmsg stopped :: null
3.5 :: 0000d00h00m35.637s :: 2024-11-15 18:26:21.410 :: -5 :: C0A0:7 :: null :: 1 ::  :: Program <unnamed> starting... (Unsaved) :: null
3.5 :: 0000d00h00m35.688s :: 2024-11-15 18:26:21.462 :: -3 :: C0A0:7 :: null :: 1 :: unnamed :: Program <unnamed> started :: null
3.5 :: 0000d00h00m35.692s :: 2024-11-15 18:26:21.463 :: -3 :: C0A0:0 :: null :: 1 :: ExternalControl: steptime=0.002 ::  :: null
3.5 :: 0000d00h00m35.698s :: 2024-11-15 18:26:21.463 :: -3 :: C0A0:0 :: null :: 1 :: ExternalControl: External control active ::  :: null
3.5 :: 0000d00h00m35.710s :: 2024-11-15 18:26:21.463 :: -3 :: C0A0:0 :: null :: 1 :: ExternalControl: Starting servo thread ::  :: null
3.5 :: 0000d00h00m56.882s :: 2024-11-15 18:26:43.395 :: -3 :: C0A0:0 :: null :: 1 :: ExternalControl: servo thread ended ::  :: null
3.5 :: 0000d00h01m04.858s :: 2024-11-15 18:26:51.610 :: -3 :: C0A0:0 :: null :: 1 :: ExternalControl: Starting servo thread ::  :: null
3.5 :: 0000d00h04m39.524s :: 2024-11-15 18:30:33.254 :: -3 :: C0A0:0 :: null :: 1 :: Velocity 20.577 required to reach the received target [-1.594235, -1.715054, -2.231413, -0.802429, 1.595039, -0.024669] within 0.002 seconds is exceeding the joint velocity limits. Ignoring commands until a valid command is received. ::  :: null
3.5 :: 0000d00h04m39.524s :: 2024-11-15 18:30:33.259 :: -3 :: C174A2:6 :: null :: 2 ::  ::  :: 0
3.5 :: 0000d00h04m39.534s :: 2024-11-15 18:30:33.260 :: -3 :: C218A1:6 :: null :: 2 ::  ::  :: servoThread
3.5 :: 0000d00h04m39.978s :: 2024-11-15 18:30:33.754 :: -3 :: C0A0:0 :: null :: 1 :: Velocity 1079061265960517780674978211771199010412960408001592546096313184391289769924262625280 required to reach the received target [-2158122531921035460250156421727498790824614239679570066053334784255432735709986816, -40188274169231131068414796569196668 ::  :: null
3.5 :: 0000d00h04m40.192s :: 2024-11-15 18:30:33.960 :: -3 :: C0A0:10 :: null :: 1 :: string_length_too_long: required to reach t: ::  ::
3.5 :: 0000d00h04m40.196s :: 2024-11-15 18:30:33.964 :: -3 :: C0A0:7 :: null :: 1 :: unnamed :: Program <unnamed> stopped :: null

Please note the last several lines, where the received (computed) velocity and target values are unrealistic. Any recommendations on how to troubleshoot this through the software stack? In other words, is this coming from the UR ROS2 Driver or from the Cartesian controller code base?

EDIT: After reading through the ROS1 issue that @LAYERED-pierrechass linked above, and its possible relationship to this one, I am also wondering whether this could be rooted in the URCap/external control program or in the UR controller code base itself. Help isolating which part of the code I should be looking at would be appreciated.

It looks as if switching controllers does not handle initial conditions. I would expect that no command is sent, or left lingering in the command queue, during a controller switch. For example, why does the controller try to command [-1.594235, -1.715054, -2.231413, -0.802429, 1.595039, -0.024669] when nothing has asked it to go there?
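
To put the first velocity message in the log above into perspective, the arithmetic below works out how large a single-step joint jump it implies. This is a rough sketch: it assumes the reported velocity is in rad/s (the log line does not state a unit) and that the velocity is simply the joint displacement divided by the 0.002 s step time named in the message. The function name `implied_joint_step` is hypothetical.

```python
import math


def implied_joint_step(velocity_rad_s: float, steptime_s: float) -> float:
    """Joint displacement (rad) implied by a required velocity over one servo step.

    Assumes velocity = displacement / steptime, with the velocity in rad/s.
    """
    return velocity_rad_s * steptime_s


# Values taken from the Polyscope log line: "Velocity 20.577 ... within 0.002 seconds".
step = implied_joint_step(20.577, 0.002)
print(f"{step:.4f} rad ({math.degrees(step):.2f} deg) requested within one 2 ms step")
```

Under these assumptions the first rejected command asked for a jump of roughly 0.04 rad in a single step, which is small in absolute terms but far too large for one 2 ms servo cycle. The astronomically large values in the later message are a different matter and look like an invalid or uninitialized target rather than a merely oversized step.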

edtriccorp commented 1 week ago

Hello all,

I am just sharing and documenting more findings here. This is a link to a screencast showing that moving the speed slider seems to induce some kind of command. In the video I show Polyscope alongside plotjuggler plotting the pose variables. The jump seems to be related to PI, but I would expect that moving the speed slider does not cause any command or motion.

edtriccorp commented 4 days ago

Hello @stefanscherzinger

Please close this issue, and please consider the following, which mainly concerns the "user experience" of error reporting and logging in the UR ROS2 Driver and ROS2 Control.

  1. This issue appears to be caused by launching the UR ROS2 driver without explicitly setting the proper update_rate for the robot. From what I gather, ros2_control defaults to a 100Hz update rate.
  2. Unfortunately, the error messages, ROS2 logs, and info windows from Polyscope, the UR ROS2 driver, and ros2_control are not very informative or intuitive, at least for someone just getting started with ROS2 workflows. The primary error messages I saw are summarized as follows:

Polyscope:

Velocity 1079061265960517780674978211771199010412960408001592546096313184391289769924262625280 required to reach the received target [-2158122531921035460250156421727498790824614239679570066053334784255432735709986816, -40188274169231131068414796569196668

UR ROS2 / ROS2 Driver:

[ERROR] [1732227629.977068668] [tolerances]: State tolerances failed for joint 3:
[ERROR] [1732227629.977091708] [tolerances]: Position Error: 0.203323, Position Tolerance: 0.200000

None of these errors suggest that the behavior comes from a "default" update rate. In my case, I set up the launch and config files to point to the latest ones from the UR ROS2 driver instead of using the minimal UR ROS2 example linked in this repo. The minimal example does not emphasize the update rate, which in my case needed to be 500Hz rather than the default 100Hz.
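
The mismatch described above can be made concrete with a little arithmetic. The sketch below compares the control loop period with the robot's servo step time of 0.002 s, as reported by the `ExternalControl: steptime=0.002` line in the Polyscope log earlier in this thread. It is only an illustration of the timing ratio, not a description of how the driver or the controllers handle the mismatch internally, and the function name `servo_steps_per_update` is hypothetical. In ros2_control the loop rate is normally set through the `update_rate` parameter of the `controller_manager` node.

```python
def servo_steps_per_update(controller_rate_hz: float, robot_steptime_s: float) -> float:
    """Number of robot servo steps that elapse during one controller update."""
    controller_period_s = 1.0 / controller_rate_hz
    return controller_period_s / robot_steptime_s


ROBOT_STEPTIME_S = 0.002  # from the Polyscope log: "ExternalControl: steptime=0.002"

ratio_default = servo_steps_per_update(100.0, ROBOT_STEPTIME_S)  # 100 Hz default rate
ratio_matched = servo_steps_per_update(500.0, ROBOT_STEPTIME_S)  # rate that worked here

print(f"100 Hz: one new target every {ratio_default:.0f} servo steps")
print(f"500 Hz: one new target every {ratio_matched:.0f} servo step")
```

With the 100Hz default, each new target arrives only once per five servo steps, so a displacement planned for 10 ms may be interpreted against a 2 ms step. At 500Hz the two periods line up one to one, which is consistent with the fix reported above.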