ros-industrial / industrial_core

ROS-Industrial core communication packages (http://wiki.ros.org/industrial_core)
153 stars 180 forks source link

Sort-of rebase of 83: traj action aborts on error #271

Open gavanderhoorn opened 3 years ago

gavanderhoorn commented 3 years ago

Sort-of, as it's actually a bit more than what was included in #83.

The proposed changes make the joint_trajectory_action aware of the state of the (OEM) motion server program (as far as it publishes that in the form of RobotStatus messages). In two cases is this information taken into account:

  1. while accepting new goals
  2. while executing already accepted goals

In both cases, the following is monitored / checked:

  1. is an e-stop active?
  2. is there some other error active?
  3. does the motion server report motion_possible is true?

If any of these are not as expected, the goal is rejected, or, if a goal is currently being processed (ie: motion is execution), the goal is aborted and trajectory execution cancelled (as that should now work, with #260 merged (which still requires a cooperating driver of course)).

In all cases, the action result would be INVALID_GOAL, as unfortunately the FollowJointTrajectory action message does not define any really more suitable error codes.

Also in all cases: the reason for aborting or rejecting a goal is described in a human readable message.

When rejecting goals:

[ERROR] [...]: Rejecting goal: controller reported motion not possible (no further information)
[ERROR] [...]: Rejecting goal: controller reported (active) error (OEM code: 11003)
[ERROR] [...]: Rejecting goal: controller reported e-stop

When aborting goals:

[ERROR] [...]: Aborting goal: controller reported motion not possible (no further information)
[ERROR] [...]: Aborting goal: controller reported (active) error (OEM code: 11003)
[ERROR] [...]: Aborting goal: controller reported e-stop

(this is output from fanuc_driver: 11003 is Deadman switch released, which will indeed prevent any motion in manual mode)

While this is certainly not perfect (it doesn't address #265 nor #118), together with #263 and #260 and an (OEM) relay which behaves according to the spec, the UX of industrial_robot_client is improved quite a bit.

Lastly: as previous versions of the JTA did not check anything, misbehaving drivers would still be able to execute motions, even if they didn't properly populate all fields of the RobotStatus messages. In order to still allow usage of such systems, two new parameters were added which can be used to revert the JTA back to its previous behaviour (ie: don't check anything, always forward goals to drivers).

In the current implementation, the default is to disable these new checks. Something to discuss is whether we should make the default to enable the new checks, and users with misbehaving drivers should instead be required to explicitly enable the bw-compatible behaviour.

gavanderhoorn commented 3 years ago

Note: 55034f5 is basically what #83 proposed, but it also actually signals to the relay it should stop sending points.

Subsequent commits rewrite this into what the current implementation does.

gavanderhoorn commented 3 years ago

I've requested @Levi-Armstrong and/or @JeremyZoss review this, as they've been involved with the development of industrial_core in the past.

@simonschmeisser: if you're still using this, and have some time, I'd appreciate your opinion as well.

Note: I do not intent to keep developing all of this. I just wanted to make sure I can get #83 in before releasing a new version of industrial_core into Melodic and Noetic. But since #83 actually only did half of what it was supposed to, I opted to flesh it out a bit more. And with #260 in place, it was actually not that difficult.

Note 2: it's still possible for trajectory relays to keep moving after an e-stop or error state is cleared: there is no way for the ROS side here (so the IRC node) to enforce anything. The streamer will now in all cases send a "traj abort" message, but if the trajectory and state relays (running on the OEM controller) do not cooperate, there isn't much we can do.

fanuc_driver fi will continue to execute the last sent point after resuming the relay program. Then it will stop. fanuc_driver_exp is slightly better, in that it purges its internal motion buffer and actually SKIPs the current segment. Other drivers may behave similarly (abb_driver fi).

gavanderhoorn commented 3 years ago

This probably also addresses #131 partly, as this should not be possible any more (the JTA should reject the goal due to the controller not being in AUTO mode, or at least not reporting that motion_possible is true):

This happens for me ie when the controller is set to "Manual mode" instead of "automatic". The mitsubishi controller will allow for upload but returns an error once I try to start execution of the trajectory.

Of course it will depend on the (OEM) state program to properly report controller state (ie: motion_possible should accurately reflect whether motion is possible or not).

gavanderhoorn commented 3 years ago

Friendly ping.

I don't believe this needs extensive testing.

The main question is whether we want to change the default behaviour (from not checking to checking).

gavanderhoorn commented 3 years ago

@simonschmeisser: thanks for the review.

Would you see an opportunity to test this on any real hw?

I'd be interested to know whether this mitigates what you reported in #131 for instance.

simonschmeisser commented 3 years ago

I'll try to fit in testing that on our Mitsubishi in the evening

gavanderhoorn commented 3 years ago

I actually don't have access to an downloading driver / robot, so I'd be interested to know whether the proposed changes actually work there.

If they do, we'll need to update the .launch file for download drivers as well.

simonschmeisser commented 3 years ago

Thanks for pointing that out, I actually only have access to a downloading robot/driver

gavanderhoorn commented 3 years ago

Thanks for pointing that out, I actually only have access to a downloading robot/driver

I expect the changes to accepting goals to work for both. I believe that should help with #131.

Aborting active goals will work, but whether the OEM server program also cooperates I can't say -- and will depend on its implementation of course.

gavanderhoorn commented 3 years ago

Ok, so I've changed things around a bit:

We can still decide to set consider_status_unknowns_ok to true by default.

We could also say that the default should be false, and drivers which can't function that way should explicitly override the default arg and set it to true.

gavanderhoorn commented 3 years ago

@simonschmeisser: any luck? :)

gavanderhoorn commented 3 years ago

@simonschmeisser: very friendly ping.

simonschmeisser commented 3 years ago

Sorry, I'm on the road (well tracks to be exact ...) but will try to give this some time this Friday

gavanderhoorn commented 3 years ago

@simonschmeisser: friendly reminder.

(only doing this as I tend to forget when I sort-of promise someone I would do something three days later while sitting in a train ;) )

simonschmeisser commented 3 years ago

The immediate feedback when being in manual mode/e-stop/drives not powered is quite nice.

[ERROR] [1625839551.379974859]: Rejecting goal: controller reported motion not possible (no further information)
[ WARN] [1625839551.380040937]: Controller  failed with error INVALID_GOAL: 
[ WARN] [1625839551.380065314]: Controller handle  reports status FAILED
[ INFO] [1625839551.380084931]: Completed trajectory execution with status FAILED ...
[ERROR] [1625839551.380121015]: Error executing plan:  "FAILED"

I wonder if this could or should be more specific?

Edit: e-stop is printed out correctly:

[ERROR] [1625840349.974565108]: Rejecting goal: controller reported e-stop

Otherwise I could not spot any regression with our mitsubishi downloading driver

gavanderhoorn commented 3 years ago

I wonder if this could or should be more specific?

well, we could inspect fields like drives_powered etc, but a) this was already enough work to implement and test (for a package basically in maintenance mode) and b) it gets quite confusing with drivers publishing UNKNOWNs.

Easy enough to do though: update describeRobotStatusMsg(..) in industrial_robot_client/src/joint_trajectory_action.cpp (here).

Right now it singles out e-stops, then checks for any other errors (with potentially an OEM code) and finally it just checks motion_possible.

I'm not sure we can say for all supported systems that fi drives_powered is a requirement for motion_possible -- some robots have an "eco mode", which powers down the drives when they are not used for some time (even with a program running), and will automatically enable them again when needed.

These sort of things made me hesitant to check anything more than motion_possible, as at least for that field, we can rely on the "insight" and understanding of the driver developer to figure out whether motion would be possible and report it accordingly.

I would probably recommend applications (ie: other nodes) to subscribe to robot_status and inspect the messages if further information could help determine a course of action. The IRC doesn't have any way to communicate it, as we always only use INVALID_GOAL.

gavanderhoorn commented 3 years ago

Just noticed this in your comment:

[ WARN] [1625839551.380040937]: Controller  failed with error INVALID_GOAL: 

I would have expected the message passed to setRejected(..) to show up after INVALID_GOAL:, but that doesn't appear to be the case.

To make the formatting for empty controller names somewhat easier to read, I've submitted https://github.com/ros-planning/moveit/pull/2761.

simonschmeisser commented 3 years ago

Yes, fully agree.

One thing I was worried about is that this PR would reject goals in manual mode but it doesn't and that's a good decision as well. We show a confirmation dialog when in manual mode or anything else considered erroneous in our software and it can get annoying during initial bring up. So it's good that we don't need to change parameters to allow movements in manual mode.

gavanderhoorn commented 3 years ago

One thing I was worried about is that this PR would reject goals in manual mode but it doesn't and that's a good decision as well.

well, I didn't do anything special for that. It's all part of "only look at motion_possible".

If the driver reports motion should be possible, we're not going to doubt that by looking at other fields and then trying to be clever.

gavanderhoorn commented 3 years ago

What we could do, is only include drives_powered et al. in the description (ie: the error message), but not in the check.

But I'm not sure that would work either, as it may lead to the wrong conclusion on the human side when they read the error message.

simonschmeisser commented 3 years ago

We would need to set the error_text string field of the FollowJointTrajectory msg for it to be printed there

I'm not sure why we have the empty controller name in all of our controller files ... what do you usually put there? Something like mitsubishi driver?

gavanderhoorn commented 3 years ago

We would need to set the error_text string field of the FollowJointTrajectory msg for it to be printed there

According to the api doc (and the sources), anything you pass to setRejected(..) gets assigned to the error_text. At least that's how I interpreted that.

I'm not sure why we have the empty controller name in all of our controller files ... what do you usually put there? Something like mitsubishi driver?

It's most likely empty because the template in the tutorial lots of people refer to has it set to "".

If you set it to something non-empty, MoveIt will start looking in a different place for your action topics. That's not necessarily a problem though, you'd just have to update your driver to offer the action server there (or remap).

simonschmeisser commented 3 years ago

According to the api doc (and the sources), anything you pass to setRejected(..) gets assigned to the error_text. At least that's how I interpreted that.

But we have two text variables for the error here. The generic one you are setting and the redundant one in http://docs.ros.org/en/api/control_msgs/html/action/FollowJointTrajectory.html

gavanderhoorn commented 3 years ago

O, crap, you're right.

I'd somehow confused those.

Perhaps MoveIt should check both and display them.

gavanderhoorn commented 3 years ago

So with the technical side out of the way, the question becomes: do we want to merge this into melodic-devel and change the default behaviour?

@simonschmeisser @Levi-Armstrong @anyone-with-an-opinion-really?

gavanderhoorn commented 3 years ago

With 0eeb23f0ff387a69deecaab66e9106e2f7af3228 and 0fa8bba32bbc1fc2a2e0bc140119ac4ab6e33817, MoveIt now reports something like:

[ WARN] [1626612647.088687582]: Controller '' failed with error INVALID_GOAL: Rejecting goal: controller reported motion not possible (no further information)
[ WARN] [1626612647.088809223]: Controller handle '' reports status FAILED
[ INFO] [1626612647.088865293]: Completed trajectory execution with status FAILED ...
[ INFO] [1626612647.088960767]: Execution completed: FAILED

The double controller there can be a bit confusing.

I was thinking of maybe adding "OEM" or "robot" there, so we would get:

[ WARN] [1626612647.088687582]: Controller '' failed with error INVALID_GOAL: Rejecting goal: OEM controller reported motion not possible (no further information)

But this would of course not be accurate if the server side would not run on an OEM controller.

simonschmeisser commented 3 years ago

I'm not sure how many people understand what you mean by OEM so I'd rather go for robot controller

gavanderhoorn commented 2 years ago

Ok, to move this forward, I'm going to create a "post noetic" branch.

On that branch, everything discussed here will default to false (ie: ignore_motion_server_error and consider_status_unknowns_ok).

On both Melodic and Noetic, they'll default to true (to maintain the current behaviour).

People still using industrial_robot_client and likely to keep use it could build from source. The binary releases will just play-nice.

simonschmeisser commented 1 year ago

Friendly ping, could be part of ros-o