ros-industrial / motoman

ROS-Industrial Motoman support (http://wiki.ros.org/motoman)

Controller is hanging before reaching trajectory goal #245

Closed ghost closed 5 years ago

ghost commented 5 years ago

Hi, I'm trying to control an MH24 with the DX200 controller over the action server (with MoveIt). Basically it works, but from time to time I get the error: Trajectory start position doesn't match current robot position (3011). I have been looking for a solution for a long time but couldn't fix the problem yet. When I disable the execution_duration_monitoring I can determine that before I get the error, the execution of the plan never stops and no result gets published by the controller. The status of the joint_trajectory_action remains at 1. I get the error with both the MoveIt GUI and the Python moveit_commander. I also tried to use the boneil point streaming but I wasn't able to build that package. I don't know what else I can do to fix this problem. Is this a known bug or are there any known fixes or workarounds? Thanks in advance

gavanderhoorn commented 5 years ago

Basically it works, but from time to time I get the error: Trajectory start position doesn't match current robot position (3011).

This means that your trajectories don't start at the current pose of the robot. The driver is very sensitive to this as it uses incremental motions, not absolute ones. See #219 for some discussion. Summarising: check the in_motion flag from the RobotStatus msg to determine whether you can send new trajectories and make sure to use the JointState at that time to start planning / generate new trajectories from.
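The check summarised above could be sketched roughly as follows. This is plain Python with stand-in objects instead of real ROS messages, so it runs without a ROS install; in an actual node the inputs would come from the driver's RobotStatus and JointState topics. The numeric values mirror industrial_msgs/TriState, and the helper name planning_start_state is made up for illustration.

```python
from collections import namedtuple

# Stand-ins for industrial_msgs/TriState, industrial_msgs/RobotStatus and
# sensor_msgs/JointState (only the fields used here are modelled).
TriState = namedtuple("TriState", ["val"])
RobotStatus = namedtuple("RobotStatus", ["in_motion"])
JointState = namedtuple("JointState", ["name", "position"])

# Values as defined in industrial_msgs/TriState.
TRISTATE_UNKNOWN = -1
TRISTATE_FALSE = 0
TRISTATE_TRUE = 1


def planning_start_state(status, joint_state):
    """Return {joint_name: position} to plan from, or None.

    None means the robot is moving (or its motion state is unknown), so a
    new trajectory should not be generated or sent yet.
    """
    if status.in_motion.val != TRISTATE_FALSE:
        return None
    return dict(zip(joint_state.name, joint_state.position))


if __name__ == "__main__":
    js = JointState(["joint_a", "joint_b"], [0.10, -0.25])  # hypothetical names
    print(planning_start_state(RobotStatus(TriState(TRISTATE_TRUE)), js))
    print(planning_start_state(RobotStatus(TriState(TRISTATE_FALSE)), js))
```

Treating UNKNOWN the same as "in motion" is a conservative choice made for this sketch, not something the driver prescribes.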

It could also be #111, but that would depend on how you created your MH24 model. If you can make it available I can check it for you (but probably not today).

When I disable the execution_duration_monitoring

FYI: the trajectory execution monitor is part of MoveIt, not the driver.

I can determine that before I get the error, the execution of the plan never stops and no result gets published by the controller. The status of the joint_trajectory_action remains at 1.

This depends a bit on which joint_trajectory_action topic you're using, but see #226.

I don't know what else I can do to fix this problem.

If your trajectories do start at the current state and your urdf is correct, we'd need to start debugging. If they don't (or you're not sure), then that would be the first thing to address.

ghost commented 5 years ago

Thank you for the fast response. Sorry I forgot to mention that the first trajectory point is matching the current state exactly. I'm using this robot model

gavanderhoorn commented 5 years ago

I'm using the robot description from: https://github.com/fizyr/yaskawa_mh24_support

Then you might be running into #111, as the joints don't have a numeric part (see here, for instance).

Could you try adding a _1, _2, etc. to the joint names and see if that makes things work? Like this:

https://github.com/ros-industrial/motoman/blob/2beea06438bb57ba4e5e334c2b9fb2178a71c239/motoman_sia10f_support/urdf/sia10f_macro.xacro#L136

It's an ugly work-around, but might be necessary for now.
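A small illustration of why a numeric part can matter, under the assumption (not confirmed in this thread) that the problem in #111 comes down to joint names being ordered alphabetically somewhere along the way. The joint names below are hypothetical and only follow the S/L/U/R/B/T axis lettering:

```python
# Hypothetical joint names in kinematic (base-to-tool) order.
plain = ["joint_s", "joint_l", "joint_u", "joint_r", "joint_b", "joint_t"]

# Same joints with a numeric part inserted: joint_1_s, joint_2_l, ...
numbered = [
    "joint_%d_%s" % (index, name.rsplit("_", 1)[1])
    for index, name in enumerate(plain, start=1)
]


def order_survives_sorting(names):
    """True if sorting the names alphabetically keeps the original order."""
    return sorted(names) == list(names)


print(order_survives_sorting(plain))     # False
print(order_survives_sorting(numbered))  # True
```

With the numeric part in place, an alphabetical sort and the kinematic order agree, so any component that sorts the names no longer scrambles the joints.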

ghost commented 5 years ago

I tried the workaround but I still have the problem that the controller doesn't reach the goal (the joint names are now in the right order). What are the meanings of the status values in the action server status topic?
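For reference, the status topic of an actionlib server carries actionlib_msgs/GoalStatus codes. A small lookup, written as plain Python so it does not need ROS; the constants mirror the message definition, while the helper name describe_status is made up:

```python
# Status codes as defined in actionlib_msgs/GoalStatus.
GOAL_STATUS = {
    0: "PENDING",     # not yet processed by the action server
    1: "ACTIVE",      # currently being processed
    2: "PREEMPTED",   # cancelled after it started executing
    3: "SUCCEEDED",   # completed successfully
    4: "ABORTED",     # aborted by the server during execution
    5: "REJECTED",    # rejected without being processed
    6: "PREEMPTING",  # cancel requested, still executing
    7: "RECALLING",   # cancel requested before execution started
    8: "RECALLED",    # cancelled before execution started
    9: "LOST",        # client-side only: goal is considered lost
}


def describe_status(code):
    """Return the symbolic name for a GoalStatus code."""
    return GOAL_STATUS.get(code, "UNKNOWN(%d)" % code)


print(describe_status(1))  # ACTIVE
```

A status that stays at 1 therefore means the server still considers the goal ACTIVE, which matches the observation that no result is ever published.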

ghost commented 5 years ago

No ideas what else I could do to fix this problem?

gavanderhoorn commented 5 years ago

Plenty, but I've been quite busy.

gavanderhoorn commented 5 years ago

Let's start from the beginning: can you please make a wireshark capture of the traffic between your ROS pc and the controller.

Start everything as normal, then try to get the driver to command some motion. Any motion will do.

If things are still broken, it shouldn't work. Edit: to clarify: we need a capture from where it doesn't work.

Save the capture, zip it and attach it here.

Second: I'm going to need to see how you configured everything. If you're using a MoveIt configuration, please make that available. If you're not, and you're using the controller_joint_names or topic_list parameters only, please show us that. If you're doing something else, please explain / show that.

Third: show us how you use the action server. Did you write a client yourself, did you copy it from somewhere, etc?

Fourth: please detail your ROS version, controller software version, which version of MotoROS you installed, how you installed it and how you installed ROS (from sources, or .debs).

Fifth: show the output of rostopic list after you've started the driver.

ghost commented 5 years ago

Many thanks for your great support. I will do that tomorrow.

EricMarcil commented 5 years ago

Do you have an FSU (Functional Safety Unit) on that robot? And are you using the speed limit function? We have recently realized that when the FSU limits the speed of the robot, it reduces the motion command received by ROSi but there is no feedback about it and then the robot doesn't reach its expected position.

gavanderhoorn commented 5 years ago

Do you have an FSU (Functional Safety Unit) on that robot? And are you using the speed limit function? We have recently realized that when the FSU limits the speed of the robot, it reduces the motion command received by ROSi but there is no feedback about it and then the robot doesn't reach its expected position.

@EricMarcil: should we ticket that as an issue? Doing that could help other users when they search for the symptoms described in it.

EricMarcil commented 5 years ago

@gavanderhoorn: Yes, we probably should, so people are aware of it. At this time, we don't have the tool (MotoPlus API) to fix it. We made a request to Japan to get feedback about the FSU, but it is a complex problem to address since the FSU is an independent safety module. For now, people need to make sure that the speed they command the robot to move at is below the FSU settings.

gavanderhoorn commented 5 years ago

See #247.

ghost commented 5 years ago

To 1: wireshark_capture_motoman_problem.pcapng.gz, hopefully recorded correctly (my first use of Wireshark)
To 2: mh24_moveit_config.zip
To 3: I use the action server via move_group with the FollowJointTrajectory controller (located in mh24_moveit_config/config/controllers.yaml)
To 4: ROS version: kinetic over apt; controller version: 2.03(displayed while booting, I hope this is the version you want to know); MotoROS version: 1.8.1, as installed in the tutorial; Linux distribution: Ubuntu 16.04 (Gnome Desktop)
To 5: topic_list.txt after starting the robot_interface_streaming

Regarding the FSU: I don't know if I have an FSU. How can I check that? I couldn't find anything in the SAFETY FUNC. menu on the control panel.

EricMarcil commented 5 years ago

If you don't have a SAFETY FUNC. menu on the programming pendant then you don't have the FSU on your controller. So it eliminates that possibility.

ghost commented 5 years ago

I have a SAFETY FUNC. menu, but no submenu named FSU or something like that.

gavanderhoorn commented 5 years ago

I've taken a look at the wireshark capture and noticed that MotoROS does send an INVALID result code with a Data Start Pos (3011) sub code upon receiving the first trajectory point for the third motion (around pkt nr 4928 in the capture).

The ROS part of the driver is not involved at that point.

The immediately preceding JOINT_FEEDBACK msgs do seem to contain the same joint positions as those in the rejected trajectory point (first Jn is from the JOINT_FEEDBACK msg, second one is from the rejected traj pt msg):

 J0:   -0.223770827
 J0:   -0.223770827

 J1:   -0.704325914
 J1:   -0.704325914

 J2:   -0.700338900
 J2:   -0.700338900

 J3:   -1.564421535
 J3:   -1.564421535

 J4:   -1.743251204
 J4:   -1.743251204

 J5:    1.049106359
 J5:    1.049106359

I'm not sure why MotoROS is giving you the error at this point.

@ted-miller @EricMarcil any ideas?

controller version: 2.03(displayed while booting, I hope this is the version you want to know);

I was actually after the Controller system software version, as mentioned on the wiki page.

EricMarcil commented 5 years ago

This is strange because if you look at the Status message, it is also indicating "In Motion: True" but the feedback position is not changing. It stopped changing at line 3480 but the "In Motion" flag remains on all the way to the end. They are both based on a similar calculation.

I believe that when the 3011 error occurs, the values used for the calculation are printed. Can you telnet into the controller and check for the values reported by the MotoRos driver? You can refer to section 5.4 of the following manual for the procedure: https://www.motoman.com/hubfs/downloads/documentation/169286-1CD.pdf?hsLang=en-us
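The inconsistency described above (the "In Motion" flag stays true while the feedback position no longer changes) is something a client could watch for. A rough sketch in plain Python, assuming the caller has already collected (in_motion, positions) samples from the status and feedback messages; the function name, window size and tolerance are all made up for illustration:

```python
def looks_stalled(samples, window=5, tol=1e-6):
    """Detect 'in motion' being reported while the joints do not move.

    samples: list of (in_motion, positions) tuples, oldest first, where
    in_motion is a bool and positions is a list of joint angles in radians.
    """
    if len(samples) < window:
        return False  # not enough data to judge
    recent = samples[-window:]
    if not all(in_motion for in_motion, _ in recent):
        return False  # robot reported standstill at some point
    reference = recent[0][1]
    # Stalled if every recent sample matches the first one within tol.
    return all(
        all(abs(a - b) <= tol for a, b in zip(reference, positions))
        for _, positions in recent
    )


if __name__ == "__main__":
    frozen = [(True, [-0.2237, -0.7043, -0.7003])] * 5
    print(looks_stalled(frozen))
```

This only flags the symptom; it does not say why the controller believes it is still moving.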

gavanderhoorn commented 5 years ago

I agree it's strange.

Could this be an unintended side-effect of #227?

@hufman4: could you please try MotoROS v1.8.0? You can download it here.

gavanderhoorn commented 5 years ago

@hufman4: ?

ghost commented 5 years ago

Sorry for the late response, I wasn't able to work on the problem the last days. I will test it tomorrow.

ghost commented 5 years ago

Same problem with MotoROS v1.8.0. My controller version is DN2.42.00A(US/DE)-00.

Can you telnet into the controller and check for the values reported by the MotoRos driver? You can refer to section 5.4 of the following manual for the procedure: https://www.motoman.com/hubfs/downloads/documentation/169286-1CD.pdf?hsLang=en-us

I will try that now.

ghost commented 5 years ago

This is the output I got from the MotoROS driver via telnet after I aborted the hanging trajectory and started a new one with the MoveIt GUI. The error occurred after I started the new trajectory.

Speed Feedback registers enabled (OK).
axisType[0]: Rot        Rot     Rot     Rot     Rot     Rot     ---     ---     ;
pulse->unit[0]: 76855.4141      109301.6719     91265.8125      58605.6875      56497.8828   36170.7852       --      --      ;
maxInc[0] (in motoman joint order): 1057, 1449, 1338, 1677, 1617, 1565, 0
maxSpeed[0] (in ros joint order): 3.438274, 3.314222, 3.665118, 7.153742, 7.155135, 10.816741, 0.000000
Controller connection server running
Starting new connection to the State Server
Starting State Server Send State task
Controller number of group = 1
Starting new connection to the Motion Server
Creating new task: IncMoveTask
IncMoveTask Started
Creating new task: tidAddToIncQueue (groupNo = 0)
Creating new task: tidMotionConnections (connectionIndex = 0)
In StartTrajMode
Robot job is ready for ROS commands.
ERROR: Trajectory start position doesn't match current position (MOTO joint order).
 - Requested start: -26382, -60354, -47650, -94302, -112383, 51338, 0, 0
 - Current pos: -26383, -60354, -47651, -94323, -112383, 51339, 0, 0
 - ctrlGroup->prevPulsePos: -26383, -60354, -47651, -94323, -112383, 51339, 0, 0 

EricMarcil commented 5 years ago

There is a 21-pulse difference on the R-axis, which explains the error, since the default limit is 10 pulses. I'm also working with another customer that is having a similar problem with a large size robot. We are thinking that because the robot Tool properties are defined, the control loop is not optimal and it is preventing the robot to reach its final position.
Can you make sure that TOOL 0 of the robot is defined, including the center of gravity and inertia? There is a function to help you estimate the tool load. I'm attaching a document with the instructions. Tool load automatic measurement.pdf

If that doesn't work, we might need to increase the START_MAX_PULSE_DEVIATION value for larger robots.
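The arithmetic behind the 21-pulse figure can be reproduced from the telnet output above. This is only a sketch: the comparison (absolute per-axis difference greater than a limit) and the reading of the "pulse->unit" line as pulses per radian are assumptions, and the function name is made up.

```python
# Axis letters in MOTO joint order (assumed S, L, U, R, B, T).
AXES = ["S", "L", "U", "R", "B", "T"]

# Values copied from the telnet output (first six axes only).
requested = [-26382, -60354, -47650, -94302, -112383, 51338]
current = [-26383, -60354, -47651, -94323, -112383, 51339]

# "pulse->unit" values from the same output, assumed to be pulses per radian.
PULSES_PER_RAD = [76855.4141, 109301.6719, 91265.8125,
                  58605.6875, 56497.8828, 36170.7852]


def start_deviation(requested, current, limit):
    """Return [(axis, pulses)] for axes deviating by more than limit pulses."""
    return [
        (axis, abs(req - cur))
        for axis, req, cur in zip(AXES, requested, current)
        if abs(req - cur) > limit
    ]


print(start_deviation(requested, current, limit=10))  # [('R', 21)]
print(start_deviation(requested, current, limit=30))  # []

# Rough size of the R-axis deviation in radians, under the assumption above.
print(21 / PULSES_PER_RAD[AXES.index("R")])
```

Under these assumptions only the R-axis exceeds a limit of 10 pulses, and the deviation is a very small angle, which is consistent with the feedback values looking identical at the precision shown earlier in the thread.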

EricMarcil commented 5 years ago

Sorry, correction on the last post:

We are thinking that because the robot Tool properties are defined, the control loop is not optimal and it is preventing the robot to reach its final position.

That should have been: We are thinking that because the robot Tool properties are NOT defined, the control loop is not optimal and it is preventing the robot to reach its final position.

ghost commented 5 years ago

I have completed the calibration, but the problem is still there.

-> ERROR: Trajectory start position doesn't match current position (MOTO joint order).
 - Requested start: -26377, -3191, -17956, 21, 9455, 2, 0, 0
 - Current pos: -26377, -3192, -17956, 1, 9455, 3, 0, 0
 - ctrlGroup->prevPulsePos: -26377, -3192, -17956, 1, 9455, 3, 0, 0

I've now tried to increase the START_MAX_PULSE_DEVIATION by setting the value in the Controller.h file to 25, but I still have the problem, even though the deviation reported via telnet is below 25. Do I have to manually compile the code in addition to running catkin_make? Sorry, I don't have much C++ knowledge.

EricMarcil commented 5 years ago

The START_MAX_PULSE_DEVIATION is part of the MotoROS driver running on the controller. You need to purchase the MotoPlus SDK in order to change and recompile the application. I'll see if I can make a copy for you as a temporary solution. But I would like to get more information to better understand what is going on.
Can you please reproduce the problem and when it occurs, on the controller pendant, set the security level to management, then select from the menu: ROBOT --> SERVO MONITOR. You should see a column indicating FEEDBACK ERROR. I would also like to have a copy of your ALL.PRM, TOOL.DAT, SYSTEM.SYS.

Please e-mail my work address directly. I don't want to post a temporary version or data specific to your system directly on this message thread. eric.marcil@motoman.com

gavanderhoorn commented 5 years ago

@EricMarcil: did you ever get to the bottom of this?

EricMarcil commented 5 years ago

I have received the data from @boandlgrama but could not confirm the source of the issue from it. I've also had another customer with a similar issue. We are thinking that it might be related to the robot's larger size. But I don't have any large robot at my location to run tests. I'll try to test it when I go to our main office or maybe try to get someone to run the tests for me. In the meantime, we've supplied a temporary version of the driver where we've increased the START_MAX_PULSE_DEVIATION to 30. But until I know more, I don't want to make that a permanent change.

gavanderhoorn commented 5 years ago

In the meantime, we've supplied a temporary version of the driver where we've increased the START_MAX_PULSE_DEVIATION to 30. But until I know more, I don't want to make that a permanent change.

No, indeed.

gavanderhoorn commented 5 years ago

#256 seems to increase START_MAX_PULSE_DEVIATION (because of the issue reported here?).

ted-miller commented 5 years ago

#256 seems to increase START_MAX_PULSE_DEVIATION (because of the issue reported here?).

Yes, there have been multiple reports of this behavior. I haven't been able to identify a root cause. But, I feel that a minor increase to START_MAX_PULSE_DEVIATION isn't going to hurt anything.

gavanderhoorn commented 5 years ago

Going to close this as it is either fixed / worked-around with the merge of #256, or is still an issue but there has not been any activity in the past 8 months.

If we get more reports we can always re-open in the future.