pick_ik ignores joint limits

SammyRamone commented 1 year ago

The pick_ik provides solutions which are not respecting the joint limits. Of course, it is possible to provide a custom function to the setFromIK() call which checks if the solution is valid (e.g. by calling thestatisfiesBounds() method on the robot state). However, I think that pick_ik should do it automatically. Other MoveIt IK solvers that I used before only provided solutions which satisfy the joint limits. Therefore, it was quiet unexpected behavior which lead to some confusion. I know that there is also the avoid_joint_limits_weight which can be used, but this generally avoids the limits and does not only do a binary check if the solution is inside the limits.

To show the difference to the behavior of other IK solvers, I will provide a small example from the demo of the reach_ros2 package. The goal of this demo is to check if the robot can reach each of the shown markers. This is the solution when using pick_ik without futher checks. It will provide some solutions which go over joint limits. pick_ik_nocheck

If a manual check for joint limits is introduced, it will not find a solution for this point (shown by a black marker and the arm being in zero pose). pick_ik_check

The same behavior is also done by the KDL plugin. kdl

As well as by the original bio_ik plugin. bio_ik

I think it would make sense to follow the same behavior as other MoveIt IK solvers and I also don't see any benefit of getting IK solutions which are not respecting the joint limits. Additionally, the current behavior also leads to the MoveIt demo.launch being not so useful, as one can not so simply add an additional check in there.

sea-bass commented 1 year ago

I agree. If pick_ik is not respecting joint limits, that's a bug and certainly not intended behavior.

Both the local solver and global solver are supposed to clamp to joint limits, so either there is a bug in this logic or in parsing the joint limits from the robot model.

I'll take a look when I can, but if you grabbed pick_ik from source and have the ability to stick some print statements to confirm that the bounds are in accordance with the motoman URDF in the reach_ros2 repo, I'd appreciate that.

Unrelated, I was just talking to my colleagues about the reach_ros2 benchmarks and we might soon start collecting benchmark data to compare the solvers quantitatively. Glad to see that it's as easy as swapping out the kinematics.yaml file.

SammyRamone commented 1 year ago

Unrelated, I was just talking to my colleagues about the Reach benchmarks and we might soon start collecting benchmark data to compare the solvers quantitatively. Glad to see that it's as easy as swapping out the kinematics.yaml file.

It is very simple to change the IK used in REACH as it just relies on the normal MoveIt interface. There is just one issue that you might encounter when using bio_ik: REACH creates multiple threads with multiple IK instances and this leads to an issue (segmentation faults and bad memory access) in bio_ik. However, I am not sure if this is a general bio_ik issue or just one of the fork that I'm using. Anyway, if you also encounter this, you can get around it by setting the parameter max_threads: 1 in the REACH study configuration. KDL and pick_ik worked fine with multiple threads.

The idea of such a benchmark is great. The reason I encountered this bug was that I was testing out these three different IKs manually for the REACH demo object and a more complex object which is my actual use case. I would be interested to see how they perform in comparision to each other on a larger set of objects.

sea-bass commented 1 year ago

So, pick_ik is actually a very similar algorithm to bio_ik, but rewritten to be easier to read/work with, and in doing so we ensured to make it thread-safe. Our official PickNik standpoint now is that we're working to establish feature parity so we can eventually phase out bio_ik, but as you've found, pick_ik is a relatively new solver and we're still ironing out the kinks.

Robotawi commented 1 year ago

Thanks for reporting this issue, @SammyRamone! I have successfully reproduced the issue using reach_ros2 and pick_ik as a solver. I will check the source code where the limits checking is performed to find the cause.

Here is a pose in which joint_l and joint_b were violated. Both have a range from -1.9198 to 1.9198. violated joints screenshot

Here are the out-of-range joint values violated joints values screenshot

sea-bass commented 1 year ago

@Robotawi thanks for looking into this. I have a few tips for where to look for the bug:

In pick_ik's robot model parsing, put some breakpoints/debug statements to see if the joint limits given to the joint as correct. If it's wrong here, then we can localize the bug to that.
Otherwise, try setting the mode parameter of the solver to local first. If you can recreate the issue above just by using the local solver, then the issue is somewhere here.
Lastly, if local solver seems to stick to the joint limits, you can look at the global solver code.

Robotawi commented 1 year ago

@sea-bass, I want to share an update on my progress in debugging the issue of out-of-range joint values. Although I haven't fully identified the cause yet, I have taken some steps to isolate the problem and would appreciate any further insights. I am using this branch for debugging.

(1) I verified the parsing step and the joint limits are correctly parsed. They match the joints in the arm's xacro file.

(2) I switched the optimizer to local mode and began examining the gradient descent-related parts.

(3) I looked into the applying optimization step and observed that the calculated new joint value, given by self.local[i] - self.gradient[i] * joint_diff can exceed the joint limits. These cases are clamped to the clip_min or clip_max limits. Although this shows the joint values stay within the allowable range, I do not think it is desirable to call the cost functions with those clamped min/max values. Please correct me if I am wrong.

(4) I checked the process of computing the gradient direction, which involves checking the negative and positive displacements. I found that in some cases, the gradient can be computed with values that are out-of-range. I mean that self.working[i] = self.local[i] - step_size and self.working[i] = self.local[i] + step_size can actually go beyond the joint ranges. This results in p1 = cost_fn(self.working) and p3 = cost_fn(self.working) be calculated based on out-of-range values. This suggests clamping the self.working joint values and decreasing the step_size parameter. I clamped the joints in the code, and decreased the step_size in the kinematics.yaml file, but the issue of out-of-range solutions persisted. Should we consider adjustments to the cost function or further investigate the gradient computation?

(5) To validate the initial_guess joint values, I modified the code to pass a reference to a robot object to the GradientIk::from function. I noticed that sometimes the initial guess joint values exactly matches the clip_min or clip_max values without exceeding these limits. Approximately, this happens 50 times per 1000 calls. I do not think this is the intended behavior. This will probably slow convergence. Does this suggest any additional area to debug or something to modify?

(6) I have also checked the active positions in the make_is_solution_test_fn function before it returns true indicating an acceptable solution and did not catch the out-of-range joints. I set it to return false when a value equal to the joint limits is checked, but did not eliminate the problem.

(7) Finally, I tightened the position_threshold, orientation_threshold, and cost_threshold values. The number of out-of-range solutions decreased but was not zero.

I want also to confirm that I tried with KDL IK solver, and it did not result in any out-of-range joint values, so it is for sure something on our side.

Sorry, I could not find exactly where the issue is, but I look forward to any guidance you might have for resolving this issue. I have not understood all of the details yet, but I will continue trying to comprehend more and find the root cause. Feel free to suggest to me where to look or re-visit any parts. Thank you!

sea-bass commented 1 year ago

Thanks for this detailed analysis, @Robotawi !

I think there are some good improvements to be made here based on this, especially your items 3 and 4. We probably want to have a really large cost function if the step size kicks the robot into an out-of-range joint state, or some other way to tell the solver this isn't feasible.

I wanted to maybe check if https://github.com/PickNikRobotics/pick_ik/pull/48 could affect this? What happens if you include the changes in here?

sea-bass commented 1 year ago

You could also set a nonzero avoid joint limits cost to see if that helps?

Robotawi commented 1 year ago

With pleasure, @sea-bass. I will keep trying till we figure it out. I checked the cost functions implemented in goal.cpp and want to add a cost function for your mentioned purpose. However, I need clarification about how to incorporate a function that operates from inside the gradient_descent calculation. Your guidance for the implementation would be greatly appreciated.

FYI, the out-of-range joints issue persists despite PR #48, and setting the avoid_joint_limits_weight causes the IK solution to fail. In terms of the reach study, the reachability becomes zero. Here is a sample of the output.

[reach_ros_node-1] Reach study complete
[reach_ros_node-1] Starting optimization
[reach_ros_node-1] Entering optimization loop 0
[reach_ros_node-1] 
[reach_ros_node-1] 0%   10   20   30   40   50   60   70   80   90   100%
[reach_ros_node-1] |----|----|----|----|----|----|----|----|----|----|
[reach_ros_node-1] ***************************************************
[reach_ros_node-1] ------------------------------------------------
[reach_ros_node-1] Percent Reached = 0
[reach_ros_node-1] Total points score = 0
[reach_ros_node-1] Normalized total points score = -nan

This is the kinematics.yaml config file with the avoid_joint_limits_weight param, which was used when I got the previous result.

manipulator:
  kinematics_solver: pick_ik/PickIkPlugin
  kinematics_solver_timeout: 0.05
  kinematics_solver_attempts: 3
  mode: global
  position_scale: 1.0
  rotation_scale: 0.5
  position_threshold: 0.001
  orientation_threshold: 0.01
  avoid_joint_limits_weight: 0.001
  cost_threshold: 0.001
  minimal_displacement_weight: 0.0
  gd_step_size: 0.0001

I have a question. When the SolutionTestFn returns true, which means frame_tests and goals are satisfied, where after this check should I focus to see what may be causing the issue?

sea-bass commented 1 year ago

@Robotawi I think for now we could just modify ik_gradient.cpp in places like this to check if you take a step beyond a joint limit, you return some large value (like 100.0) instead of cost_fn(self.working).

sea-bass commented 1 year ago

OK yeah, I'm not able to reproduce the problem. On the latest branch I get no out of bounds solutions...

I get that there is a "dead spot" in the center of the mesh that the robot isn't supposed to reach... but the joint values look fine. Am I missing something?

sea-bass commented 1 year ago

WOW I think I found the issue... pick_ik alphabetizes the joint names, so the limits are out of order because this particular model uses letters instead of numbers for the joint names.

If I hard-code the order of the joints, the dead spot correctly comes up...

After digging, it has to do with the use of a std::set in the robot.cpp file: https://github.com/PickNikRobotics/pick_ik/blob/main/src/robot.cpp#L102

sea-bass commented 1 year ago

Can you guys try to test with this PR branch? https://github.com/PickNikRobotics/pick_ik/pull/54

Robotawi commented 1 year ago

The change fixes the out-of-bounds joint issue. Thank you, @sea-bass! :+1: :tada:

PickNikRobotics / pick_ik

pick_ik ignores joint limits #43