Any difference between v1.0 and v0.3?

kaixindelele commented 4 years ago

Hi, this is very odd: when I test the Sawyer-Joint-Velocity-Lift task with v1.0, I cannot get performance similar to v0.3. Using the same TD3 code, I get good performance in v0.3, as follows:

but it achieves poor performance in v1.0. When I render the training steps, I find that the gripper closes and touches the cube, but it cannot learn to grasp. Even the gripper had contacted to the cube, the cube is hard to get away from the table?

So I was wondering whether the same thing shows up in your benchmark. And then I test the Sawyer-OSC-POSE-Lift-TD3, and have the similar results with Sawyer-Joint-Velocity-Lift-TD3.

This is a bit of a puzzle. Are there any changes to the gripper or the controllers in the new version?

roberto-martinmartin commented 4 years ago

Hi Kaixindelele, there are many changes to control in v1. We implemented all controllers as explicit transformations from the policy action space (for example, joint velocities) to the commands that real robots usually take: joint torques. In v0.3 this was done internally by MuJoCo, but we don't know how. Now we have full control of the gains that govern the transformation from velocities (or any other space) to torques. We tried to match what MuJoCo reports as its internal weights, but we cannot guarantee they are the same. However, the differences you observed are too large to be caused by slightly different weights.
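
To make the idea of an explicit velocity-to-torque transformation concrete, here is a minimal sketch of a proportional joint-velocity controller. It is not robosuite's implementation: the function name, the gain vector `kv`, and the gravity-compensation term are assumptions chosen for illustration only.

```python
def joint_velocity_to_torque(desired_vel, current_vel, kv, gravity_comp):
    """Map a joint-velocity action to joint torques with a proportional gain.

    Hypothetical helper for illustration; not robosuite's code.
    All arguments are lists with one entry per joint.
    """
    if not (len(desired_vel) == len(current_vel) == len(kv) == len(gravity_comp)):
        raise ValueError("all inputs must have one entry per joint")
    torques = []
    for qd_des, qd, gain, g in zip(desired_vel, current_vel, kv, gravity_comp):
        # Velocity error scaled by the gain, plus a feedforward term that
        # cancels gravity so that zero error still holds the arm in place.
        torques.append(gain * (qd_des - qd) + g)
    return torques


# The same action under two slightly different gain settings (made-up numbers).
action = [0.5, -0.2]
measured = [0.1, 0.0]
gravity = [1.0, 0.3]
print(joint_velocity_to_torque(action, measured, [2.0, 2.0], gravity))
print(joint_velocity_to_torque(action, measured, [2.2, 2.2], gravity))
```

In this toy model, a small change in the gains changes the torques only proportionally, which is why slightly different weights alone would not be expected to explain a large gap in learning performance.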

@cremebrule may have better information about the changes in the gripper/gripping functionality. I think there are some differences in how the contact/gripping state is computed that may explain the divergence.

"Even the gripper had contacted to the cube, the cube is hard to get away from the table?" --> Could you explain a bit further this comment? What does that mean? Thanks.

"And then I test the Sawyer-OSC-POSE-Lift-TD3, and have the similar results with Sawyer-Joint-Velocity-Lift-TD3." --> Does this mean that OSC works as bad as joint velocity in v1 or that there is divergence between v0.3 and v1.0.?

Thanks for the info!

kaixindelele commented 4 years ago

Hi @roberto-martinmartin, do you have an exact ablation study comparing the new and old versions of the tasks? For example, between Sawyer-Joint-Velocity-Lift-SAC-v1.0 and Sawyer-Joint-Velocity-Lift-SAC-v0.3, is there any difference?

"Even the gripper had contacted to the cube, the cube is hard to get away from the table?" --> Could you explain a bit further this comment? What does that mean? Thanks.

"And then I test the Sawyer-OSC-POSE-Lift-TD3, and have the similar results with Sawyer-Joint-Velocity-Lift-TD3." --> Does this mean that OSC works as bad as joint velocity in v1 or that there is divergence between v0.3 and v1.0.?

These two points may not matter much if there is a difference in RL training performance between the old and the new versions.

Another question: when I test Sawyer-OSC-POSE-Lift-TD3, I find that it is a kind of position-increment control. If I set a vertical, downward-pointing pose and let the arm move along the y-axis, it moves, and when it reaches the limit of the specified posture, it changes its posture, twists, and veers off course. My test value is 0.5:

[image: test-value05]

If the test value is 0.2, the effect is less severe:

[image: test-value02]

That is really strange. The robot in the real world will never have such a control mode. Is this because a good limit is not set? Or when the torques is bigger, the limit would be easy to break?

roberto-martinmartin commented 4 years ago

We do not have an ablation study; it might be interesting. However, we only provide support for v1.0, so if there were problems in v0.3, we may not fix them retroactively. We are happy to look into any issue found in v1 and fix it.

I'm not sure I understand your new issue/question. "My test value is 0.5": the value of what? The displacement increment per time step? Then you will reach the limits of the robot very quickly. If the robot reaches its joint limits, the behavior with OSC is undefined. A robot in the real world would do exactly that; it is the behavior of operational space control (OSC). The OSC formalism abstracts away the kinematics of the robot, including joint limits, and tries to minimize the kinetic energy of the motion. But if you reach a joint limit, the behavior is not predefined, because joint limits are not in the model. This is common to all the real robots I've used that resolve OSC with torques. It is the role of the "user" of OSC (you, your policy) to avoid reaching joint limits. You can also try to avoid them with a null-space task, but that won't help if your main task forces the robot out of its workspace. The weird thing is that with increments of 0.2 m, the Sawyer moves so slowly. If I understand your experiment, it should try to move 0.2 m per step and reach joint limits quickly. We could look into that.

Also, a very large control step is generally not a good idea; it creates instabilities in the controller. All of this may sound too restrictive, but that is how control works on real robots, and that's one of the reasons we included it in v1: to better mimic what actually executes on the real robot.
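
One way to keep the control step small is to clip the policy output and scale it to a bounded per-step displacement before handing it to the controller. The sketch below only illustrates that idea; `scale_action` and `max_delta` are hypothetical names, and the 0.05 m bound is an arbitrary example value, not a robosuite default.

```python
def scale_action(action, max_delta):
    """Map a policy action in [-1, 1] to a bounded per-step displacement (m).

    Hypothetical helper for illustration; not part of robosuite.
    """
    if max_delta <= 0:
        raise ValueError("max_delta must be positive")
    scaled = []
    for a in action:
        a = max(-1.0, min(1.0, a))  # clip out-of-range policy outputs
        scaled.append(a * max_delta)
    return scaled


# A raw action with one out-of-range component, limited to 5 cm per step.
delta = scale_action([0.5, -3.0, 1.0], max_delta=0.05)
print(delta)
```

With a bound like this, even a saturated action moves the end effector by at most `max_delta` per control step, so the arm approaches the edge of its workspace gradually instead of being commanded far past it in a single step.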

"The robot in the real world will never have such a control mode." Not sure what this means. All our robots in the lab use operational space control as we provide in robosuite v1.0. "Is this because a good limit is not set?" Yes, we do not limit anything. the user should not try to go out of limits. "Or when the torques is bigger, the limit would be easy to break?" what limit? the position limit? it doesn't need large torques, just to go to the limits of the workspace

Thanks for reporting!

kaixindelele commented 4 years ago

Thanks for your reply! I just need to limit the workspace and keep the control step from getting too large.

cremebrule commented 4 years ago

Hi @kaixindelele ,

Sorry for the late response -- a couple of points to add to what roberto already mentioned:

Grasp detection is not currently handled properly. However, this is a legacy issue from v0.3, and I am fixing this bug in v1.0. Essentially, in the Lift task, the grasping reward is given if both of the robot's finger geoms are in contact with the cube. While this may seem sensible, it is problematic because, in the current implementation, the robot can cheat by simply touching the cube with both fingers, as you've observed. I am fixing this so that only the correct part of the fingers (the fingerpads exclusively) triggers the grasping reward.
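
The difference between the two grasp checks can be sketched as follows. This is an illustration only, not robosuite's implementation: the function `is_grasping`, the contact-pair representation, and all geom names are hypothetical.

```python
def is_grasping(contacts, left_geoms, right_geoms, object_geom):
    """Return True if at least one listed geom on each finger touches the object.

    contacts: iterable of (geom_a, geom_b) name pairs currently in contact.
    Hypothetical helper for illustration; not robosuite's code.
    """
    touching = set()
    for a, b in contacts:
        if a == object_geom:
            touching.add(b)
        elif b == object_geom:
            touching.add(a)
    left_ok = any(g in touching for g in left_geoms)
    right_ok = any(g in touching for g in right_geoms)
    return left_ok and right_ok


# Both fingers touch the cube, but not with their pads (made-up geom names).
contacts = [("left_finger", "cube"), ("cube", "right_finger")]

# Loose check: any finger geom counts, so mere touching is rewarded.
print(is_grasping(contacts, ["left_finger", "left_pad"],
                  ["right_finger", "right_pad"], "cube"))

# Stricter check: only fingerpad contacts count.
print(is_grasping(contacts, ["left_pad"], ["right_pad"], "cube"))
```

Restricting the check to fingerpad geoms means a policy that merely brushes the cube with the outside of its fingers no longer satisfies the grasp condition.
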
You can set absolute limits for any controller except IK. For example, assuming you are using OSC POSE, these would be the position_limits and orientation_limits in the corresponding .json controller config file. Setting them gives the behavior you want by preventing any action that would move the arm outside its limits. Note, however, that (1) these are specified in the world frame, NOT the controller frame, and (2) there is currently a bug where the limits are incorrectly passed as a list. This is fixed in the local bug-fix branch we'll hopefully push later this weekend, but you can fix it immediately by wrapping L176 and L177 each with np.array().
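
As a rough illustration of what absolute position limits do, the sketch below clips a goal position to a world-frame box. It assumes the limits are given as a pair of lower and upper bounds, one value per axis; that layout, the function name, and the numbers are assumptions for this example and should be checked against the actual controller config.

```python
def clip_to_limits(goal_pos, position_limits):
    """Clip a world-frame goal position to [lower, upper] bounds per axis.

    position_limits: (lower, upper), each a list of per-axis bounds in meters.
    Hypothetical helper and layout for illustration; not robosuite's code.
    """
    lower, upper = position_limits
    if not (len(goal_pos) == len(lower) == len(upper)):
        raise ValueError("goal and limits must have the same number of axes")
    return [max(lo, min(hi, p)) for p, lo, hi in zip(goal_pos, lower, upper)]


# Made-up workspace box in the world frame: x, y, z bounds in meters.
limits = ([-0.2, -0.3, 0.8], [0.2, 0.3, 1.2])
print(clip_to_limits([0.5, 0.0, 0.7], limits))
```

Because the bounds are in the world frame, they stay fixed regardless of where the end effector currently is, which is what keeps an incremental controller from drifting out of the workspace.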

Hope this helps!

kaixindelele commented 4 years ago

Hi Josiah, an old question for you, since you're responsible for Robosuite-Benchmark: do you have an exact ablation study comparing the new and old versions of the tasks? For example, between Sawyer-Joint-Velocity-Lift-SAC-v1.0 and Sawyer-Joint-Velocity-Lift-SAC-v0.3, is there any difference?

ARISE-Initiative / robosuite

Any difference between v1.0 and v0.3? #124