stepjam / RLBench

A large-scale benchmark and learning environment.
https://sites.google.com/corp/view/rlbench

reopening #50 - End effector delta #52

Closed · bycn closed this 4 years ago

bycn commented 4 years ago

Continuation of #50: there are still errors with a small delta (0.01). It is also really slow, taking around 2 seconds to find a path. Is this expected behavior?

stepjam commented 4 years ago

Hi, is this regarding DELTA_EE_POSE_PLAN or DELTA_EE_POSE? Given that you are saying "to find a path", I'll assume you are using DELTA_EE_POSE_PLAN. This action mode uses path planning, so yes, as with any path planning it's going to take longer than simply using DELTA_EE_POSE or DELTA_EE_VELOCITY, etc.
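
For reference, a minimal sketch of selecting between these action modes, using the `rlbench.action_modes` API from around the time of this thread (newer RLBench versions restructured this module, so the exact imports may differ):

```python
from rlbench.environment import Environment
from rlbench.action_modes import ActionMode, ArmActionMode
from rlbench.tasks import ReachTarget

# DELTA_EE_POSE_PLAN invokes a motion planner on every step (slower);
# DELTA_EE_POSE applies the delta without path planning (faster).
action_mode = ActionMode(ArmActionMode.DELTA_EE_POSE_PLAN)
env = Environment(action_mode, headless=True)
env.launch()
task = env.get_task(ReachTarget)
```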

A few questions to get to the bottom of the problem:

bycn commented 4 years ago

I'm sampling x, y, z each from -1 to 1 and dividing by 100. I then use this as the delta to update. I'm only changing x, y, z; for the quaternion I feed in the same values from the original pose.

stepjam commented 4 years ago

Hi,

> for the quaternion I feed in the same values from the original pose

This is your problem. Remember, you are using the DELTA action mode rather than the ABS action mode, so to get zero rotation you need to pass in an identity quaternion. What is probably happening at the moment is that you are giving a rotation that is invalid.
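
To make that concrete, here is a minimal sketch of a zero-rotation delta action, assuming the 8-dim layout used by this mode (x, y, z, qx, qy, qz, qw, gripper; scalar-last quaternion) and a `task` obtained from `env.get_task(...)`:

```python
import numpy as np

delta_xyz = np.random.uniform(-1, 1, size=3) / 100  # small positional delta
identity_quat = [0.0, 0.0, 0.0, 1.0]  # (qx, qy, qz, qw): "rotate by nothing"
gripper = [1.0]  # keep the gripper open
action = np.concatenate([delta_xyz, identity_quat, gripper])
task.step(action)
```

Passing the current pose's quaternion here would instead command a further rotation by that amount on every step.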

bycn commented 4 years ago

No, I edited the delta_ee_pose code to set the resulting action to just use the same quaternion without modifying it.

stepjam commented 4 years ago

If you've modified the backend then I need to see your changes. Please post the changed lines here.

bycn commented 4 years ago

```python
# arm_action ~ sampled from -1 to 1 each
a_x, a_y, a_z = arm_action / 100
x, y, z, qx, qy, qz, qw = self._robot.arm.get_tip().get_pose()
new_pose = [a_x + x, a_y + y, a_z + z] + [qx, qy, qz, qw]
self._path_observations = []
self._path_observations = self._path_action(list(new_pose))
```

stepjam commented 4 years ago

Thanks. And how many steps does it run for before you get (what I assume is) an InvalidActionError?

bycn commented 4 years ago

So if I step with [1,1,1,1(gripper)], which is [0.01...] after the division, then I get

```
59, in get_nonlinear_path
    raise ConfigurationPathError('Could not create path.')
pyrep.errors.ConfigurationPathError: Could not create path.
```

on the first step. If I step with 0.1, which is [0.001...] after the division, the error occurs on the 8th step.

bycn commented 4 years ago

As an aside, I believe:

```python
self._path_observations = []
self._path_observations = self._path_action(list(new_pose))
```

can just be written as `self._path_observations = self._path_action(new_pose)`.

bycn commented 4 years ago

Also, I did some more digging: the 2 seconds only occurs when it can't find a path; when it does find one, it's about 0.2 seconds.

stepjam commented 4 years ago

> So if I step with [1,1,1,1(gripper)], which is [0.01...] after the division, then I get
>
> ```
> 59, in get_nonlinear_path
>     raise ConfigurationPathError('Could not create path.')
> pyrep.errors.ConfigurationPathError: Could not create path.
> ```
>
> on the first step. If I step with 0.1, which is [0.001...] after the division, the error occurs on the 8th step.

This makes sense, right? If you look at the starting configuration of the arm, the end effector can't really increase its z position any further without also altering the rotation, and so it will not be able to find a valid configuration. If you were to negate the z axis (which would send the end effector down), then you would find that it runs for longer before getting the error.
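
A quick way to sanity-check that claim, assuming the same 8-dim action layout as above and that the error propagates out of `task.step` as in the traceback:

```python
import numpy as np
from pyrep.errors import ConfigurationPathError

# Negative z delta: move the end effector down instead of up.
down = np.array([0.0, 0.0, -0.01, 0.0, 0.0, 0.0, 1.0, 1.0])
steps = 0
try:
    for _ in range(100):
        task.step(down)
        steps += 1
except ConfigurationPathError:
    pass
print(f'{steps} successful steps before the first failure')
```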

stepjam commented 4 years ago

> Also, I did some more digging: the 2 seconds only occurs when it can't find a path; when it does find one, it's about 0.2 seconds.

This is because it keeps trying to find a path/configuration until it hits some max_attempts limit.
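
A conceptual sketch of that retry behaviour; `try_plan` and the `max_attempts` value here are illustrative stand-ins, not RLBench's actual internals:

```python
from pyrep.errors import ConfigurationPathError

def plan_with_retries(try_plan, target_pose, max_attempts=10):
    # try_plan is a hypothetical callable returning a path or None.
    for _ in range(max_attempts):
        path = try_plan(target_pose)
        if path is not None:
            return path  # success usually comes on an early attempt (~0.2 s)
    # Failure only surfaces after exhausting every attempt (~2 s observed above).
    raise ConfigurationPathError('Could not create path.')
```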

bycn commented 4 years ago

Ah, thanks Stephen, makes sense.

stepjam commented 4 years ago

So just to elaborate: I'm assuming you are doing some RL? If that's the case, then when in this action mode:

Hope that helps :)

bycn commented 4 years ago

Yeah, I'm trying to set up an env to match MuJoCo's Fetch Reach task to baseline some earlier experiments on RLBench. So, rather than configuring the agent, for now I'm trying to adjust the environment to be similar. Thanks again for the help! It'd be great if you could take a look at the stepping speeds in #53, since without faster speeds we won't be able to use this environment :(

stepjam commented 4 years ago

No probs :+1:

Alvinosaur commented 3 years ago

I'm not sure if this was addressed elsewhere, but our project group ran into this issue and got training to run smoothly by manually forcing the change in EE position to have a maximum magnitude. See here for more details.
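
For anyone reading along, a minimal sketch of that kind of clamping; the 0.01 m cap and the 3-element positional prefix are assumptions for illustration, not values from their project:

```python
import numpy as np

MAX_DELTA = 0.01  # assumed cap on per-step EE displacement, in metres

def clamp_delta(action, max_delta=MAX_DELTA):
    """Rescale the positional part of the action so its norm <= max_delta."""
    action = np.asarray(action, dtype=float).copy()
    norm = np.linalg.norm(action[:3])
    if norm > max_delta:
        action[:3] *= max_delta / norm
    return action
```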

These errors can still happen though, especially when the model has not converged. We handled the two errors separately: ConfigurationPathError implies the desired action (delta EE pos) is too large, while InvalidActionError implies the desired action brings the EE outside its configuration space (e.g. literally reaching beyond the arm's max reach). Please see here for how we handled this.
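
A sketch of handling the two errors separately, assuming both can propagate out of `task.step` (`ConfigurationPathError` comes from `pyrep.errors`, as in the traceback above; `InvalidActionError` is RLBench's own exception, imported here from `rlbench.backend.exceptions`, so verify the path for your version) and reusing the hypothetical `clamp_delta` from the previous sketch:

```python
from pyrep.errors import ConfigurationPathError
from rlbench.backend.exceptions import InvalidActionError

try:
    obs, reward, terminate = task.step(clamp_delta(action))
except ConfigurationPathError:
    # The (clamped) delta was still too large for the planner: skip or
    # penalise this transition instead of crashing the training loop.
    obs, reward, terminate = None, -1.0, True
except InvalidActionError:
    # Target lies outside the arm's configuration space, e.g. beyond max reach.
    obs, reward, terminate = None, -1.0, True
```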

Hope this helps anyone facing these issues.

mch5048 commented 3 years ago

@Alvinosaur Your team's solution seems reasonable!

Did your team investigate the IK solvers of other RL platforms like PyBullet, Robosuite, ...?

I think addressing these IK errors, which arise from explorative Cartesian-space 6-DoF actions, is a big challenge in the robot learning domain...