tonyzhaozh / act

MIT License

Why use the leader robots' joint positions as supervision rather than the followers'? #22

Open Jessy-Huang opened 8 months ago

Jessy-Huang commented 8 months ago

In your paper, you mention:

We record the joint positions of the leader robots (i.e. input from the human operator) and use them as actions. It is important to use the leader joint positions instead of the follower's, because the amount of force applied is implicitly defined by the difference between them, through the low-level PID controller.

How should I understand this? What if I use qpos (the follower's joint positions) instead?

xyzacademic commented 7 months ago

I guess it is because the leader robot's joint position is the signal that is sent to the follower robot to execute. For example, we get position A from the leader robot and send it to the follower robot, and the follower goes to A'. So if we want the robot to go to A', we should send it A rather than A'. That is why A is the ground-truth target.
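To make the A versus A' point concrete, here is a minimal sketch, assuming a purely proportional controller and a constant opposing load (for example gravity or contact). The function name `settled_position` and the values of `kp` and `load` are made up for illustration; they are not taken from the ACT code.

```python
def settled_position(target, kp=4.0, load=1.0):
    """Resting position of a joint under proportional control (assumed model).

    At rest the controller force kp * (target - position) balances the
    constant opposing load, so position = target - load / kp.
    """
    return target - load / kp


A = 1.0                                 # leader position, i.e. the command that was sent
A_prime = settled_position(A)           # where the follower actually ends up
replayed = settled_position(A_prime)    # what happens if A' is sent as the command

print(A, A_prime, replayed)
```

Under this assumed model, sending A brings the follower to A', but sending A' leaves it short of A' by the same offset again, which is why A is the action to learn.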

bdelhaisse commented 5 months ago

@Jessy-Huang let's say we are operating in Cartesian space and trying to push an object with a robot gripper. To move it, we need to apply a force above a certain threshold so that the object starts to move.

Let's say we touch the object at t=0 and apply an increasing force until, at t=T, the object finally starts to move. Here is what happened: between 0 and T, the follower/measured positions were constant, but the leader/target positions were going deeper and deeper into the object until the difference between the target and measured positions was large enough to generate the force needed to move the object. It is the responsibility of the P(I)D controllers to translate the difference in positions into forces.

Now, if we record the follower/measured positions and try to replay them (i.e. the recorded measured positions now become our target positions), what happens? Well, between 0 and T, our target positions are going to be constant and the difference between them and the currently measured positions will be close to 0, so we won't be generating any force. This is already different from what happened above, where we were applying an increasing force to the object until it finally moved at t=T. You can then imagine what will happen after T, depending on the difference between the recorded measured positions (which are now our target positions) and our currently measured positions.
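The scenario above can be sketched as a toy 1-D simulation. Everything in it is an assumption made for illustration: a purely proportional controller, a quasi-static object with a single friction threshold, unitless numbers, and the hypothetical helper `simulate_push`. None of it comes from the ACT code.

```python
def simulate_push(targets, kp=4.0, friction=1.0):
    """Quasi-static 1-D push under a proportional controller (assumed model).

    The gripper touches the object at position 0. The controller force is
    kp * (target - position). The object and gripper only advance when that
    force exceeds `friction`; they then move until the force drops back to
    the friction level. Targets are assumed to be non-decreasing.
    """
    position = 0.0
    positions, forces = [], []
    for target in targets:
        force = kp * (target - position)
        if force > friction:
            # The object yields: move until the force equals the friction level.
            position = target - friction / kp
            force = friction
        positions.append(position)
        forces.append(force)
    return positions, forces


# Leader/target positions ramp steadily "into" the object.
leader_targets = [0.125 * i for i in range(9)]

# Teleoperation: leader positions are the commands, follower positions are measured.
follower_positions, leader_forces = simulate_push(leader_targets)

# Replay using the recorded follower positions as the commands instead.
replay_positions, replay_forces = simulate_push(follower_positions)

print("force while stuck (leader as target):  ", leader_forces[:3])
print("force while stuck (follower as target):", replay_forces[:3])
print("final position, original vs replay:", follower_positions[-1], replay_positions[-1])
```

In this toy model the force builds up during the first steps when the leader positions are the targets, while it stays at zero when the recorded follower positions are replayed. The replay therefore starts moving the object later and ends at 0.5 instead of 0.75.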

Hope this helps.