Improbable-AI / walk-these-ways

Sim-to-real RL training and deployment tools for the Unitree Go1 robot.
https://gmargo11.github.io/walk-these-ways/

I have confusion about foot clearance. #46

Closed JihoonPark20 closed 11 months ago

JihoonPark20 commented 11 months ago

Hello, thank you for sharing this great work.

I have a question about foot clearance.

def _reward_feet_clearance_cmd_linear(self):
    phases = 1 - torch.abs(1.0 - torch.clip((self.env.foot_indices * 2.0) - 1.0, 0.0, 1.0) * 2.0)
    foot_height = (self.env.foot_positions[:, :, 2]).view(self.env.num_envs, -1) # - reference_heights
    target_height = self.env.commands[:, 9].unsqueeze(1) * phases + 0.02 # offset for foot radius 2cm
    rew_foot_clearance = torch.square(target_height - foot_height) * (1 - self.env.desired_contact_states)
    return torch.sum(rew_foot_clearance, dim=1)

This code is taken from corl_reward.py. If I understand correctly, this reward is applied according to the swing phase of each leg. However, the target_height variable appears to increase linearly over that period.

I think target_height needs to return to zero at the end of the swing so the foot is ready for the next stance phase. Is there a reason the code is written in the form above instead?

gmargo11 commented 11 months ago

Hi @JihoonPark20 ,

Glad you are enjoying the repository!

After printing env.desired_contact_states and target_height, it seems to me that during the swing period the target height first increases, then decreases, as desired.
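A scalar sketch of that check is shown here, for readers who want to reproduce it without launching the simulator. It is not code from the repository: the helper names swing_phase and target_height, the use of plain floats instead of torch tensors, and the peak command of 0.08 m are all assumptions made for illustration. Only the arithmetic is copied from the reward function quoted above.

```python
def swing_phase(foot_index: float) -> float:
    """Scalar version of the `phases` expression in the reward function.

    Mirrors: 1 - abs(1 - clip(foot_index * 2 - 1, 0, 1) * 2)
    """
    clipped = min(max(foot_index * 2.0 - 1.0, 0.0), 1.0)
    return 1.0 - abs(1.0 - clipped * 2.0)


def target_height(foot_index: float, peak_cmd: float, foot_radius: float = 0.02) -> float:
    """Scalar version of `commands[:, 9] * phases + 0.02`."""
    return peak_cmd * swing_phase(foot_index) + foot_radius


if __name__ == "__main__":
    peak_cmd = 0.08  # hypothetical commanded peak clearance, in metres
    # Sweep foot_index over one cycle in steps of 1/8.
    for i in range(9):
        idx = i / 8
        print(
            f"foot_index={idx:.3f}  "
            f"phase={swing_phase(idx):.2f}  "
            f"target={target_height(idx, peak_cmd):.3f}"
        )
```

Judging from the formula alone, every foot_index at or below 0.5 gives a phase of 0, and the phase then ramps up to 1 at 0.75 and back down to 0 at 1.0.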

If you see different behavior, can you please share the data or a code snippet to replicate it?

Thanks -Gabe

JihoonPark20 commented 11 months ago

Thanks for the reply.

At first, I thought target_height was a constant, so I expected it to form a pulse shape together with desired_contact_states.

If I understand your explanation, the rew_foot_clearance term increases and decreases as the phases term increases and the 1 - desired_contact_states term decreases.

But I'm still confused about why target_height is a first-order (linear) function of the phases term, in the form command * phases + bias.

Is there a reference I can read about this? Thank you very much.

gmargo11 commented 11 months ago

Hi @JihoonPark20 ,

The bias is there because the foot is a sphere with a radius of 2cm. self.env.foot_positions[:, :, 2] reads the height of the sphere's center. The target height corresponds to the height of the foot's lowest point, not its center. The bias of 0.02 implements this correction.

The command is the peak target height, which should always occur in the middle of the swing phase.

The phases variable is 0 during the stance phase. During the swing phase, it linearly increases from 0 to 1, then linearly decreases from 1 to 0. So instead of a pulse shape, the target_height has a triangle shape during each swing phase.
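The three pieces above (peak command, triangular phase, radius offset) can be combined in a small scalar sketch of the per-foot penalty. This is an illustration under assumptions rather than the repository's implementation: clearance_penalty is a hypothetical helper, the inputs are plain floats for a single foot, and desired_contact is treated as a hard 0 or 1 value for simplicity.

```python
FOOT_RADIUS = 0.02  # metres; the 0.02 offset in the reward function


def clearance_penalty(
    center_height: float,
    peak_cmd: float,
    phase: float,
    desired_contact: float,
) -> float:
    """Squared height error for one foot, counted only while swing is desired.

    center_height   height of the foot sphere's center (what foot_positions reads)
    peak_cmd        commanded peak clearance of the foot's lowest point
    phase           triangular swing phase in [0, 1]
    desired_contact 1.0 in stance, 0.0 in swing (assumed binary here)
    """
    target = peak_cmd * phase + FOOT_RADIUS
    return (target - center_height) ** 2 * (1.0 - desired_contact)


if __name__ == "__main__":
    cmd = 0.08  # hypothetical peak clearance, in metres

    # Stance: the term is masked out regardless of the foot height.
    print(clearance_penalty(0.02, cmd, phase=0.0, desired_contact=1.0))

    # Mid-swing with the foot's lowest point at the commanded peak:
    # center is at cmd + radius, so the error is (close to) zero.
    print(clearance_penalty(cmd + FOOT_RADIUS, cmd, phase=1.0, desired_contact=0.0))

    # Mid-swing with the foot still resting on the ground: penalized.
    print(clearance_penalty(0.02, cmd, phase=1.0, desired_contact=0.0))
```

With phase equal to 0, the target reduces to the radius offset, which matches the center height of a sphere resting on flat ground. That is why the target does not need to be reset to zero at the end of the swing.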

-Gabe