NVlabs / DiffRL

[ICLR 2022] Accelerated Policy Learning with Parallel Differentiable Simulation
https://short-horizon-actor-critic.github.io/

[BUG] Initialization Velocities Scale with Distance from the Origin #1

Open zalo opened 2 years ago

zalo commented 2 years ago

@ViktorM

While writing a MuJoCo-Viewer renderer for DiffRL, I noticed that initialization velocities appear to grow the farther an actor is from the origin.

This behavior appears in the original .usd files as well, so I don't believe it's an artifact of the visualization.

https://user-images.githubusercontent.com/174475/187368724-206c2601-4180-40e8-b18b-9bda2460ad11.mp4

The early termination threshold also seems more sensitive far away from the origin.

I tested setting the stochastic initialization velocity to a constant, and the velocities were still amplified far away from the origin, so I believe this is something deeper within the fundamentals of dflex...

ViktorM commented 1 year ago

Hi @zalo,

Sorry for the late response. Great job with the MuJoCo viewer! It looks very nice. As a workaround, I can propose placing all the ants at the same start pose and displacing only their visual meshes when rendering.

@mmacklin could it be because in dflex joint_qd is not the velocity of the center of mass in the world frame, but a twist representation?

# convert the linear velocity of the torso from twist representation to the velocity of the center of mass in world frame
lin_vel = lin_vel - torch.cross(torso_pos, ang_vel, dim=-1)

zalo commented 1 year ago

Ah, interesting! That does indeed seem to fix the issue.
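
For anyone following along, here is a minimal sketch of why that conversion matters. It assumes the linear part of the twist is the velocity of the body point that coincides with the world origin (the usual spatial-velocity convention; I have not confirmed this against the dflex source). The helper name `twist_to_com_velocity` and the sample values are made up for illustration. With the same twist applied to bodies at different positions, the center-of-mass speed grows with distance from the origin:

```python
import torch


def twist_to_com_velocity(lin_vel, ang_vel, pos):
    """Hypothetical helper mirroring the snippet above.

    Assumes lin_vel is the linear part of a twist (velocity of the body
    point at the world origin) and returns the center-of-mass velocity
    in the world frame.
    """
    return lin_vel - torch.cross(pos, ang_vel, dim=-1)


# Three bodies with identical twists: zero linear part, 1 rad/s about y.
lin_vel = torch.zeros(3, 3)
ang_vel = torch.tensor([[0.0, 1.0, 0.0]]).repeat(3, 1)

# Illustrative positions at 0, 5 and 10 units from the origin along z.
pos = torch.tensor([
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0],
    [0.0, 0.0, 10.0],
])

com_vel = twist_to_com_velocity(lin_vel, ang_vel, pos)
speeds = torch.linalg.norm(com_vel, dim=-1)
print(speeds)  # center-of-mass speed of each body
```

So an initialization that writes the same random values into the twist for every environment would give the far-away actors a larger center-of-mass velocity, which is consistent with what the video shows.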

I also note that the separation of the agents is disabled when not visualizing: https://github.com/NVlabs/DiffRL/blob/main/envs/ant.py#L94-L97

Perhaps this behavior is a known issue?

zalo commented 1 year ago

Dividing initial velocities by their distance from the origin appears to normalize the velocities across the spread, even in visualization.

I tested this by appending:

/ torch.clamp_min(self.start_pos[env_ids, 2:3], 1.0)

to the end of this line, but using the magnitude is probably preferable. 👍
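
A rough sketch of the magnitude-based variant, in case it is useful. The function name `scale_initial_velocity`, the `min_dist` parameter, and the tensor shapes are assumptions for illustration and are not taken from the DiffRL code. Note that this is a heuristic rescaling, and whether the start height should be included in the distance is also an open assumption:

```python
import torch


def scale_initial_velocity(init_vel, start_pos, min_dist=1.0):
    """Hypothetical helper: divide each environment's initial velocity by
    the distance of its start position from the origin.

    init_vel:  (num_envs, 3) initial velocities
    start_pos: (num_envs, 3) start positions
    min_dist:  lower bound on the divisor, so actors near the origin
               are left unscaled instead of being amplified
    """
    dist = torch.linalg.norm(start_pos, dim=-1, keepdim=True)
    return init_vel / torch.clamp_min(dist, min_dist)


# Illustrative values: one actor at the origin, one 4 units away along z.
start_pos = torch.tensor([[0.0, 0.0, 0.0], [0.0, 0.0, 4.0]])
init_vel = torch.ones(2, 3)

scaled = scale_initial_velocity(init_vel, start_pos)
print(scaled)
```

The clamp keeps the divisor at or above `min_dist`, matching the `torch.clamp_min(..., 1.0)` in the one-liner above.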