nv-tlabs / ASE


Latent code and numAMPObsSteps #32

Closed · anonymous-pusher closed this issue 1 year ago

anonymous-pusher commented 1 year ago

Hello Jason,

I have some questions regarding the encoding of transitions in the latent space. The paper describes encoding transitions between the states at t and t+1. In practice, however, you use multiple steps for both the AMP observations and the encoder (numAMPObsSteps, 10 by default). I understand that this helps with learning complex behaviors over longer horizons; for example, the humanoid in AMP cannot learn the backflip from a transition of only 2 steps. I think there might be two issues here, though:

- With a multi-step observation, the discriminator reward depends on a window of past states rather than a single transition, so the reward is no longer Markovian.
- The latent z can be resampled during a rollout, so some observation windows will contain states generated under two different latents, and the encoder cannot recover a single z from them.

Thank you

xbpeng commented 1 year ago

Yes, you are right. By including multiple timesteps in the discriminator observation, the reward is no longer Markovian. But this doesn't seem to be that bad in practice: there doesn't appear to be a negative impact on performance (at least with relatively short histories), and the motion quality tends to be better.
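
For concreteness, here is a minimal sketch (in PyTorch, with illustrative names like `hist_buf` and `update_amp_obs`, not the actual ASE code) of maintaining a multi-step AMP observation window. Because the discriminator scores a whole window of states rather than a single (s, s') pair, the reward it induces depends on history:

```python
import torch

num_envs = 4
obs_dim = 8               # per-step AMP features (e.g. joint positions, velocities)
num_amp_obs_steps = 10    # history length; the default mentioned above

# hist_buf[:, 0] holds the newest step, hist_buf[:, -1] the oldest.
hist_buf = torch.zeros(num_envs, num_amp_obs_steps, obs_dim)

def update_amp_obs(hist_buf: torch.Tensor, cur_obs: torch.Tensor) -> torch.Tensor:
    """Shift the history by one step and write the newest observation in front."""
    hist_buf = torch.roll(hist_buf, shifts=1, dims=1)
    hist_buf[:, 0] = cur_obs
    return hist_buf

# Each env step: append the new state, then flatten the window for the disc.
cur_obs = torch.randn(num_envs, obs_dim)
hist_buf = update_amp_obs(hist_buf, cur_obs)
disc_input = hist_buf.reshape(num_envs, -1)  # (num_envs, num_amp_obs_steps * obs_dim)
```

With num_amp_obs_steps = 2 this reduces to the single-transition case from the paper; anything larger conditions the reward on extra history.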

Regarding the latents for the encoder: yes, the latent can change over the course of a rollout. This just means that it may be impossible for the encoder to correctly predict z for the timesteps around a switch. But it can still do a good job on the other timesteps, where z is fixed, and the latents are fixed for most timesteps. So on average it is not so bad.
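
A small sketch of this boundary effect (illustrative numbers, and a fixed resample interval for simplicity, whereas the actual code holds each latent for a configurable number of steps): only the observation windows that straddle a latent switch have mixed targets, and they are a small fraction when the latent is held much longer than the window:

```python
import torch

episode_len = 300
latent_steps = 150        # how long each sampled z is held (fixed here for illustration)
num_amp_obs_steps = 10

# Timestep t was generated under latent index t // latent_steps.
latent_idx = torch.arange(episode_len) // latent_steps

# A window ending at t covers steps [t - num_amp_obs_steps + 1, t].
ends = torch.arange(num_amp_obs_steps - 1, episode_len)
starts = ends - (num_amp_obs_steps - 1)

# A window is ambiguous if its first and last steps used different latents:
# no single z can explain it, so the encoder cannot predict it correctly.
mixed = latent_idx[starts] != latent_idx[ends]
print(f"{mixed.float().mean():.1%} of windows span a latent switch")  # ~3.1% here
```

So with these numbers roughly 97% of windows lie entirely within one latent segment and give the encoder clean targets, which matches the point above that on average the mislabeled boundary windows do little harm.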