NVlabs / ProtoMotions

Does the generated trajectory correspond to the motion file? #22

Closed BRoln7 closed 1 week ago

BRoln7 commented 2 weeks ago

I wonder whether, during the training phase, there is a correspondence between the trajectory and the motion from the motion file. For example, does a fast forward trajectory correspond to a running motion in the motion file?

tesslerc commented 1 week ago

For discriminative (AMP-based) methods, the behavior you should expect to see is in the style of the motions you provided in the motion file. The trajectories we sample for the path follower, or the speeds for the steering tasks, do not necessarily correspond to the data.

So if your data distribution does not match the task distribution, the motion might not look good, as the agent is required to solve tasks for which it has no demonstrations.
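
A minimal sketch of what that independence looks like in practice (hypothetical names and ranges, not the actual ProtoMotions code): the steering target is drawn from a fixed, task-defined range, with no reference to the motion file.

```python
import numpy as np

# Hypothetical sketch: steering targets come from a fixed, task-defined range,
# regardless of which speeds actually appear in the motion file.
TASK_SPEED_RANGE = (0.0, 5.0)  # m/s, set by the task config, not derived from the data

def sample_steering_target(rng: np.random.Generator):
    speed = rng.uniform(*TASK_SPEED_RANGE)
    heading = rng.uniform(-np.pi, np.pi)  # any heading, even behind the character
    return speed, heading
```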

If you add an option for matching the task specifications with those extracted from the data, I'd be more than happy to merge it into the main branch!

BRoln7 commented 1 week ago

I apologize for not making myself clear. As far as I know, in the AMP task the robot randomly samples motions from the motion file for imitation learning. However, in path-following tasks, the motions from the motion file presumably need to be combined in some structured way. My question is whether that combination is somehow related to the path being generated at the time. For example, if the current path is generated towards the back of the humanoid, should the robot first learn the motion for turning around and then learn the normal walking motion?

tesslerc commented 1 week ago

Thanks for clarifying. In all the AMP-based tasks, the task and motion are agnostic to one another. The motion data provides a prior on what the style should look like, and the task reward tells the agent what to solve. If the demonstration data is far from what the task expects, the behavior could look bad: for example, if you only provide crawling motions but the task wants the agent to run, the behavior could be unexpected. However, if you provide several examples of a human running, it will typically generate running motions in the styles you provided.

The implementation of AMP expects the stationary distribution induced by the policy to match that of the data. It doesn't tell the agent when it should reproduce which part of a motion, but rather that the behavior of the agent should be indistinguishable from the data you provide it.
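
To make the distribution-matching idea concrete, here is a minimal sketch of an AMP-style discriminator and style reward, following the least-squares formulation from the AMP paper rather than the exact ProtoMotions implementation. The discriminator only sees state transitions, never motion-clip indices or task targets, so it cannot tell the agent which part of which motion to reproduce when.

```python
import torch
import torch.nn as nn

class AMPDiscriminator(nn.Module):
    """Scores a state transition (s, s'); trained to output +1 on data, -1 on policy samples."""
    def __init__(self, obs_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s, s_next], dim=-1))

def style_reward(disc: AMPDiscriminator, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
    # Least-squares GAN reward: highest when the policy transition is indistinguishable
    # from the demonstration data, regardless of which clip (or part of a clip) it resembles.
    d = disc(s, s_next)
    return torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0).squeeze(-1)
```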

I think it could be interesting to extract the characteristics from the provided data and make sure the task only requests characteristics that aren't too far from the examples. So, if you only provided walking motions, the task won't ask the agent to run; if all motions are upright, it won't ask it to crawl; and so on.
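
A rough sketch of how that could look, assuming a hypothetical motion-library interface (`num_motions`, `get_root_positions`, `dt`) rather than the actual ProtoMotions API: measure the speeds present in the data and restrict the task's target-speed sampling to that range.

```python
import numpy as np

def speed_range_from_motions(motion_lib, percentile: float = 95.0):
    # Hypothetical motion_lib interface, for illustration only.
    speeds = []
    for motion_id in range(motion_lib.num_motions()):
        root_pos = motion_lib.get_root_positions(motion_id)      # (T, 3) root trajectory
        planar_vel = np.diff(root_pos[:, :2], axis=0) / motion_lib.dt
        speeds.append(np.linalg.norm(planar_vel, axis=-1))
    speeds = np.concatenate(speeds)
    # Use a percentile so a single fast clip doesn't stretch the limit too far.
    return float(speeds.min()), float(np.percentile(speeds, percentile))

def sample_target_speed(rng: np.random.Generator, min_speed: float, max_speed: float) -> float:
    # Replace the fixed task range with one the demonstrations actually cover.
    return float(rng.uniform(min_speed, max_speed))
```

The same idea would extend to other characteristics, e.g. checking root height in the data to decide whether crawling targets should be requested at all.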

BRoln7 commented 1 week ago

I see. Thank you for your reply!