Alescontrela / viper_rl

Using advances in generative modeling to learn reward functions from unlabeled videos.
MIT License

About the video model #7

Closed: yingchengyang closed this issue 3 months ago

yingchengyang commented 7 months ago

Thanks for such wonderful work! I'm curious about the video model. What datasets are used to train it? My understanding is that, in the DMC setting, the video model is trained on expert trajectories from all DMC tasks, like walker-walk and cheetah-run. Is that right? If so, how can the model generate videos of different tasks that share the same embodiment (like walker-stand and walker-walk)?

Thanks again!

Alescontrela commented 3 months ago

The video model samples consecutive frames, so sampled trajectories for the same embodiment will represent different tasks. One can condition on a task ID or a text prompt to sample a video of a particular task.
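To illustrate the conditioning idea, here is a minimal toy sketch of task-ID-conditioned autoregressive sampling. This is not the repo's actual model (VIPER's video prior is a learned autoregressive model over frame tokens); all names here (`TASK_IDS`, `task_embedding`, `sample_video`) are hypothetical, and the "model" is just noise around a context summary, to show where the task condition enters the sampling loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical task registry: two tasks sharing the walker embodiment.
TASK_IDS = {"walker-stand": 0, "walker-walk": 1}

def task_embedding(task_id: int, dim: int = 8) -> np.ndarray:
    """Toy deterministic embedding standing in for a learned task embedding."""
    e = np.zeros(dim)
    e[task_id % dim] = 1.0
    return e

def sample_video(task: str, num_frames: int = 4, frame_dim: int = 8) -> np.ndarray:
    """'Sample' frames autoregressively; the task embedding is prepended to
    the context so the same embodiment can produce task-specific rollouts."""
    cond = task_embedding(TASK_IDS[task], frame_dim)
    frames = [cond]  # the conditioning token is the first context element
    for _ in range(num_frames):
        context = np.mean(frames, axis=0)            # toy context summary
        next_frame = context + 0.1 * rng.standard_normal(frame_dim)
        frames.append(next_frame)
    return np.stack(frames[1:])                      # drop the conditioning token

video = sample_video("walker-walk")
print(video.shape)  # (4, 8): num_frames x frame_dim
```

The only point the sketch makes is structural: the task condition is just another context element seen at every sampling step, so swapping the task ID (or a text embedding) steers which behavior the prior generates for the same embodiment.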