
Puppet-Master

Official implementation of 'Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics'

[arXiv] [Demo] [Project] [BibTeX]

Project page: https://vgg-puppetmaster.github.io/

News

Examples

Man-Made Objects

Animals

Humans

Objaverse-Animation & Objaverse-Animation-HQ

See the data folder.

Training

We provide a minimal training script demonstrating how to use our dataset to fine-tune Stable Video Diffusion.

You can use the following command:

accelerate launch --num_processes 1 --mixed_precision fp16 train.py --config configs/train-puppet-master.yaml

To reduce memory overhead, we pre-compute and cache the VAE latents and CLIP embeddings of all rendered frames.
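As a reference, here is a minimal sketch of what such caching could look like with a diffusers-style SVD checkpoint. The checkpoint path, helper name, and on-disk format below are assumptions for illustration, not the repository's actual code:

```python
import numpy as np
import torch
from diffusers import AutoencoderKLTemporalDecoder
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

device = "cuda"
# Assumed SVD base checkpoint; substitute the one used by train.py.
ckpt = "stabilityai/stable-video-diffusion-img2vid"

vae = AutoencoderKLTemporalDecoder.from_pretrained(
    ckpt, subfolder="vae", torch_dtype=torch.float16
).to(device)
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    ckpt, subfolder="image_encoder", torch_dtype=torch.float16
).to(device)
processor = CLIPImageProcessor.from_pretrained(ckpt, subfolder="feature_extractor")

@torch.no_grad()
def cache_clip_and_latents(pil_frames, out_path):
    """pil_frames: list of PIL.Image frames of one clip (hypothetical helper)."""
    # VAE latents for every frame; the VAE expects pixels in [-1, 1].
    px = torch.stack(
        [torch.from_numpy(np.asarray(f)).permute(2, 0, 1) for f in pil_frames]
    ).to(device, torch.float16) / 127.5 - 1.0
    latents = vae.encode(px).latent_dist.sample() * vae.config.scaling_factor
    # CLIP image embedding of the conditioning (first) frame.
    clip_in = processor(images=pil_frames[0], return_tensors="pt").pixel_values
    clip_emb = image_encoder(clip_in.to(device, torch.float16)).image_embeds
    torch.save({"latents": latents.cpu(), "clip_emb": clip_emb.cpu()}, out_path)
```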

Note that this is only a working example: our final model is trained on a combined dataset of Objaverse-Animation-HQ and Drag-a-Move.
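If you want to reproduce that combined setup, one straightforward option is torch's ConcatDataset, assuming both datasets yield identically structured items (e.g. as DragVideoDataset from dataset.py); the function and argument names here are hypothetical:

```python
from torch.utils.data import ConcatDataset

def build_combined_dataset(objaverse_hq_ds, drag_a_move_ds):
    # Hypothetical: both datasets are assumed to return items in the
    # same format, so they can be concatenated for joint training.
    return ConcatDataset([objaverse_hq_ds, drag_a_move_ds])
```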

Inference

We provide an interactive demo here. Check it out!

Evaluation

Our evaluation uses an unseen test set of Drag-a-Move consisting of 100 examples. The whole test set is provided in the DragAMove-test-batches folder. The test examples can be read directly from the xxxxx.pkl files and are in the same format as those loaded by DragVideoDataset, implemented in dataset.py.
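For instance, a single test batch can be loaded and inspected as follows (the file name is a placeholder for one of the xxxxx.pkl files):

```python
import pickle

# Replace the placeholder with an actual file from DragAMove-test-batches/.
with open("DragAMove-test-batches/xxxxx.pkl", "rb") as f:
    batch = pickle.load(f)

# Items follow the same format as those produced by DragVideoDataset (dataset.py).
print(type(batch))
```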

TODO

Citation

@article{li2024puppetmaster,
  title     = {Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics},
  author    = {Li, Ruining and Zheng, Chuanxia and Rupprecht, Christian and Vedaldi, Andrea},
  journal   = {arXiv preprint arXiv:2408.04631},
  year      = {2024}
}