huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

MotionMaster: Training-free Camera Motion Transfer For Video Generation #7864

Open clarencechen opened 4 months ago

clarencechen commented 4 months ago

Model/Pipeline/Scheduler description

Most existing camera motion control methods for video generation with denoising diffusion models rely on training a temporal camera module, which requires substantial computational resources because of the large number of parameters in video generation models.

MotionMaster is a novel training-free video motion transfer model. Its authors first disentangle the camera and object motion embeddings extracted from temporal attention maps during the DDIM inversion of the source video(s), and then transfer the extracted camera motion to new videos through two methods:

Finally, the authors demonstrate the linearity and spatial-token decomposability of the latent space of camera motion features formed by the extracted temporal attention maps, enabling further flexibility in combining and altering camera motion features before injection into target videos.
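To make the linearity and spatial-token decomposability claim concrete, here is a minimal NumPy sketch. It is not taken from the MotionMaster repository: the array layout (one frames x frames temporal attention matrix per spatial token) and the helper names `random_attention`, `blend_motions` and `compose_by_region` are assumptions made for illustration, and random matrices stand in for features that would really come from DDIM inversion.

```python
import numpy as np


def random_attention(h, w, frames, rng):
    """Hypothetical stand-in for an extracted camera motion feature:
    one row-stochastic (frames x frames) attention matrix per spatial token."""
    logits = rng.normal(size=(h, w, frames, frames))
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)


def blend_motions(maps, weights):
    """Linearity (assumed form): a convex combination of motion features.
    Weights are normalised so every row still sums to 1."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack(maps, axis=0)          # (n, H, W, F, F)
    return np.tensordot(weights, stacked, axes=1)  # (H, W, F, F)


def compose_by_region(map_a, map_b, mask):
    """Spatial-token decomposition (assumed form): use map_a where the
    boolean (H, W) mask is True and map_b everywhere else."""
    return np.where(mask[..., None, None], map_a, map_b)


rng = np.random.default_rng(0)
zoom = random_attention(4, 4, 8, rng)   # placeholder for a "zoom" feature
pan = random_attention(4, 4, 8, rng)    # placeholder for a "pan" feature

# Blend two motions, e.g. mostly zoom with some pan.
blended = blend_motions([zoom, pan], [0.7, 0.3])

# Apply zoom to the left half of the token grid and pan to the right half.
left_half = np.zeros((4, 4), dtype=bool)
left_half[:, :2] = True
mixed = compose_by_region(zoom, pan, left_half)
```

In a real pipeline the blended or region-composed features would be injected into the temporal attention layers of the target video's denoising pass. How that injection is done is specific to the MotionMaster code and is not shown here.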

Open source status

Provide useful links for the implementation

GitHub: https://github.com/sjtuplayer/MotionMaster

Paper: https://arxiv.org/pdf/2404.15789

Project Website: https://sjtuplayer.github.io/projects/MotionMaster/

Main author: @sjtuplayer

github-actions[bot] commented 4 days ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.