[CVPR 2024] | LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation
This repository is the official implementation of LAMP.
LAMP: Learn A Motion Pattern for Few-Shot Video Generation
Ruiqi Wu, Liangyu Chen, Tong Yang, Chunle Guo, Chongyi Li, Xiangyu Zhang
( * indicates corresponding author)
[Arxiv Paper] [Website Page] [Google Drive] [Baidu Disk (pwd: ffsp)] [Colab Notebook]
:rocket: LAMP is a few-shot-based method for text-to-video generation. You only need 8~16 videos and 1 GPU (> 15 GB VRAM) for training! Then you can generate videos with the learned motion pattern.
# clone the repo
git clone https://github.com/RQ-Wu/LAMP.git
cd LAMP
# create virtual environment
conda create -n LAMP python=3.8
conda activate LAMP
# install packages
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
pip install xformers==0.0.13
You can download pre-trained T2I diffusion models on Hugging Face.
In our work, we use Stable Diffusion v1.4 as our backbone network. Clone the pretrained weights with git-lfs and put them in ./checkpoints
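The download step can be sketched as follows. The clone commands are shown as comments because they need network access and git-lfs; the target folder name is taken from the `--pretrain_weight` path used in the inference example. `check_sd_checkpoint` is a hypothetical helper (not part of this repository) that only checks for the sub-folders a diffusers-format Stable Diffusion v1.4 checkpoint normally contains.

```shell
# Clone commands (run once, requires git-lfs and network access):
#   git lfs install
#   git clone https://huggingface.co/CompVis/stable-diffusion-v1-4 ./checkpoints/stable-diffusion-v1-4

# Hypothetical helper: report which expected pieces of the checkpoint are missing.
# Returns 0 when everything is present, 1 otherwise.
check_sd_checkpoint() {
    local root="${1:-./checkpoints/stable-diffusion-v1-4}"
    local missing=0
    local sub
    for sub in unet vae text_encoder tokenizer scheduler; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing directory: $root/$sub"
            missing=1
        fi
    done
    if [ ! -f "$root/model_index.json" ]; then
        echo "missing file: $root/model_index.json"
        missing=1
    fi
    return "$missing"
}
```

Calling `check_sd_checkpoint` with no argument inspects `./checkpoints/stable-diffusion-v1-4`; pass another path if the weights live elsewhere.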
Our checkpoints and training data are listed below. You can also collect video data on your own (suggested websites: pexels, frozen-in-time) and put the .mp4 files in ./training_videos/[motion_name]/
[Update] You can find the training video for the video editing demo in assets/run.mp4
Motion Name | Checkpoint Link | Training data |
Birds fly | Baidu Disk (pwd: jj0o) | Baidu Disk (pwd: w96b) |
Firework | Baidu Disk (pwd: wj1p) | Baidu Disk (pwd: oamp) |
Helicopter | Baidu Disk (pwd: egpe) | Baidu Disk (pwd: t4ba) |
Horse run | Baidu Disk (pwd: 19ld) | Baidu Disk (pwd: mte7) |
Play the guitar | Baidu Disk (pwd: l4dw) | Baidu Disk (pwd: js26) |
Rain | Baidu Disk (pwd: jomu) | Baidu Disk (pwd: 31ug) |
Turn to smile | Baidu Disk (pwd: 2bkl) | Baidu Disk (pwd: l984) |
Waterfall | Baidu Disk (pwd: vpkk) | Baidu Disk (pwd: 2edp) |
All | Baidu Disk (pwd: ifsm) | Baidu Disk (pwd: 2i2k) |
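When collecting your own clips, a quick check of the folder layout can catch mistakes before training starts. The sketch below counts the .mp4 files in one motion folder and compares the count with the 8~16 range mentioned above. `count_clips` and `check_motion_dir` are hypothetical helpers, not part of this repository, and the range is treated only as a guideline.

```shell
# Hypothetical helper: count the .mp4 files directly inside one motion folder.
count_clips() {
    find "$1" -maxdepth 1 -type f -name '*.mp4' | wc -l | tr -d ' '
}

# Hypothetical helper: check one ./training_videos/[motion_name]/ folder.
# Returns 1 if the folder is missing, 2 if the clip count is outside 8~16.
check_motion_dir() {
    local dir="$1"
    local n
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    n="$(count_clips "$dir")"
    if [ "$n" -lt 8 ] || [ "$n" -gt 16 ]; then
        echo "warning: $dir has $n clips (8~16 suggested)"
        return 2
    fi
    echo "ok: $dir has $n clips"
}
```

For example, `check_motion_dir ./training_videos/horse_run` inspects a folder named `horse_run`; the folder name here is only an illustration.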
# Training code to learn a motion pattern
CUDA_VISIBLE_DEVICES=X accelerate launch train_lamp.py config="configs/horse-run.yaml"
# Training code for video editing (The training video can be found in assets/run.mp4)
CUDA_VISIBLE_DEVICES=X accelerate launch train_lamp.py config="configs/run.yaml"
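The `X` in the commands above is a placeholder for a GPU index. One way to choose it, sketched under the assumption that `nvidia-smi` is installed, is to pick the first GPU whose total memory exceeds the > 15 GB requirement. `pick_gpu` is a hypothetical helper, not part of this repository; it reads `index, memory.total` lines (in MiB) on stdin, and the default threshold of 15360 MiB is an interpretation of "15 GB".

```shell
# Hypothetical helper: print the index of the first GPU with enough memory.
# Expects lines such as "0, 24576" on stdin, which is the format printed by
#   nvidia-smi --query-gpu=index,memory.total --format=csv,noheader,nounits
# Exits with status 1 if no GPU passes the threshold.
pick_gpu() {
    local min_mib="${1:-15360}"   # assumed threshold: 15 GiB in MiB
    awk -F', *' -v min="$min_mib" '
        $2 + 0 > min { print $1; found = 1; exit }
        END { if (!found) exit 1 }
    '
}
```

A possible use is `CUDA_VISIBLE_DEVICES="$(nvidia-smi --query-gpu=index,memory.total --format=csv,noheader,nounits | pick_gpu)"` in front of the `accelerate launch` command.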
Here is an example command for inference
# Motion Pattern
python inference_script.py --weight ./my_weight/turn_to_smile/unet --pretrain_weight ./checkpoints/stable-diffusion-v1-4 --first_frame_path ./benchmark/turn_to_smile/head_photo_of_a_cute_girl,_comic_style.png --prompt "head photo of a cute girl, comic style, turns to smile"
# Video Editing
python inference_script.py --weight ./outputs/run/unet --pretrain_weight ./checkpoints/stable-diffusion-v1-4 --first_frame_path ./benchmark/editing/a_girl_runs_beside_a_river,_Van_Gogh_style.png --length 24 --editing
#########################################################################################################
# --weight: the path of our model
# --pretrain_weight: the path of the pre-trained model (e.g. SDv1.4)
# --first_frame_path: the path of the first frame generated by T2I model (e.g. SD-XL)
# --prompt: the input prompt, the default value is aligned with the filename of the first frame
# --output: output path, default: ./results
# --height: video height, default: 320
# --width: video width, default: 512
# --length: video length, default: 16
# --cfg: classifier-free guidance, default: 12.5
#########################################################################################################
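To run inference over a whole folder of first frames, the prompt can be built from each filename, as in the motion-pattern example where the frame name is followed by ", turns to smile". The sketch below assumes the benchmark naming convention (underscores for spaces, `.png` extension); it is an illustration, not the script's actual default logic. `prompt_from_frame` and `print_inference_cmds` are hypothetical helpers, and the commands are printed rather than executed so they can be reviewed first.

```shell
# Hypothetical helper: turn a first-frame path into a prompt string by
# dropping the directory and extension and replacing underscores with spaces.
prompt_from_frame() {
    local name
    name="$(basename "$1")"
    name="${name%.*}"
    printf '%s\n' "$name" | tr '_' ' '
}

# Hypothetical helper: print one inference command per .png in a folder.
# $1: folder of first frames, $2: path to the trained unet, $3: motion phrase
print_inference_cmds() {
    local frames_dir="$1"
    local weight="$2"
    local motion="$3"
    local frame
    for frame in "$frames_dir"/*.png; do
        [ -e "$frame" ] || continue   # nothing matched the pattern
        printf 'python inference_script.py --weight %s --pretrain_weight ./checkpoints/stable-diffusion-v1-4 --first_frame_path "%s" --prompt "%s, %s"\n' \
            "$weight" "$frame" "$(prompt_from_frame "$frame")" "$motion"
    done
}
```

For instance, `print_inference_cmds ./benchmark/turn_to_smile ./my_weight/turn_to_smile/unet "turns to smile"` prints one command per frame; piping the output to `sh` would run them.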
Horse run
A horse runs in the universe. | A horse runs on the Mars. | A horse runs on the road. |
Firework
Fireworks in desert night. | Fireworks over the mountains. | Fireworks in the night city. |
Play the guitar
GTA5 poster, a man plays the guitar. | A woman plays the guitar. | An astronaut plays the guitar, photorealistic. |
Birds fly
Birds fly in the pink sky. | Birds fly in the sky, over the sea. | Many Birds fly over a plaza. |
Origin Videos | Editing Result-1 | Editing Result-2 |
A girl in black runs on the road. | A man runs on the road. | |
A man is dancing. | A girl in white is dancing. |