This is the official implementation of Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement, accepted at CVPR 2024.
Ziyu Wang *, Yue Xu *, Cewu Lu and Yong-Lu Li
* Equal contribution
In this work, we provide the first systematic study of video distillation and introduce a taxonomy to categorize temporal compression. Our method first distills the videos into still images as static memory, and then compensates for the dynamic and motion information with a learnable dynamic memory block.
If you have any questions, please contact me (wangxiaoyi2021@sjtu.edu.cn).
Our method is a plug-and-play module.
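To illustrate the static-dynamic disentanglement idea, here is a minimal conceptual sketch (not the repo's actual classes or shapes): a distilled static memory of still images is broadcast over time, and a learnable dynamic memory block adds the motion compensation.

```python
import torch

# Hypothetical shapes for illustration only.
num_classes, ipc = 10, 1      # classes, instances (videos) per class
T, C, H, W = 8, 3, 112, 112   # frames, channels, height, width

# Static memory: one distilled still image per synthetic sample.
static_mem = torch.randn(num_classes * ipc, C, H, W, requires_grad=True)

# Dynamic memory: a learnable per-frame compensation term.
dynamic_mem = torch.randn(num_classes * ipc, T, C, H, W, requires_grad=True)

# A synthetic video = static frame broadcast over time + dynamic compensation.
synthetic_video = static_mem.unsqueeze(1) + dynamic_mem  # (N, T, C, H, W)
```

In the actual pipeline, the static memory is first learned with an image-level distillation method, then frozen or fine-tuned while the dynamic memory is optimized.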
git clone git@github.com:yuz1wan/video_distillation.git
cd video_distillation
distill_utils
├── data
│ ├── HMDB51
│ │ ├── hmdb51_splits.csv
│ │ └── jpegs_112
│ ├── Kinetics
│ │ ├── broken_videos.txt
│ │ ├── replacement
│ │ ├── short_videos.txt
│ │ ├── test
│ │ ├── test.csv
│ │ ├── train
│ │ ├── train.csv
│ │ ├── val
│ │ └── validate.csv
│ ├── SSv2
│ │ ├── frame
│ │ ├── annot_train.json
│ │ ├── annot_val.json
│ │ └── class_list.json
│ └── UCF101
│ ├── jpegs_112
│ │ ├── v_ApplyEyeMakeup_g01_c01
│ │ ├── v_ApplyEyeMakeup_g01_c02
│ │ ├── v_ApplyEyeMakeup_g01_c03
│ │ └── ...
│ ├── UCF101actions.pkl
│ ├── ucf101_splits1.csv
│ └── ucf50_splits1.csv
└── ...
Baseline.
For full-dataset training, you can use the dataloaders in distill_utils/dataset.py and the evaluate_synset function (with mode='none') in utils.py.
For the coreset selection strategies, we refer to the k-center baseline for the k-center strategy and the herding baseline for the herding strategy. Our implementation is in distill_coreset.py.
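For reference, the greedy k-center strategy repeatedly picks the sample farthest from the currently selected set. A generic NumPy sketch (not the distill_coreset.py implementation):

```python
import numpy as np

def k_center_greedy(features: np.ndarray, k: int, seed: int = 0) -> list:
    """Greedy k-center selection over an (n, d) feature matrix.

    Starts from a random point, then repeatedly adds the point with the
    largest distance to its nearest already-selected center.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    selected = [int(rng.integers(n))]
    # Distance of every point to its nearest selected center.
    dists = np.linalg.norm(features - features[selected[0]], axis=1)
    for _ in range(k - 1):
        idx = int(dists.argmax())          # farthest point from the selection
        selected.append(idx)
        dists = np.minimum(dists, np.linalg.norm(features - features[idx], axis=1))
    return selected
```

Herding instead greedily picks samples whose running mean best matches the class mean in feature space.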
Static Learning.
We use DC for static learning. You can find the DC code in this repo, and we provide code to load single-frame data in utils.py. The datasets singleUCF50, singleHMDB51, singleKinetics400, and singleSSv2 are for static learning; you can use them just like MNIST in DC.
Alternatively, you can use the static memory trained by us.
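The single-frame datasets make a video dataset behave like an image dataset. A self-contained sketch of the idea (illustrative only; the repo's singleUCF50 etc. in utils.py may differ in details):

```python
import torch
from torch.utils.data import Dataset

class SingleFrameDataset(Dataset):
    """Wraps a video dataset whose items are (clip, label) with clip of
    shape (T, C, H, W), returning a single frame per item so it can be
    consumed like an image dataset (e.g. MNIST in DC)."""

    def __init__(self, video_dataset, frame_idx: int = 0):
        self.videos = video_dataset
        self.frame_idx = frame_idx

    def __len__(self):
        return len(self.videos)

    def __getitem__(self, i):
        clip, label = self.videos[i]
        return clip[self.frame_idx], label  # (C, H, W), label
```

With such a wrapper, the standard image-distillation loop of DC runs unchanged on video data.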
Dynamic Fine-tuning.
The parameters used in our experiments are documented in detail in the supplementary material.
For DM/DM+Ours
cd sh/baseline
# bash DM.sh GPU_num Dataset Learning_rate IPC
bash DM.sh 0 miniUCF101 30 1
# for DM+Ours
cd ../s2d
# for ipc=1
bash s2d_DM_ms.sh 0,1,2,3 miniUCF101 1e-4 1e-5
# for ipc=5
bash s2d_DM_ms_5.sh 0,1,2,3 miniUCF101 1e3 1e-6
For MTT/MTT+Ours, it is necessary to first train the expert trajectories with buffer.py (refer to MTT).
cd sh/baseline
# bash buffer.sh GPU_num Dataset
bash buffer.sh 0 miniUCF101
# bash MTT.sh GPU_num Dataset Learning_rate IPC
bash MTT.sh 0 miniUCF101 1e5 1
cd ../s2d
# for ipc=1
bash s2d_MTT_ms.sh 0,1,2,3 miniUCF101 1e4 1e-3
# for ipc=5
bash s2d_MTT_ms_5.sh 0,1,2,3 miniUCF101 1e4 1e-3
This work is built upon code from prior open-source projects. We also thank the Awesome project.