Enhancement. Video editing. Pose driving, expression driving, and pose and expression driving.
Expression driving:
python run_demo.py --s_path ./data/s.mp4 \
--d_path ./data/d.mp4 \
--model_path ./checkpoints/dpe.pt \
--face exp \
--output_folder ./res
Pose driving:
python run_demo.py --s_path ./data/s.mp4 \
--d_path ./data/d.mp4 \
--model_path ./checkpoints/dpe.pt \
--face pose \
--output_folder ./res
Video driving (pose and expression):
python run_demo.py --s_path ./data/s.mp4 \
--d_path ./data/d.mp4 \
--model_path ./checkpoints/dpe.pt \
--face both \
--output_folder ./res
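The three run_demo.py invocations above differ only in the value passed to --face. As a rough sketch, a small Python wrapper can build the command for each mode. The names build_demo_command and run_demo, and the default paths (copied from the examples above), are illustrative and not part of the repository.

```python
import subprocess

# Values of --face used in the commands above.
FACE_MODES = ("exp", "pose", "both")


def build_demo_command(face, s_path="./data/s.mp4", d_path="./data/d.mp4",
                       model_path="./checkpoints/dpe.pt", output_folder="./res"):
    """Return the run_demo.py command as an argument list (hypothetical helper)."""
    if face not in FACE_MODES:
        raise ValueError(f"face must be one of {FACE_MODES}, got {face!r}")
    return [
        "python", "run_demo.py",
        "--s_path", s_path,
        "--d_path", d_path,
        "--model_path", model_path,
        "--face", face,
        "--output_folder", output_folder,
    ]


def run_demo(face, **kwargs):
    """Run one driving mode; raises CalledProcessError if the script fails."""
    subprocess.run(build_demo_command(face, **kwargs), check=True)
```

Calling run_demo("pose") from the repository root would then be equivalent to the command above that passes --face pose.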
One-shot driving:
python run_demo_single.py --s_path ./data/s.jpg \
--pose_path ./data/pose.mp4 \
--exp_path ./data/exp.mp4 \
--model_path ./checkpoints/dpe.pt \
--face both \
--output_folder ./res
Before video editing, run python crop_video.py to process the full input video.
Download the pre-trained segmentation model from here and put it in ./checkpoints.
(Optional) To enhance the result, run git clone https://github.com/TencentARC/GFPGAN, download the pre-trained enhancement model from here, and put it in ./checkpoints. You can then pass --EN to apply face enhancement to the output.
python run_demo_paste.py --s_path <cropped source video> \
--d_path <driving video> \
--box_path <txt after running crop_video.py> \
--model_path ./checkpoints/dpe.pt \
--face exp \
--output_folder ./res \
--EN
TODO
To train DPE, follow video-preprocessing to download and pre-process the VoxCeleb dataset. We use lmdb to improve I/O efficiency. (Alternatively, you can rewrite the VoxDataset class in dataset.py to load .mp4 files directly.)
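If you take the .mp4 route, the dataset first needs an index of the video files. The sketch below covers only that indexing step, using the standard library. The function name collect_video_paths and the assumption that clips may sit anywhere under the data root are illustrative; frame decoding (which needs a video library) is left out, and how VoxDataset would consume such a list depends on dataset.py.

```python
from pathlib import Path


def collect_video_paths(data_root):
    """Return a sorted list of all .mp4 files found under data_root (hypothetical helper)."""
    root = Path(data_root)
    if not root.is_dir():
        raise FileNotFoundError(f"data root not found: {root}")
    # Sorting keeps the index order stable across runs and machines.
    return sorted(p for p in root.rglob("*.mp4") if p.is_file())
```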
Train DPE from scratch:
python train.py --data_root <DATA_PATH>
(Optional) To speed up convergence, download the pre-trained LIA model, rename it to vox.pt, and resume training from it:
python train.py --data_root <DATA_PATH> --resume_ckpt <model_path for vox.pt>
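The rename-and-resume step can also be scripted. This is a minimal sketch, assuming the LIA checkpoint has already been downloaded to a local path. The helper names, the ./checkpoints default directory, and the choice to copy rather than move the file are illustrative assumptions; only the --data_root and --resume_ckpt flags shown above are used.

```python
import shutil
from pathlib import Path


def prepare_resume_checkpoint(downloaded_ckpt, ckpt_dir="./checkpoints"):
    """Copy the downloaded LIA checkpoint to <ckpt_dir>/vox.pt and return that path."""
    src = Path(downloaded_ckpt)
    if not src.is_file():
        raise FileNotFoundError(f"checkpoint not found: {src}")
    dst_dir = Path(ckpt_dir)  # assumed location; any directory works
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / "vox.pt"
    shutil.copyfile(src, dst)  # copy, so the original download is left untouched
    return dst


def build_train_command(data_root, resume_ckpt=None):
    """Return the train.py command as an argument list; resume_ckpt is optional."""
    cmd = ["python", "train.py", "--data_root", str(data_root)]
    if resume_ckpt is not None:
        cmd += ["--resume_ckpt", str(resume_ckpt)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root.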
If you find our work useful in your research, please consider citing:
@InProceedings{Pang_2023_CVPR,
author = {Pang, Youxin and Zhang, Yong and Quan, Weize and Fan, Yanbo and Cun, Xiaodong and Shan, Ying and Yan, Dong-Ming},
title = {DPE: Disentanglement of Pose and Expression for General Video Portrait Editing},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
pages = {427-436}
}
Part of the code is adapted from LIA, PIRenderer, and STIT. We thank the authors for their contributions to the community.
This is not an official product of Tencent. This repository can only be used for personal/research/non-commercial purposes.