Example-Guided Style-Consistent Image Synthesis from Semantic Labeling
Miao Wang1, Guo-Ye Yang2, Ruilong Li2, Run-Ze Liang2, Song-Hai Zhang2, Peter M. Hall3 and Shi-Min Hu2,1
1State Key Laboratory of Virtual Reality Technology and Systems, Beihang University
2Department of Computer Science and Technology, Tsinghua University, Beijing
3University of Bath
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Task name: face
We use the real videos in the FaceForensics dataset, which contains 854 videos of reporters broadcasting news. We localize facial landmarks, crop the facial regions and resize them to 256×256. The detected facial landmarks are connected to create face sketches.
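The cropping step can be illustrated with a minimal sketch. It is not the repository's preprocessing code: the helper names `square_crop_box` and `to_output_coords` and the `margin` parameter are hypothetical. It assumes the landmarks arrive as (x, y) pixel pairs and that the crop is a square centred on them.

```python
# Illustrative sketch only; function names and the margin value are assumptions,
# not part of the pix2pixSC code base.

def square_crop_box(landmarks, margin=0.25):
    """Return (left, top, right, bottom) of a square box around the landmarks.

    The box is centred on the landmark bounding box and enlarged by `margin`
    (a fraction of the longer side) on every side.
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    cx = (min(xs) + max(xs)) / 2.0
    cy = (min(ys) + max(ys)) / 2.0
    side = max(max(xs) - min(xs), max(ys) - min(ys)) * (1.0 + 2.0 * margin)
    half = side / 2.0
    return (cx - half, cy - half, cx + half, cy + half)


def to_output_coords(point, box, size=256):
    """Map a point from source-image pixels into the resized size x size crop."""
    left, top, right, _ = box
    scale = size / (right - left)
    x, y = point
    return ((x - left) * scale, (y - top) * scale)


if __name__ == "__main__":
    pts = [(100, 100), (200, 150)]
    box = square_crop_box(pts)
    print(box)
    print(to_output_coords((150, 125), box))
```

Mapping the landmarks into crop coordinates with the same box keeps the sketch lines aligned with the resized face image.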
Task name: pose
We download 150 solo dance videos from YouTube, crop out the central body regions and resize them to 256×256. We split each video evenly along the timeline into a first part and a second part, then sample training data only from the first parts and testing data only from the second parts. The labels are created by concatenating the detection results of pre-trained DensePose and OpenPose models.
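The temporal train/test split described above can be sketched as follows. This is a minimal illustration, not the repository's data script: the helper names are hypothetical, and it assumes each video is available as an ordered list of frames and that "evenly" means a cut at the midpoint.

```python
# Illustrative sketch only; helper names are assumptions, not part of pix2pixSC.

def split_frames_by_time(frames, train_fraction=0.5):
    """Split an ordered frame list into (first part, second part)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(frames) * train_fraction)
    return frames[:cut], frames[cut:]


def build_splits(videos):
    """Collect training frames from first parts and testing frames from second parts.

    `videos` maps a video name to its ordered list of frames.
    Returns two lists of (video_name, frame) pairs.
    """
    train, test = [], []
    for name, frames in sorted(videos.items()):
        first, second = split_frames_by_time(frames)
        train.extend((name, f) for f in first)
        test.extend((name, f) for f in second)
    return train, test


if __name__ == "__main__":
    demo = {"dance_a": list(range(10)), "dance_b": list(range(5))}
    train, test = build_splits(demo)
    print(len(train), len(test))
```

Splitting along time within each video, rather than shuffling frames, keeps near-duplicate neighbouring frames from appearing in both the training and the testing set.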
Task name: scene
We use the BDD100k dataset to synthesize street view images from pixelwise semantic labels (i.e. scene parsing maps). We use the state-of-the-art scene parsing network DANet to create labels.
git clone [this project]
cd pix2pixSC
# Download datas.zip from https://drive.google.com/drive/folders/1O94UcCXONq7p2ZiPcfi-dldjREQ-GsJK or https://share.weiyun.com/5lHBkE0
unzip datas.zip
mv datas/checkpoints ./
mv datas/datasets ./
# The steps below are optional
mkdir ../FaceForensics
# Download the FaceForensics dataset to ../FaceForensics/datas
python process.py
python generate_data_face_forensics.py --source_path '../FaceForensics/out_data' --target_path './datasets/FaceForensics3/' --same_style_rate 0.3 --neighbor_size 10 --A_repeat_num 50 --copy_data
new_scripts/train_[Task name].sh
new_scripts/test_[Task name].sh
Edit inference/infer_list.txt, listing one test case per line. The outputs of each test case are written to ./results.
new_scripts/infer_face.sh
Inference code for the other tasks will come later.
If you find this useful for your research, please cite the following paper.
@InProceedings{pix2pixSC2019,
author = {Wang, Miao and Yang, Guo-Ye and Li, Ruilong and Liang, Run-Ze and Zhang, Song-Hai and Hall, Peter M. and Hu, Shi-Min},
title = {Example-Guided Style-Consistent Image Synthesis from Semantic Labeling},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}