Parser-Free Virtual Try-on via Distilling Appearance Flows, CVPR 2021

Official code for the CVPR 2021 paper 'Parser-Free Virtual Try-on via Distilling Appearance Flows'

The training code has been released.

[Paper] [Supplementary Material] [Sota website]

[Checkpoints for Test]

[Training_Data] [Test_Data]

[VGG_Model]

Our Environment

anaconda3

pytorch 1.1.0

torchvision 0.3.0

cuda 9.0

cupy 6.0.0

opencv-python 4.5.1

8 GTX1080 GPUs for training; 1 GTX1080 GPU for testing

python 3.6
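
Since reproducing the results depends on matching these versions, a quick check of the installed packages can save time. The snippet below is a minimal sketch, not part of the released code: the helper names `installed_versions` and `version_mismatches` are hypothetical, and it assumes the packages are imported as `torch`, `torchvision`, `cupy` and `cv2` and expose a `__version__` attribute. A prefix match is used so that build suffixes (for example a fourth version component on opencv-python) do not count as mismatches.

```python
import importlib

# Versions listed in the environment section above, keyed by import name.
EXPECTED_VERSIONS = {
    "torch": "1.1.0",
    "torchvision": "0.3.0",
    "cupy": "6.0.0",
    "cv2": "4.5.1",  # import name of opencv-python
}


def installed_versions(modules):
    """Return {module: version or None}; missing packages map to None."""
    found = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
            continue
        found[name] = getattr(module, "__version__", None)
    return found


def version_mismatches(expected, found):
    """List (module, expected, found) where found does not start with expected."""
    mismatches = []
    for name, wanted in expected.items():
        have = found.get(name)
        if have is None or not have.startswith(wanted):
            mismatches.append((name, wanted, have))
    return mismatches


if __name__ == "__main__":
    found = installed_versions(EXPECTED_VERSIONS)
    for name, wanted, have in version_mismatches(EXPECTED_VERSIONS, found):
        print(f"{name}: expected {wanted}, found {have}")
```

Run as a script, it prints one line per package that is missing or differs from the list above, and prints nothing when everything matches.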

Training on VITON dataset

cd PF-AFN_train
Download the VITON training set from VITON_train and put the folder "VITON_traindata" under the folder "dataset".
Download the VGG_19 model from VGG_Model and put "vgg19-dcbb9e9d.pth" under the folder "models".
First train the parser-based network PBAFN. Run scripts/train_PBAFN_stage1.sh. After the parser-based warping module is trained, run scripts/train_PBAFN_e2e.sh.
After training the parser-based network PBAFN, train the parser-free network PFAFN. Run scripts/train_PFAFN_stage1.sh. After the parser-free warping module is trained, run scripts/train_PFAFN_e2e.sh.
Following the above instructions with the provided training code, the [trained PF-AFN] achieves an FID of 9.92 on the VITON test set with test_pairs.txt (available at https://github.com/minar09/cp-vton-plus/blob/master/data/test_pairs.txt).
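
The four training stages above have to run in a fixed order, because each one builds on the result of the previous one. The following is a minimal sketch of a driver that enforces that order; it is not part of the released code. The function name `run_training` and the `dry_run` flag are hypothetical, and launching the scripts with `bash` from inside "PF-AFN_train" is an assumption about how they are meant to be invoked.

```python
import subprocess

# Order taken from the steps above: the parser-based network (PBAFN) first,
# then the parser-free network (PFAFN); warping stage before end-to-end stage.
TRAINING_SCRIPTS = [
    "scripts/train_PBAFN_stage1.sh",
    "scripts/train_PBAFN_e2e.sh",
    "scripts/train_PFAFN_stage1.sh",
    "scripts/train_PFAFN_e2e.sh",
]


def run_training(workdir="PF-AFN_train", dry_run=False):
    """Run the four stages in order and return the scripts that were launched."""
    launched = []
    for script in TRAINING_SCRIPTS:
        launched.append(script)
        if not dry_run:
            # check=True raises CalledProcessError when a stage exits non-zero,
            # so a later stage never starts after a failed earlier one.
            subprocess.run(["bash", script], cwd=workdir, check=True)
    return launched
```

Calling `run_training(dry_run=True)` only returns the planned order, which is a cheap way to confirm the sequence before committing GPU time.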

Run the demo

cd PF-AFN_test
First, download the checkpoints from checkpoints and put the folder "PFAFN" under the folder "checkpoints". The folder "checkpoints/PFAFN" should contain "warp_model_final.pth" and "gen_model_final.pth".
The "dataset" folder contains the demo images for testing: the "test_img" folder contains the person images, the "test_clothes" folder contains the clothes images, and the "test_edge" folder contains edges extracted from the clothes images with a built-in Python function (the extracted edges are saved for convenience). 'demo.txt' records the test pairs.
During testing, a person image, a clothes image and its extracted edge are fed into the network to generate the try-on image. No human parsing or human pose estimation results are needed at test time.
To test with the saved model, run test.sh and the results will be saved in the folder "results".
To reproduce our results from the saved model, your test environment should match our test environment, especially the version of cupy.
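
Before running test.sh, it can help to confirm that the files and folders named in the steps above are in place. The snippet below is a minimal sketch under stated assumptions: the helper `missing_demo_inputs` is hypothetical, and it assumes that both "checkpoints" and "dataset" sit directly under the directory passed as `root` (for example "PF-AFN_test").

```python
from pathlib import Path

# Checkpoint files and demo folders named in the steps above.
REQUIRED_FILES = [
    "checkpoints/PFAFN/warp_model_final.pth",
    "checkpoints/PFAFN/gen_model_final.pth",
]
REQUIRED_DIRS = [
    "dataset/test_img",
    "dataset/test_clothes",
    "dataset/test_edge",
]


def missing_demo_inputs(root="."):
    """Return the required files and folders that are absent under root."""
    root = Path(root)
    missing = [p for p in REQUIRED_FILES if not (root / p).is_file()]
    missing += [p for p in REQUIRED_DIRS if not (root / p).is_dir()]
    return missing
```

An empty list means the layout matches the description; any returned path points at a download or unpacking step that still needs to be done.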

Dataset

VITON contains a training set of 14,221 image pairs and a test set of 2,032 image pairs. Each pair consists of a front-view photo of a woman and a top clothing image, both at a resolution of 256 x 192. Our saved model is trained on the VITON training set and tested on the VITON test set.
To train from scratch on VITON training set, you can download VITON_train.
To test our saved model on the complete VITON test set, you can download VITON_test.
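
When working with the complete test set, the person and clothes images are matched through a pairs file such as test_pairs.txt. The snippet below is a minimal sketch of reading such a file. It assumes, without confirmation from this README, that each line holds a person image name followed by a clothes image name, separated by whitespace; the function name `parse_pairs` is hypothetical.

```python
def parse_pairs(text):
    """Parse lines of '<person image> <clothes image>' into a list of tuples."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 names, got {len(fields)}"
            )
        pairs.append((fields[0], fields[1]))
    return pairs
```

If the file covers the complete VITON test set, `len(parse_pairs(text))` would be expected to equal the 2,032 pairs stated above; a different count suggests a partial or differently formatted file.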

License

The use of this code is RESTRICTED to non-commercial research and educational purposes.

Acknowledgement

Our code is based on the implementation of "Clothflow: A flow-based model for clothed person generation" (See the citation below), including the implementation of the feature pyramid networks (FPN) and the ResUnetGenerator, and the adaptation of the cascaded structure to predict the appearance flows. If you use our code, please also cite their work as below.

Citation

If our code is helpful to your work, please cite:

@article{ge2021parser,
  title={Parser-Free Virtual Try-on via Distilling Appearance Flows},
  author={Ge, Yuying and Song, Yibing and Zhang, Ruimao and Ge, Chongjian and Liu, Wei and Luo, Ping},
  journal={arXiv preprint arXiv:2103.04559},
  year={2021}
}

@inproceedings{han2019clothflow,
  title={Clothflow: A flow-based model for clothed person generation},
  author={Han, Xintong and Hu, Xiaojun and Huang, Weilin and Scott, Matthew R},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={10471--10480},
  year={2019}
}
