minar09 / cp-vton-plus

Official implementation for "CP-VTON+: Clothing Shape and Texture Preserving Image-Based Virtual Try-On", CVPRW 2020
https://minar09.github.io/cpvtonplus/
MIT License

Bad results #26

Open snarb opened 3 years ago

snarb commented 3 years ago

Thanks for the work. I have run the modules according to the README, but most of the results after the TOM stage (the generated 'try-on' folder) contain significant artifacts. I'm attaching the first 5 generated photos here. Some are OK, but I would say most are bad. I don't understand whether this is a bug, a problem on my side, or a limitation of the approach. 000001_0 000010_0 000020_0 000028_0 000038_0

snarb commented 3 years ago

There are some weird samples in the dataset. This is the first cloth mask in the dataset; it contains two cloth items. 000001_1 Images in the image-mask folder often look like a combination of two photos. 000020_0 Is this OK?

thaithanhtuan commented 3 years ago

Please refer to other issues for more information: https://github.com/minar09/cp-vton-plus/issues/9 and this: https://github.com/minar09/cp-vton-plus/issues/8. Somebody had the same problem, and it came from the "listDir" function returning files in a different order on a different OS.

Yes, in the VITON dataset there are some weird cases; you can simply remove them from train_pair.txt
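The cross-OS ordering problem mentioned above can be fixed by sorting directory listings explicitly. A minimal sketch (the helper name `list_dir_sorted` is hypothetical, not from the repo):

```python
import os

def list_dir_sorted(path):
    """Return directory entries in a deterministic order.

    os.listdir() makes no ordering guarantee and can differ across
    OSes/filesystems, so image/cloth pairs can get misaligned
    unless the listing is sorted explicitly.
    """
    return sorted(os.listdir(path))
```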

vasujoshi111 commented 3 years ago

Hi, I was trying prediction on custom images using the pretrained models provided in the repo, and I was getting bad results. To get the keypoints I used Detectron2 and appended one extra entry like [0, 0, 0] (since I was only getting 17 keypoints). I also changed the batch size to 1 and tried grayscale images. Please help me get better results. Keypoints JSON: {"people": [{"face_keypoints": [], "hand_left_keypoints": [], "hand_right_keypoints": [], "pose_keypoints": [115.9163818359375, 41.370147705078125, 0.9661189317703247, 126.4640884399414, 31.777069091796875, 4.307704448699951, 106.80697631835938, 31.29741668701172, 2.1659820079803467, 140.367919921875, 36.5736083984375, 0.78687983751297, 95.54009246826172, 35.8541259765625, 0.6354686617851257, 159.54559326171875, 95.57103729248047, 0.15054160356521606, 74.44464874267578, 90.77449035644531, 0.27401798963546753, 163.86058044433594, 168.47842407226562, 0.11921537667512894, 69.65023040771484, 150.01174926757812, 0.04564519599080086, 140.367919921875, 234.67066955566406, 0.21882663667201996, 98.41674041748047, 194.37973022460938, 0.0308991651982069, 126.4640884399414, 215.00485229492188, 0.07232855260372162, 71.3282699584961, 212.84640502929688, 0.0809921994805336, 114.7177734375, 251.93820190429688, 0.0634264126420021, 67.01329803466797, 251.93820190429688, 0.05943075194954872, 133.41600036621094, 251.93820190429688, 0.10058610886335373, 48.554779052734375, 251.93820190429688, 0.028173452243208885, 0, 0, 0]}], "version": 1.0}

Result: 000010_0 / Cloth: 000028_1 / Image: 000010_0 (1) / Image-parse: 000010_0 / Cloth-mask: 000028_1 (1)

minar09 commented 3 years ago

Hi @vasujoshi111, is the Detectron2 joint ordering the same as OpenPose's? If not, you need to generate poses for your images with the OpenPose COCO-18 model.
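The two orderings are in fact different, and COCO-17 has no neck joint, so simply appending [0, 0, 0] is not enough. A sketch of a conversion, assuming the standard Detectron2 COCO-17 and OpenPose COCO-18 keypoint conventions (the index table should be double-checked against your versions of both tools; `coco17_to_openpose18` is a hypothetical helper):

```python
import numpy as np

# COCO-17 (Detectron2) source index for each OpenPose COCO-18 joint;
# -1 marks the neck, which COCO-17 does not have.
COCO17_TO_OPENPOSE18 = [0, -1, 6, 8, 10, 5, 7, 9, 12, 14, 16, 11, 13, 15, 2, 1, 4, 3]

def coco17_to_openpose18(kpts17):
    """kpts17: (17, 3) array of (x, y, score). Returns an (18, 3) array."""
    kpts17 = np.asarray(kpts17, dtype=float).reshape(17, 3)
    out = np.zeros((18, 3))
    for op_idx, coco_idx in enumerate(COCO17_TO_OPENPOSE18):
        if coco_idx >= 0:
            out[op_idx] = kpts17[coco_idx]
    # Synthesize the neck as the midpoint of the two shoulders,
    # with the weaker of the two shoulder scores.
    l_sh, r_sh = kpts17[5], kpts17[6]
    out[1, :2] = (l_sh[:2] + r_sh[:2]) / 2
    out[1, 2] = min(l_sh[2], r_sh[2])
    return out
```

The flattened `out.reshape(-1).tolist()` can then be written into the `pose_keypoints` field of the JSON.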

vasujoshi111 commented 3 years ago

Thank you very much @minar09, I needed to know this. I will try to convert to the required format; if that doesn't work, I will generate the poses with the OpenPose COCO-18 model.

vasujoshi111 commented 3 years ago

Hi @minar09, even with VITON data and the pretrained model, I am getting bad results. PFA result images. 000395_0 000477_0 000048_0 000183_0 000248_0

minar09 commented 3 years ago

Hi @vasujoshi111, your results are obviously not as expected. Some people previously had sorting errors, but this one looks different. Please debug a little to see whether all the inputs arrive as expected. I think the face-hair input is somehow missing from your TOM network. Hope you can solve this. Good luck.

vasujoshi111 commented 3 years ago

Thanks @minar09. I spotted the error: it is in the parsed image. Please share a link or code to get the exact parse image as in the VITON dataset. I used https://github.com/RohanBhandari/LIP_JPPNet to get the parse image, but it is not the same as the parse images in the VITON dataset.

minar09 commented 3 years ago

I think the VITON dataset segmentation was generated with this: https://github.com/Engineering-Course/LIP_SSL. Please check the original paper for more details.

vasujoshi111 commented 3 years ago

Thank you @minar09. While debugging the code, at line 132 of dataset_neck_skin_correction.py (Image.open(seg_pth)), our generated image, which looks exactly like the cp-vton data image, has 3 channels, but the cp-vton parse image has shape 256x192 in "P" mode. I converted our generated image to "P" mode and ran it; the result was not good. Changing to "L" mode, I got the upper neck part. Every image is the same up to neck_mask in dataset_neck_skin_correction.py (line 181), but after adding the segmentation to the neck mask I get a different image, as shown below. result After passing this image to decode_labels I am getting this one. decode How can I fix this?
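For context on the "P" vs "L" mode confusion: VITON's parse files store one label index per pixel plus a color palette, so converting them to RGB or "L" loses or remaps the labels. A minimal sketch of reading such a file as raw label indices (the helper name `load_parse_labels` is hypothetical):

```python
import numpy as np
from PIL import Image

def load_parse_labels(seg_path):
    """Load a palette-mode ("P") parse image as per-pixel label indices.

    In "P" mode each pixel is already a segmentation label (0, 1, 2, ...);
    the palette only controls how those labels are colored for display.
    """
    im = Image.open(seg_path)
    if im.mode != "P":
        raise ValueError(f"expected a palette-mode parse image, got mode {im.mode}")
    return np.array(im)  # shape (H, W), values are segment labels
```

An RGB parse image produced by another tool would first have to be mapped back to these label indices using the same palette before any of the mask logic can work on it.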

minar09 commented 3 years ago

@vasujoshi111, you don't need to run dataset_neck_skin_correction.py anymore. We already provide the fully processed dataset for download in the README. You can move directly to training/testing.

vasujoshi111 commented 3 years ago

@minar09, I am running dataset_neck_skin_correction.py line by line in order to learn how to generate the inputs for custom images. With custom images I got weird prediction results, so I am trying to understand how the image-parse and the other inputs were processed. I have taken one of the images from the VITON data and am trying to reproduce the same inputs as in the VITON dataset and get your results.