neuralchen / SimSwap

An arbitrary face-swapping framework on images and videos with one single trained model!

Why swapping with self-trained model produces weird white mask? #388

Open chenqiu1024 opened 1 year ago

chenqiu1024 commented 1 year ago

I trained a model with the following command, in which most of the parameters were copied from the opt.txt in the pretrained 'people' model folder:

    python train.py --name simswap224_test --continue_train False --which_epoch latest --batchSize 16 --gpu_ids 0 --Gdeep False --dataset vggface2_crop_arcfacealign_224 --beta1 0.5 --checkpoints_dir ./checkpoints --isTrain True --load_pretrain people --phase train --niter 10000 --niter_decay 10000 --lr 0.0002 --lambda_feat 10.0 --lambda_id 20.0 --lambda_rec 10.0 --total_step 810000

Training succeeded: it produced XXX_net_D.pth, XXX_net_G.pth, XXX_optim_D.pth and XXX_optim_G.pth files every 10000 steps, and the sample pictures look correct. However, when swapping with a command like

    python test_video_swap_multispecific.py --crop_size 224 --use_mask --which_epoch 420000 --name simswap224_test --Arc_path arcface_model/arcface_checkpoint.tar --video_path ./demo_file/multi_people_1080p.mp4 --output_path ./output/multi_test_multispecific.mp4 --temp_path ./temp_results --multisepcific_dir ./demo_file/multispecific

the resulting video shows weird white masks on every recognized face. I have no idea why this happens. Could someone help?

(attached screenshot: weird_mask)

ForrestFzy commented 1 year ago

I have the same problem, even though the output image is correct during training. When I attempt to integrate the validation code from train.py into test_wholeimage_swapsingle.py, the face becomes mosaic-like. Do you have any ideas on how to fix it?

(attached screenshot)

leeyunjai82 commented 1 year ago

same problem, too

gg22mm commented 1 year ago

(Quotes the training and test commands from the original report above.) I have the same problem. Could someone help?

Also, could you please provide a copy of vggface2_crop_arcfacealign_224.tar?

ChoYongchae commented 3 months ago

I solved it by referring to the answer in https://github.com/neuralchen/SimSwap/issues/261. To summarize the issue: the current train.py and the test code do not match, so the test code needs to be modified.

By using the same model initialization as train.py and fixing the normalize/denormalize steps, we can obtain normal results without the brightened face. Please see the fixes below.

1. Model initialization https://github.com/neuralchen/SimSwap/blob/a5f6dea67398eec9ee71e156f7ad15dbd7ce4977/predict.py#L56

        # Load the generator through the same wrapper that train.py uses,
        # so the test-time weights and architecture match the checkpoint.
        from models.projected_model import fsModel
        model = fsModel()
        model.initialize(opt)

2. Norm./De-norm. https://github.com/neuralchen/SimSwap/blob/a5f6dea67398eec9ee71e156f7ad15dbd7ce4977/predict.py#L88-L90

        # train.py feeds the generator ImageNet-normalized tensors, so the
        # test code must apply the same normalization before netG and invert
        # it afterwards; skipping either step produces the white/bright faces.
        imagenet_std    = torch.Tensor([0.229, 0.224, 0.225]).view(3,1,1).cuda()
        imagenet_mean   = torch.Tensor([0.485, 0.456, 0.406]).view(3,1,1).cuda()

        # normalize the aligned crop in place: (x - mean) / std
        b_align_crop_tenor.sub_(imagenet_mean).div_(imagenet_std)
        swap_result = model.netG(b_align_crop_tenor, latend_id)[0]
        # de-normalize the generator output: x * std + mean
        swap_result.mul_(imagenet_std).add_(imagenet_mean)
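Why this fixes the white faces can be illustrated without the SimSwap codebase: if the generator was trained on ImageNet-normalized tensors but the test script feeds it plain [0, 1] tensors (or saves the raw generator output), pixel values end up far outside the displayable range and get clipped toward white. A minimal NumPy sketch of the round trip (the `normalize`/`denormalize` helper names are illustrative, not from the repo):

```python
import numpy as np

# ImageNet channel statistics, matching the snippet above
imagenet_mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
imagenet_std  = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

def normalize(img):
    # (x - mean) / std: what the training pipeline applies before the generator
    return (img - imagenet_mean) / imagenet_std

def denormalize(img):
    # x * std + mean: the inverse, needed before saving the swapped frame
    return img * imagenet_std + imagenet_mean

# dummy channel-first image with values in [0, 1]
img = np.random.default_rng(0).random((3, 4, 4))

# the round trip is lossless, so swapped frames keep correct brightness
assert np.allclose(denormalize(normalize(img)), img)

# without denormalization, many pixels fall outside [0, 1]; clipping them
# back into range for display is what produces the washed-out faces
assert normalize(img).min() < 0.0
```

The same arithmetic is what the in-place `sub_`/`div_` and `mul_`/`add_` calls above perform on the GPU tensors.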