neuralchen / SimSwap

An arbitrary face-swapping framework on images and videos with one single trained model!

Need help in fixing a bug in the adaptation of the test code for the new method of training #295

Open · netrunner-exe opened this issue 2 years ago

netrunner-exe commented 2 years ago

Need help in fixing a bug in the adaptation of the test code for the new method of training. I almost managed to adapt the test code, but I ran into one bug that I'm really stuck on. Only this problem separates me from further adapting the test code for video, etc. The problem is the yellow and purple stripes that appear on the swapped face after processing. I also made a fork of the repository where you can see all the changes, and a Colab for a full test. As a test I used the 512 model at 390,000 iterations previously published by @mittalgovind.

Link to Colab and fork with changes https://github.com/netrunner-exe/SimSwap

Unfortunately, I have not received any response from the developers. I would be very grateful for any help. If you can really help solve this problem, you can email me at netrunner.exe@gmail.com or reply in this thread.

Inference code:

```python
# -*- coding: utf-8 -*-
# @Author: netrunner-exe
# @Date:   2022-07-01 13:45:41
# @Last Modified by:   netrunner-exe
# @Last Modified time: 2022-07-08 13:33:01
import numpy as np
import torch
from PIL import Image
from torchvision import transforms

from util.util import tensor2im


def _totensor(array):
    # HWC array in [0, 255] -> CHW float tensor in [0, 1]
    tensor = torch.from_numpy(array)
    img = tensor.transpose(0, 1).transpose(0, 2).contiguous()  # HWC -> CHW
    return img.float().div(255)


def swap_result_new_model(face_align_crop, model, latend_id):
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    img_align_crop = tensor2im(face_align_crop[0], imtype=np.uint8, normalize=False)
    img_align_crop = Image.fromarray(img_align_crop)

    # ToTensor yields (C, H, W) in [0, 1]; add the batch dimension explicitly.
    # The original view(-1, 3, size[0], size[1]) also swapped height and width,
    # since PIL's Image.size is (width, height); it only worked for square crops.
    img_tensor = transforms.ToTensor()(img_align_crop).unsqueeze(0)

    # Normalize with ImageNet statistics before feeding the generator.
    mean = torch.tensor([0.485, 0.456, 0.406]).to(device).view(1, 3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).to(device).view(1, 3, 1, 1)

    img_tensor = img_tensor.to(device, non_blocking=True)
    img_tensor = img_tensor.sub_(mean).div_(std)

    imagenet_std = torch.Tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    imagenet_mean = torch.Tensor([0.485, 0.456, 0.406]).view(3, 1, 1)

    # Run the generator without tracking gradients so .numpy() below is safe.
    with torch.no_grad():
        swap_res = model.netG(img_tensor, latend_id).cpu()

    # Undo the ImageNet normalization and go back to an HWC array in [0, 1].
    swap_res = (swap_res * imagenet_std + imagenet_mean).numpy()
    swap_res = swap_res.squeeze(0).transpose((1, 2, 0))

    swap_result = np.clip(255 * swap_res, 0, 255)
    swap_result = _totensor(swap_result).to(device)

    return swap_result
```
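
One self-contained way to confirm that the normalization round trip above is not itself the source of the stripes (a generic PyTorch/NumPy check, independent of the repo; the random image is a stand-in for a real crop):

```python
# Push a random "crop" through the same normalize -> de-normalize pipeline as
# above, with netG left out. If the round trip reproduces the input, the
# stripes must originate inside netG rather than in the tensor plumbing.
import numpy as np
import torch

img = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)  # stand-in crop

x = torch.from_numpy(img).permute(2, 0, 1).float().div(255).unsqueeze(0)

mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

x = (x - mean) / std   # what happens before netG
x = x * std + mean     # what happens after netG

out = np.clip(255 * x.squeeze(0).permute(1, 2, 0).numpy(), 0, 255)
assert np.abs(out - img).max() < 1.0  # round trip is lossless up to float error
```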

[Screenshot 2022-07-08 210953: face_PNG]

[Image: result_whole_swapsingle(2)]

netrunner-exe commented 2 years ago

> This is just a hunch, but I think you have two options.
>
> First, it's very possible that you may have to reconfigure the down- and upsampling through the networks. The relevant lines are found here, seemingly in order of operation. If you mess with the style application (just add some multiplication on one of the tensor slices), you'll see whether you have found the correct path.
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L86
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L139
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L41
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L25
>
> Personally, since the color issue isn't that big of a deal, I would just add a small post-process denoising or blur pass using OpenCV.
>
> Hope that helps!

Hi, @ExponentialML! Looks like you're right. I thought this was caused by pre-processing or post-processing, but the input has no distortion; the lines only appear in the generator output.

Input: [Image: tensor_before_swapping]

After swapping: [Image: tensor_after_swapping]

I really would not like to use OpenCV to remove these artifacts; I'd rather get a clean result at the output in the first place. Unfortunately, the effect is very clearly visible in PNG, which rules out using PNG to create uncompressed final frames for video. My level of knowledge is not enough to solve this problem on my own; the code above is the furthest I managed to get. Do you have any code fixes for this problem? I don't want to give up on this idea, but I can't solve it myself at this stage.

tamirgold commented 2 years ago

@netrunner-exe, hi, here are my pre-trained model files (512-VGGFaceHQ, 1,060,000 iters). I trained the model using the supplied train.py for two weeks on an A100 80GB (batch of 32) with Gdeep True, and now I can't seem to make it work. I just need the single-image swap for now; I would love it if you could make it work or guide me on how to resolve it: https://drive.google.com/drive/folders/1wt9CzyHpNCEvXayUA-2Bx9Xg5qg17rW5?usp=sharing

netrunner-exe commented 2 years ago

> @netrunner-exe, hi, here are my pre-trained model files (512-VGGFaceHQ, 1,060,000 iters). I trained the model using the supplied train.py for two weeks on an A100 80GB (batch of 32) with Gdeep True, and now I can't seem to make it work. I just need the single-image swap for now; I would love it if you could make it work or guide me on how to resolve it: https://drive.google.com/drive/folders/1wt9CzyHpNCEvXayUA-2Bx9Xg5qg17rW5?usp=sharing

Hi @tamirgold! Thank you so much for sharing your model! Unfortunately, I have not yet been able to solve the problem with the stripes in the image, and I have not tested your model yet. The developers recommended training the model for no more than 800,000 iterations; if you saved earlier training results, you can also try them to compare the results. I made a fork https://github.com/netrunner-exe/SimSwap for experimentation, so try it with your model!

For simplicity, create a 512_new folder in the checkpoints folder, copy 1060000_net_G.pth into it, and rename it to latest_net_G.pth.

Then everything is almost the same as in SimSwap; just add the parameter --new_model True to the command and remove --name people from it.

Example: python test_video_swapsingle.py --new_model True --crop_size 512 --use_mask --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --video_path ./demo_file/multi_people_1080p.mp4 --output_path ./output/multi_test_swapsingle.mp4 --temp_path ./temp_results

or

python test_wholeimage_swapsingle.py --new_model True --crop_size 512 --use_mask --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --pic_b_path ./demo_file/multi_people.jpg --output_path ./output

If you need to use the public beta of the 512 model, just change --new_model to False and add --name 512. To work with the public 224 model, you don't need to change anything except --crop_size 224.

netrunner-exe commented 2 years ago

I tested your model and it looks like it is overtrained. If you have any early checkpoints, please share them.

tamirgold commented 2 years ago

> I tested your model and it looks like it is overtrained. If you have any early checkpoints, please share them.

@netrunner-exe I have uploaded 750k, 800k, 850k and 900k to the same folder; let me know if it helps: https://drive.google.com/drive/folders/1wt9CzyHpNCEvXayUA-2Bx9Xg5qg17rW5?usp=sharing

tamirgold commented 2 years ago

[Images: 1, 2, result_whole_swapsingle]

Pic a is 1.jpg and pic b is 2.jpg. Any suggestions on why I am getting these results?

netrunner-exe commented 2 years ago

> I tested your model and it looks like it is overtrained. If you have any early checkpoints, please share them.

> @netrunner-exe I have uploaded 750k, 800k, 850k and 900k to the same folder; let me know if it helps: https://drive.google.com/drive/folders/1wt9CzyHpNCEvXayUA-2Bx9Xg5qg17rW5?usp=sharing

The 750,000 checkpoint doesn't work either. It seems that a working checkpoint has to be looked for among even earlier ones.

tamirgold commented 2 years ago

> @tamirgold
>
> This is a good case of overfitting. In simple terms, when you train a model for too long, either the loss decay goes negative, or the discriminator starts depending so much on the training data that it only trusts the training data, making the generator spit out only the training images. It's an expensive mistake to make (trust me, I know :-) ).
>
> The good thing is that, since you have multiple saved models from training, you can work your way from the bottom up until you find the best one (so start at the lowest saved epoch, e.g. [number]_net_[type]).
>
> It's a bit tricky because the training is based on pix2pixHD, and if you scale the batch size up by a lot you may also have to tune the losses, but in this case you should be fine.
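
For illustration, a sketch of that bottom-up sweep over saved checkpoints. The [iteration]_net_G.pth naming follows the convention quoted above; the directory path and the evaluation step are placeholders, not the repo's exact API:

```python
# Hypothetical sweep over saved generator checkpoints, lowest iteration first.
import glob
import os
import re

ckpt_dir = './checkpoints/512_new'  # placeholder path; use your save directory

def iteration_of(path):
    # '750000_net_G.pth' -> 750000
    m = re.match(r'(\d+)_net_G\.pth$', os.path.basename(path))
    return int(m.group(1)) if m else -1

for path in sorted(glob.glob(os.path.join(ckpt_dir, '*_net_G.pth')), key=iteration_of):
    print(f'Checkpoint at iteration {iteration_of(path)}: {path}')
    # For each candidate: copy it to latest_net_G.pth, run the swap test on a
    # fixed source/target pair, and keep the last checkpoint that is free of
    # overfitting artifacts.
```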

@ExponentialML and @netrunner-exe, thanks, I am starting to understand (... sometimes I win and sometimes I learn :) ). I did transfer learning from the 550k checkpoint, and for some reason I didn't save models prior to 680k. Here are the results for 550k, 660k, 680k and 700k; I can definitely see the overfitting. I did train with a larger batch than in the example (I think it was 30 images per batch), so maybe I had to tune the losses, but I didn't look that deep into it. I compare my results to a commercial website called icon8, and they seem to have much better results; I'm not sure what model they are using. Here is an example with all my models compared to icon8.

[Images: pic b (11), pic a (22)]

[Images: 550k (original model), 660k, 680k, 700k, icon8]

netrunner-exe commented 2 years ago

> This is just a hunch, but I think you have two options.
>
> First, it's very possible that you may have to reconfigure the down- and upsampling through the networks. The relevant lines are found here, seemingly in order of operation. If you mess with the style application (just add some multiplication on one of the tensor slices), you'll see whether you have found the correct path.
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L86
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L139
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L41
>
> https://github.com/neuralchen/SimSwap/blob/dd1ecdd2a718636d33977ab3097a69a0ecf080d8/models/fs_networks.py#L25
>
> Personally, since the color issue isn't that big of a deal, I would just add a small post-process denoising or blur pass using OpenCV.
>
> Hope that helps!

Hi, @ExponentialML! I have a question: have you done any work on the problem with the stripes in the image? Maybe you have a more specific solution? Thanks for your advice on this issue in the post above; unfortunately, I was not able to solve it myself due to lack of knowledge.

ExponentialML commented 2 years ago

@netrunner-exe Hey. No, I haven't been able to work on the issue. I would still suggest post-processing until you can find the solution.
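
For completeness, a minimal sketch of the kind of post-processing pass suggested here, using standard OpenCV calls (the file paths are placeholders, and the filter strengths would need tuning):

```python
import cv2

img = cv2.imread('./output/result_whole_swapsingle.jpg')  # placeholder path

# Edge-preserving non-local-means denoise: arguments are (src, dst, h, hColor,
# templateWindowSize, searchWindowSize); larger h smooths more detail away.
denoised = cv2.fastNlMeansDenoisingColored(img, None, 5, 5, 7, 21)

# Alternatively, a very light blur if the stripes are high-frequency:
# denoised = cv2.GaussianBlur(img, (3, 3), 0)

cv2.imwrite('./output/result_denoised.jpg', denoised)
```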

vespersland commented 2 years ago

Hi, I get this error with your SimSwap mod version when I try to swap a video with this line: python test_video_swapsingle.py --new_model True --crop_size 512 --use_mask --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --video_path ./demo_file/multi_people_1080p.mp4 --output_path ./output/multi_test_swapsingle.mp4 --temp_path ./temp_results

test_wholeimage works well! Thanks in advance; please help.

[Screenshot 2022-07-23 at 22:24:27]

netrunner-exe commented 2 years ago

> Hi, I get this error with your SimSwap mod version when I try to swap a video with this line: python test_video_swapsingle.py --new_model True --crop_size 512 --use_mask --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --video_path ./demo_file/multi_people_1080p.mp4 --output_path ./output/multi_test_swapsingle.mp4 --temp_path ./temp_results
>
> test_wholeimage works well! Thanks in advance; please help.
>
> [Screenshot 2022-07-23 at 22:24:27]

Hi! Fixed; everything should work fine now.

vespersland commented 2 years ago

I don't know why the eye direction always looks to the right with some sources... any suggestions? My source images are perfectly symmetrical (not looking to the side). Thanks again!

SerZhyAle commented 2 years ago

I cannot see the content folder in your fork, so I created it manually.

```python
test = Image.fromarray(np.uint8(swap_result))
test.save("/content/simswap_img_result/face_JPG.jpg")
test.save("/content/simswap_img_result/face_PNG.png")
```

For me it works only with a leading ".":

```python
test.save("./content/simswap_img_result/face_JPG.jpg")
test.save("./content/simswap_img_result/face_PNG.png")
```
netrunner-exe commented 2 years ago

> I cannot see the content folder in your fork, so I created it manually.
>
> test = Image.fromarray(np.uint8(swap_result))
> test.save("/content/simswap_img_result/face_JPG.jpg")
> test.save("/content/simswap_img_result/face_PNG.png")
>
> For me it works only with a leading ".":
>
> test.save("./content/simswap_img_result/face_JPG.jpg")
> test.save("./content/simswap_img_result/face_PNG.png")

Yes, those lines were only needed to save the final face result, to show an example of the artifacts (stripes) on the face, and were originally used for Colab. It seems that no one has been able to solve this problem, so there is no point in keeping them in the code. You can just comment them out instead of creating the folders.
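
For anyone who does want to keep that save step, a small sketch that avoids creating the folder manually by making the directory first; swap_result here is a stand-in for the real HWC array in [0, 255] from the inference code:

```python
import os

import numpy as np
from PIL import Image

# Stand-in for the real swap output: an HWC array in [0, 255], as produced by
# np.clip(255 * swap_res, 0, 255) in the inference code above.
swap_result = np.zeros((512, 512, 3))

out_dir = "./content/simswap_img_result"
os.makedirs(out_dir, exist_ok=True)  # works for relative and absolute paths alike

test = Image.fromarray(np.uint8(swap_result))
test.save(os.path.join(out_dir, "face_JPG.jpg"))
test.save(os.path.join(out_dir, "face_PNG.png"))
```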

LAFLAMIE1024 commented 2 years ago

Hello there. First, I am sorry that I currently do not have any ideas on this issue, but I have to say many thanks for the code you provided, which made my own trained 512 SimSwap work. I ran into the same problem a few days ago: I trained the 512 model for several days, and it turned out that the test code was incompatible with the newly trained model, which drove me mad. I almost figured it out by myself; however, there were always problems when I tried to run inference with the supplied training code, and I could not get a result that satisfied me. So I am here to say thank you.

By the way, if you have time, could you please share some details about how you investigated issue #292? For example, I noticed that we can get the swapped face by feeding the target image and the latent ID into SimSwap's netG. But when I save the image with cv2.imwrite('/content/test.jpg', swap_result), the image is all black and I cannot see anything. What led you to do swap_result = np.clip(255 * swap_res, 0, 255) and swap_result = img2tensor(swap_result / 255., bgr2rgb=False, float32=True) in your swap_new_model.py? I think this would help me find out what I am doing wrong. Thanks in advance!
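
On the all-black cv2.imwrite result: a likely cause (general OpenCV behavior, not specific to this repo) is saving a float array whose values lie in [0, 1]; cast toward uint8, nearly everything becomes 0, hence a black image. OpenCV also expects BGR channel order. A minimal sketch, with a random array standing in for the generator output:

```python
import cv2
import numpy as np

# Stand-in for the de-normalized generator output: float HWC array in [0, 1].
swap_res = np.random.rand(512, 512, 3)

img = np.clip(255 * swap_res, 0, 255).astype(np.uint8)  # scale to 0-255
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)              # RGB -> BGR for OpenCV
cv2.imwrite('/content/test.jpg', img)
```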

andrissa commented 2 years ago

Nice work, net. But our resulting model has the eyes looking to the right. Is that because of the training data?