neuralchen / SimSwap

An arbitrary face-swapping framework on images and videos with one single trained model!

Are there parameters for making the source facial details stronger? #14

Open ExponentialML opened 3 years ago

ExponentialML commented 3 years ago

First of all, great work. Are there parameters that can keep more of the details from the source image, or is this something that needs to be trained on? For example, sometimes key details are missing from the eyes, or there are other features (piercings, tattoos, moles, and so on) that I may want to keep. Thanks!

instant-high commented 3 years ago

You can play around with the values for the mask erode kernel and blur. In my local installation I've added a few more parameters, such as mask height and width, cut-in and duration, and detection size (640, 480, 360, 256), which make it much faster depending on the size of the face in the full video frame. I've also added a preview of the first swapped frame to decide whether to continue swapping or not...
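For anyone wanting to experiment with those two values, here is a minimal sketch of what eroding and blurring a swap mask can look like with OpenCV; the function name and kernel sizes are made up for illustration and are not SimSwap's actual defaults:

```python
import cv2
import numpy as np

def soften_mask(mask, erode_size=20, blur_size=41):
    """Shrink the mask slightly, then feather its edge.

    mask: uint8 array, 255 inside the face region, 0 outside.
    erode_size / blur_size are hypothetical values to experiment with;
    blur_size must be odd for GaussianBlur.
    """
    kernel = np.ones((erode_size, erode_size), np.uint8)
    eroded = cv2.erode(mask, kernel, iterations=1)                   # pull the edge inward
    feathered = cv2.GaussianBlur(eroded, (blur_size, blur_size), 0)  # soften the border
    return feathered.astype(np.float32) / 255.0                      # 0..1 blend weights
```

A larger erode kernel hides more of the hard seam around the swapped region, while a larger blur kernel makes the transition smoother at the cost of letting more of the target face show through.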

instant-high commented 3 years ago

But to set all the parameters, I've inserted SimSwap into my FOM GUI.

bmc84 commented 3 years ago

@bmc84 Could you share the picture in which you found a very obvious line through the chin? It will help me locate the problem, many thanks~.

Hello thank you for responding so quickly. Your project is amazing!! Thank you for everything you've released.

This isn't an example with the chin (the clip I first tested had a big beard, so that was probably causing the obvious line)... but this example shows that a bounding-box line is visible (not blurred) with masking. This example was using the ironman.jpg & use_mask = True.


NNNNAI commented 3 years ago

@bmc84 Could you send me the video so I can test it myself and find out how to fix it? My email is nicklau26@foxmail.com. Btw, can you share the command line you are using?

bmc84 commented 3 years ago

@bmc84 Could you send me the video so I can test it myself and find out how to fix it? My email is nicklau26@foxmail.com. Btw, can you share the command line you are using?

Done :) I have emailed the .mp4 and the target.jpg

The command being used was

python test_video_swapspecific.py --use_mask --pic_specific_path ./demo_file/target.jpg --isTrain false --name people --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --video_path ./demo_file/officeSpace_test.mp4 --output_path ./output/officeSpace_output.mp4 --temp_path ./temp_results

Thank you

instant-high commented 3 years ago

Having a quick look at the new reverse2original I see you've added part segmentation / a face part id list. Does that mean I can select which parts to swap, just like in motion co-segmentation (part swap)? That would be the greatest thing so far... (Unfortunately I can't try it right now.)

NNNNAI commented 3 years ago

Having a quick look at the new reverse2original I see you've added part segmentation / a face part id list. Does that mean I can select which parts to swap, just like in motion co-segmentation (part swap)? That would be the greatest thing so far... (Unfortunately I can't try it right now.)

Yes, you can do it.

NNNNAI commented 3 years ago

@bmc84 Could you send me the video so I can test it myself and find out how to fix it? My email is nicklau26@foxmail.com. Btw, can you share the command line you are using?

Done :) I have emailed the .mp4 and the target.jpg

The command being used was

python test_video_swapspecific.py --use_mask --pic_specific_path ./demo_file/target.jpg --isTrain false --name people --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --video_path ./demo_file/officeSpace_test.mp4 --output_path ./output/officeSpace_output.mp4 --temp_path ./temp_results

Thank you

The problem has been solved; make sure you get the latest version of the code. Using the mask should now give better results than not using it.

bmc84 commented 3 years ago

@bmc84 Could you send me the video so I can test it myself and find out how to fix it? My email is nicklau26@foxmail.com. Btw, can you share the command line you are using?

Done :) I have emailed the .mp4 and the target.jpg The command being used was python test_video_swapspecific.py --use_mask --pic_specific_path ./demo_file/target.jpg --isTrain false --name people --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --video_path ./demo_file/officeSpace_test.mp4 --output_path ./output/officeSpace_output.mp4 --temp_path ./temp_results Thank you

The problem has been solved; make sure you get the latest version of the code. Using the mask should now give better results than not using it.

It's definitely fixed! Looks fantastic now, great job :) Thank you.

NNNNAI commented 3 years ago

@bmc84 If you find this repo helpful, please star it. Many thanks~.

instant-high commented 3 years ago

With the latest update (07/19/2021) I got this error: FileNotFoundError: [Errno 2] No such file or directory: './parsing_model/checkpoint\79999iter.pth'

I only replaced the following scripts: ../options/base-options, train_options, test_options; ../util/reverse2original, videoswap, test_video_swapsingle; and the new folder parsing_model --> model and resnet.py.

Or do I have to reinstall all files?

NNNNAI commented 3 years ago

Check the preparation guide here: download the file and place it in ./parsing_model/checkpoint.

instant-high commented 3 years ago

Thank you, it seems to work now. Maybe I hadn't read that carefully...

instant-high commented 3 years ago

The new version's results are much better. Using the mask is of course much slower, but that doesn't matter...

Is '79999_iter.pth' the same as in zllrunning's faceparsing/facemakeup? So one could change the color of, e.g., the hair?

NNNNAI commented 3 years ago

Honestly, I am not sure whether the file from face-parsing.PyTorch is the same as zllrunning's faceparsing/facemakeup, but you can give it a try. Btw, I have updated reverse2original.py to fix a small bug, make sure you get the newest version.

instant-high commented 3 years ago

Was that small bug the one where the soft bounding box only works when using the mask? I had noticed that.

Line 12 in reverse2original.py lists the parts that are parsed when using the mask, I think.

face_part_ids = [1, 2, 3, 4, 5, 6, 10, 12, 13] if no_neck else [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14]

As far as I've figured out, every part in that line will be swapped. Is that correct? (I've made some short tests.) Removing e.g. numbers 4 and 5 will not swap the eyes, I think. So this should be the complete part list:

atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r', 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']

1 = face (skin), 2 = left brow, 3 = right brow, 4 = left eye, 5 = right eye, 6 = eye_g, 7 = left ear, 8 = right ear, 9 = ear_r, 10 = nose, 11 = mouth, 12 = upper lip, 13 = lower lip, 14 = neck, 15 = neck_l, 16 = cloth, 17 = hair, 18 = hat
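To illustrate how such a list drives the mask, here is a minimal sketch (not the repo's exact code) of turning a face-parsing map into a binary swap mask from selected part ids; the function name is made up for illustration:

```python
import numpy as np

def build_swap_mask(parsing_map, part_ids=(1, 2, 3, 4, 5, 6, 10, 12, 13)):
    """parsing_map: HxW integer array from the face-parsing model, one part id per pixel.

    Returns a float mask that is 1.0 wherever the pixel belongs to a selected part.
    """
    return np.isin(parsing_map, part_ids).astype(np.float32)

# Dropping 4 and 5 (the eyes) from part_ids would keep the target's eyes untouched:
# eye_free_mask = build_swap_mask(parsing_map, part_ids=(1, 2, 3, 6, 10, 12, 13))
```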

ExponentialML commented 3 years ago

@instant-high

Yes, you are correct.

aesanchezgh commented 3 years ago
  1. An option to perform super resolution on the face during or after inference. An example would be something like Face Sparnet.

For info, I use an optional post-processing with GPEN [1] in one of my SimSwap-Colab notebooks: gpen/SimSwap_images.ipynb

If you want to try GPEN on Colab, independently from SimSwap, I have a small Github Gist for that purpose: https://gist.github.com/woctezuma/ecfc9849fac9f8d99edd7c1d6d03758a

Based on my experiments with super-resolution, GPEN looks the best among open-source super-resolution algorithms for faces. When it works, it is great. However, one has to be careful, because it can introduce artifacts, especially if the input photos had already been processed with a super-resolution algorithm. So it really has to be optional, to avoid inadvertently stacking super-resolution processing steps, which would degrade the final result.

[1] Yang, Tao, et al. GAN Prior Embedded Network for Blind Face Restoration in the Wild. arXiv preprint arXiv:2105.06070. 2021. (code and paper)

Have you ever gotten GPEN to work on Windows? I am trying to replicate what you did locally, but running into this error: "Ninja is required to load C++ extensions" Ninja is installed, ugh

aesanchezgh commented 3 years ago

You can try to use this face parsing repo to get the mask, and blend the original image and the swapped image according to the mask. The results using the mask vs. using the bounding box are shown below. Many thanks~.

I actually started training a model a few days ago using Self Correction for Human Parsing. It's very fast, and I believe it works without adding too many modules to SimSwap's repository.

The dataset I'm using is one I found called LaPa, which would probably be best as it deals with occlusion, faces, and difficult situations.


The idea is that you would either mask it first for better alignment, or use it as a post processing step. There's another idea of using face inpainting after masking, but that's starting to get into third party plugin territory which might add complexity.

Would you like me to create a pull request when or if I get this implemented?

I am interested in this. Did you get it working?

aesanchezgh commented 3 years ago

After playing with GPEN and DFNet, GPEN gives much better quality results in my opinion, based on some tests.

Like suggested above, but I would like to focus on a local machine (Windows 10 + Anaconda). It would be really nice and useful to have an optional extra command that adds GPEN as a post-processing step:

1️⃣ GPEN should run on every frame that SimSwap generated.
2️⃣ Then SimSwap should merge everything back together, hopefully with lossless quality from the source video.

I couldn't install GPEN locally, so I could only test it via Google Colab. But if this could be merged into SimSwap as an optional command, so the user doesn't have to use GPEN (as in the current version) but can run GPEN as a post-processing step with its basic properties (scaling, generating ONLY the final GPEN output without the other files to speed things up and keep the focus on SimSwap's goal), that would be more than nice!
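As a rough illustration of that per-frame idea (not an actual SimSwap or GPEN API), a post-processing loop could look like the sketch below, where enhance_frame is a hypothetical wrapper around whatever restorer is used:

```python
import cv2

def postprocess_video(in_path, out_path, enhance_frame):
    """Run a face-restoration callable on every frame of a SimSwap output video.

    enhance_frame is a hypothetical callable (e.g. a GPEN or GFPGAN wrapper)
    that takes a BGR frame and returns an enhanced frame of the same size.
    """
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        writer.write(enhance_frame(frame))  # enhanced frame must keep the same size
    cap.release()
    writer.release()
```

Note that OpenCV drops the audio track, so the audio would still have to be muxed back in afterwards, e.g. with ffmpeg.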

Did you fail to get GPEN working on Windows?

aesanchezgh commented 3 years ago

@instant-high Your work is great, do you mind if I include your modified reverse2original.py in the next SimSwap update? I will add an acknowledgement for you in the "News" section.

Was this ever integrated?

instant-high commented 3 years ago

@cwalt2014 The soft bounding box is integrated.

woctezuma commented 3 years ago

Have you ever gotten GPEN to work on Windows? I am trying to replicate what you did locally, but running into this error: "Ninja is required to load C++ extensions" Ninja is installed, ugh

Sorry, I have never tried that. I have always used Google Colab.

osushiski commented 3 years ago

This is just my opinion, but I find GPEN is often too artificial. On the other hand, GFPGAN is more suitable. I was satisfied when I tried tg-bomze's repo below.

https://github.com/tg-bomze/collection-of-notebooks/blob/master/QuickFaceSwap.ipynb

Although it still has some unresolved issues...

aesanchezgh commented 2 years ago

You can try to use this face parsing repo to get the mask, and blend the original image and the swapped image according to the mask. The results using the mask vs. using the bounding box are shown below. Many thanks~.

I actually started training a model a few days ago using Self Correction for Human Parsing. It's very fast, and I believe it works without adding too many modules to SimSwap's repository.

The dataset I'm using is one I found called LaPa, which would probably be best as it deals with occlusion, faces, and difficult situations.


The idea is that you would either mask it first for better alignment, or use it as a post processing step. There's another idea of using face inpainting after masking, but that's starting to get into third party plugin territory which might add complexity.

Would you like me to create a pull request when or if I get this implemented?

Hey what's your email? I would like to collaborate with you on some of my edits.

ea-evdokimov commented 2 years ago

Hello everyone! First of all, thank you so much for your work. Secondly, has anyone encountered the problem where the final face is superimposed on a hand or another object in front of the face? Is there any way to improve/add segmentation? Are there ready-made solutions?

instant-high commented 2 years ago

@ea-evdokimov https://github.com/neuralchen/SimSwap/issues/14#issuecomment-882652186 Try removing 1, 11, 12, 13 from the list in reverse2original.py.
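For reference, the edited line could look roughly like this (a sketch based on the default list quoted earlier in this thread, not the repo's exact code):

```python
# Sketch only: the face_part_ids line from reverse2original.py with 1 (face/skin),
# 12 and 13 (the lips) removed; 11 (mouth) is not in the default list anyway.
face_part_ids = [2, 3, 4, 5, 6, 10] if no_neck else [2, 3, 4, 5, 6, 7, 8, 10, 14]
```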

ea-evdokimov commented 2 years ago

Thanks. Is it possible to find out which areas the numbers correspond to?

instant-high commented 2 years ago

Look here https://github.com/neuralchen/SimSwap/issues/14#issuecomment-882652186

ea-evdokimov commented 2 years ago

Thanks one more time. But is there any way to automate the overlay process when there are hands or something else in front of the face, so that those parts of the face are only transferred while nothing is blocking them? For example, by using face segmentation on each frame?

rkhilnani9 commented 2 years ago

Cool, I will try to add these modules to see if there is any improvement.

Great. Another idea, for a problem I've just realized: if the face is tilted too far, there is a lot of jitter on the swapped face. For example, if the head goes from a 0-degree angle to a 90-degree one, the facial alignment struggles a bit. To fix this, you could set a parameter by getting the rotation matrix from the top of the head to the bottom. It would be something like this in steps:

1. Detect if the head is tilted too far.

2. If the head is tilted too far, rotate the entire target frame so that it is aligned, with the chin at the bottom and the top of the head at the top.

3. Re-detect the face after the rotation, and swap.

4. Rotate the entire target video back to its regular rotation.

5. Continue inference and repeat if needed.

That should solve that issue. If this was Dlib I could probably implement this, but I'm not familiar with what you're using :).
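A minimal sketch of that rotate-swap-rotate-back idea with OpenCV; swap_frame is a hypothetical stand-in for the per-frame swap call, and the roll angle would come from whatever landmark detector is used:

```python
import cv2

def swap_with_rotation(frame, roll_degrees, swap_frame, threshold=45.0):
    """Rotate the frame upright when the head roll is extreme, swap, then rotate back.

    roll_degrees: in-plane head rotation estimated from facial landmarks.
    swap_frame:   hypothetical callable that runs the face swap on a single frame.
    """
    if abs(roll_degrees) < threshold:
        return swap_frame(frame)
    h, w = frame.shape[:2]
    center = (w / 2, h / 2)
    forward = cv2.getRotationMatrix2D(center, roll_degrees, 1.0)    # bring the head upright
    backward = cv2.getRotationMatrix2D(center, -roll_degrees, 1.0)  # undo the rotation
    upright = cv2.warpAffine(frame, forward, (w, h))
    swapped = swap_frame(upright)
    return cv2.warpAffine(swapped, backward, (w, h))
```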

Hi @ftaker887, did you get around to solving this issue? I am also facing the problem where, when the subject in the video looks down or tilts their head at extreme angles, the mask becomes distorted and it is visibly apparent.

Any leads on this would be helpful. Thanks!

rkhilnani9 commented 2 years ago

@NNNNAI Thank you for all the great work!

I am observing a problem where, when the subject in the video looks towards the right or the left, the mask distorts completely, displaying a very obvious line through the chin. Has this already been taken care of, or is it still a WIP?

ExponentialML commented 2 years ago

@rkhilnani9 Hello. I haven't been able to get around to it, but the same rules should still apply. To get around this, either mess with the mask parameters or train a model using FFHQ alignment.

rkhilnani9 commented 2 years ago

UPDATE - The subject in the video was a bearded male, hence the distortions. It is working fine with a female subject. Thanks again for the great repo!

netrunner-exe commented 2 years ago

@rkhilnani9 Hello. I haven't been able to get around to it, but the same rules should still apply. To get around this, either mess with the mask parameters or train a model using FFHQ alignment.

Hello! I have a question - does it make sense to train model 224 with the --gdeep True parameter (if I understand correctly, this is FFHQ alignment)? Will this increase the quality and detail of the face with 224 trained model?

ExponentialML commented 2 years ago

@rkhilnani9 Hello. I haven't been able to get around to it, but the same rules should still apply. To get around this, either mess with the mask parameters or train a model using FFHQ alignment.

Hello! I have a question - does it make sense to train model 224 with the --gdeep True parameter (if I understand correctly, this is FFHQ alignment)? Will this increase the quality and detail of the face with 224 trained model?

I had convergence issues when setting Gdeep to True. For some reason, it wouldn't capture head rotations properly, although I don't know if adding this parameter adds a step or two to training, but I didn't feel like spending more money to wait and see :). Disabling it made everything work fine for me on an FFHQ VGGFace2 & CelebAMask aligned dataset, both on 512 and 256.

In theory, training on a dataset with full head alignment will allow for better results than the insightface crop method used before. The reason is that the crop method used to train the old model produces the edge artifacts, which is why we needed a second masking solution such as faceParsing.

My results so far are showing better convergence than how the public model was trained.

netrunner-exe commented 2 years ago

@rkhilnani9 Hello. I haven't been able to get around to it, but the same rules should still apply. To get around this, either mess with the mask parameters or train a model using FFHQ alignment.

Hello! I have a question - does it make sense to train model 224 with the --gdeep True parameter (if I understand correctly, this is FFHQ alignment)? Will this increase the quality and detail of the face with 224 trained model?

I had convergence issues when setting Gdeep to True. For some reason, it wouldn't capture head rotations properly, although I don't know if adding this parameter adds a step or two to training, but I didn't feel like spending more money to wait and see :). Disabling it made everything work fine for me on an FFHQ VGGFace2 & CelebAMask aligned dataset, both on 512 and 256.

In theory, training on a dataset with full head alignment will allow for better results than the insightface crop method used before. The reason is that the crop method used to train the old model produces the edge artifacts, which is why we needed a second masking solution such as faceParsing.

My results so far are showing better convergence than how the public model was trained.

I completely agree with you, so before I start training I try to learn how to do it right :) By your results do you mean the model that you trained with this published code or something from your own developments and solutions?

ExponentialML commented 2 years ago

I completely agree with you, so before I start training I try to learn how to do it right :) By your results do you mean the model that you trained with this published code or something from your own developments and solutions?

With the published code. Not finished training yet, but I expect it to be done in the next few days as I'm on a batch size of 60.

netrunner-exe commented 2 years ago

I completely agree with you, so before I start training I try to learn how to do it right :) By your results do you mean the model that you trained with this published code or something from your own developments and solutions?

With the published code. Not finished training yet, but I expect it to be done in the next few days as I'm on a batch size of 60.

I train the model in Colab; the max batch size with which I managed to start training is 22 on a Tesla T4 and 17 on a K80. Please post your results after finishing training, very interested to see)

aesanchezgh commented 2 years ago

I completely agree with you, so before I start training I try to learn how to do it right :) By your results do you mean the model that you trained with this published code or something from your own developments and solutions?

With the published code. Not finished training yet, but I expect it to be done in the next few days as I'm on a batch size of 60.

Hey ftaker887, can you leave your email? I rewrote the SimSwap code and improved performance. I am making a generalized framework for multiple single-shot models. Would love to collaborate with you!

Queen9516 commented 2 years ago

Recently I have been working with SimSwap using the pretrained model. The result is good and smooth, but I want to improve it further. How can I achieve this? Should I use transfer learning on the existing model?

rongjc commented 1 year ago

@ftaker887 Wonderful~!!! But it will indeed add complexity to the original SimSwap repo by introducing a self-trained mask model. How about this: you create an individual repo named "SimSwap with more accurate mask" or something like that, and I will add a link to your repo on the SimSwap homepage when you get the function implemented. It all depends on you, looking forward to your feedback.

Sure, that seems like a good idea! It may take a bit of time since I have to make sure everything is neat and ready to use.

The model is almost ready (this is an early iteration). It's not perfect, but it works well when it wants to. Hopefully when the training is done it will be a bit better.



@ExponentialML I am encountering the same issue, do you have any insights into how you did it? I am planning to train LaPa on BiSeNet, is that what you did? Thank you very much.

bbecausereasonss commented 1 year ago

Has someone created a Windows GUI or auto-install .bat for this? :)