neuralchen / SimSwap

An arbitrary face-swapping framework on images and videos with one single trained model!

I'm curious about the data and architecture of the pretrained Arcface model you used. #270

Open kyugorithm opened 2 years ago

kyugorithm commented 2 years ago

Hello. First of all, thank you for your wonderful results. I have a question about the pretrained ArcFace model.

Consider the case where ArcFace is used as the ID embedder. Depending on the data it was trained on, swapping may not have been learned well for particular groups (race, gender, age), so I would expect swap quality to vary disproportionately across those groups.

In fact, compared with the face-swapping results for Western samples, the results for Asian samples seem to be much lower in quality. (Or, put differently, even when I use an Asian source image as input, the result looks like a Western person.)

To solve this problem, I would like to replace the model with a more recently trained ArcFace model.

I would greatly appreciate it if you could tell me which data and which model architecture (ResNet-50, ResNet-100, etc.) were used to train the model. For example, the following GitHub page shows ArcFace performance under various conditions: https://github.com/deepinsight/insightface/tree/master/recognition/arcface_torch#ms1mv3

Thank you for reading and I look forward to your cool answer!

neuralchen commented 2 years ago

Thank you for this thought-provoking issue. Can you be sure that these problems are related to the ArcFace model? In fact, most of the data in VGGFace2 is Western faces, and less than 10% is Asian faces.
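One way to put a number on the gap discussed above is to compare identity embeddings of each source face and its swapped result, grouped by demographic label. The following is a minimal sketch, not something from the SimSwap code: the function names, the record layout, and the toy 3-d vectors are all hypothetical, and in practice the embeddings would come from whichever recognition model is being evaluated. It only quantifies a difference between groups; it does not show on its own that the ID embedder is the cause.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def mean_similarity_by_group(records):
    """Average source-vs-swapped similarity per group.

    records: iterable of (group_label, source_embedding, swapped_embedding).
    """
    sims = defaultdict(list)
    for group, source_emb, swapped_emb in records:
        sims[group].append(cosine_similarity(source_emb, swapped_emb))
    return {group: sum(values) / len(values) for group, values in sims.items()}


# Toy 3-d vectors standing in for real identity embeddings (hypothetical data).
toy_records = [
    ("group_a", [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
    ("group_a", [0.0, 1.0, 0.0], [0.0, 2.0, 0.0]),
    ("group_b", [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
]
print(mean_similarity_by_group(toy_records))
```

A consistently lower mean for one group would support the observation in this issue, and repeating the measurement with a different recognition model would show whether the gap moves with the embedder.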

Fibonacci134 commented 2 years ago

Hey, I'm not sure I understand your question, but you can always replace the detection and recognition models with different ones as long as they are compatible. As for the Asian/Western face difference: it might be a visual bias. Faces with more prominent, more angular features usually make for better swaps. The way the discriminator is trained is not biased toward one or the other; it is just a similarity quota. Certain recognition models can detect certain features with higher accuracy, and those can easily be downloaded from outside sources. Perhaps you should try a more distinct and varied picture of the source and reassess the results. SimSwap sacrifices a bit of identity similarity in exchange for facial emotional variance; you will see what I mean if you compare the results to a FaceShifter model or something else. Good luck, and apologies if my answer did not satisfy your question. Be well.

usmancheema89 commented 2 years ago

Hello @neuralchen, @kyugorithm, can you let me know which FR model was used for ID retrieval? If I want to replace or retrain the model, what are the restrictions on the model's output?

kyugorithm commented 2 years ago

@usmancheema89 I'm sorry for the late reply.

There are no special restrictions. Instead of the model previously provided by the authors, we used the model distributed by insightface at the link below. (You can check the instructions in the repo.) https://onedrive.live.com/?cid=4a83b6b633b029cc&id=4A83B6B633B029CC%215577&authkey=!AFZjr283nwZHqbA

We also used models trained directly on the WebFace42M dataset.
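Even without special restrictions, a replacement embedder still has to hand the generator a vector of the size it was trained with. Here is a minimal sanity-check sketch under assumptions that this thread does not confirm: that the generator expects a fixed-length, L2-normalized identity vector, and that the length is 512 (the usual size for insightface ArcFace models). `check_embedding` and `l2_normalize` are hypothetical helper names, not part of SimSwap or insightface.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def check_embedding(vec, expected_dim=512):
    """Return an L2-normalized copy of vec, or raise if its length is unexpected.

    expected_dim=512 is an assumption; set it to whatever the generator was
    actually trained with.
    """
    if len(vec) != expected_dim:
        raise ValueError(f"expected {expected_dim} values, got {len(vec)}")
    return l2_normalize(vec)


# Toy example with a 4-d vector instead of a real embedding.
unit = check_embedding([3.0, 0.0, 4.0, 0.0], expected_dim=4)
print(unit)
```

Running a check like this on the output of the new recognition model before it is fed to the generator should catch a dimension mismatch early rather than deep inside the swap pipeline.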

usmancheema89 commented 2 years ago

@kyugorithm Thanks :)