justinpinkney / toonify


about inference performance question #1

Open Johnson-yue opened 3 years ago

Johnson-yue commented 3 years ago

Hi, I followed your Twitter and blog and did the following:

  1. fine-tuned a StyleGAN2 model on a cartoon dataset to get a cartoon model
  2. blended the two models
  3. generated ~3k paired images from the ffhq_model (fake human faces) and the cartoon_model (see the sketch after this list)
  4. trained a pix2pixHD model
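
A minimal sketch of step 3, in case it helps: run the same latent through both generators so each pair is pixel-aligned. `load_generator` here is a hypothetical stand-in for the actual checkpoint-loading code, and the truncation value is just a common choice:

```python
# Sketch of step 3: feed the SAME latent through the original FFHQ generator
# and the blended cartoon generator to get one aligned training pair.
# `load_generator` is a hypothetical helper, not a real toonify/StyleGAN2 API;
# it is assumed to return a callable producing an HxWx3 uint8 array.
import os
import numpy as np
from PIL import Image

ffhq_G = load_generator("ffhq.pkl")      # base FFHQ StyleGAN2 (pix2pixHD input side)
toon_G = load_generator("blended.pkl")   # layer-blended cartoon model (target side)

os.makedirs("datasets/toonify/train_A", exist_ok=True)
os.makedirs("datasets/toonify/train_B", exist_ok=True)

for i in range(3000):                     # ~3k pairs, as described above
    z = np.random.randn(1, 512)           # one shared latent per pair
    face = ffhq_G(z, truncation_psi=0.7)  # fake human face
    toon = toon_G(z, truncation_psi=0.7)  # toonified counterpart
    Image.fromarray(face).save(f"datasets/toonify/train_A/{i:05d}.png")
    Image.fromarray(toon).save(f"datasets/toonify/train_B/{i:05d}.png")
```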

My problem is that I trained pix2pixHD using fake faces (from the ffhq_model), but it does NOT work when a real human face is used as input. I tested with another fake face as input and it works well; the performance is good. But when I test with a real human face, the output is very bad.

Did you encounter this problem?

justinpinkney commented 3 years ago

Two questions:

  1. What parameters did you use for training pix2pixhd?
  2. Did you align the real face images before applying your trained model?
Norod commented 3 years ago

"I have test the another fake face as input, it work well, the proformance is good, but when I test it by using real human face output is very bad." - Sounds like you are indeed not properly cropping and aligning the input human face. Also 3K image pairs (6000 images total) is a very low number.

Johnson-yue commented 3 years ago

@justinpinkney
1) I trained pix2pixHD with the default parameters. 2) The real face images were aligned when I tested the trained model (which was trained on the fake-face dataset).

Johnson-yue commented 3 years ago

@Norod How many fake image pairs did you use when training pix2pixHD?

Norod commented 3 years ago

@Johnson-yue I had 18,000 pairs (36,000 images total).

Johnson-yue commented 3 years ago

@Norod Wow, that's a lot of image pairs! Did you train with the fake image pairs and test with real human face images?

In my experience:

  1. Sharp input pictures give better results than blurry ones.

  2. The generated fake faces are usually very sharp, but real face inputs are often less sharp and may contain noise, so the conversion results are very bad. I don't know whether you have encountered this problem. (A possible mitigation is sketched below.)
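
One general trick that can narrow this gap (an untested suggestion, not something described in this repo): degrade the pix2pixHD inputs (train_A) with random blur and noise during dataset preparation, so real, imperfect photos look in-domain at test time while the toon targets stay clean. A minimal sketch, with arbitrary parameter values:

```python
# Hypothetical input degradation for the train_A images only; the numbers here
# are arbitrary starting points, not tuned values.
import numpy as np
from PIL import Image, ImageFilter

def degrade(img: Image.Image) -> Image.Image:
    if np.random.rand() < 0.5:  # random gaussian blur half the time
        img = img.filter(ImageFilter.GaussianBlur(radius=np.random.uniform(0.5, 2.0)))
    arr = np.asarray(img, dtype=np.float32)
    arr += np.random.normal(0.0, np.random.uniform(0.0, 8.0), arr.shape)  # sensor-like noise
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
```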

youjin-c commented 2 years ago

Hello,

I am trying to understand the pipeline.

When training pix2pixHD, do I need to segment face features (eyes, nose, mouth, and eyebrows) in both the input and output images? Or does plain one-to-one training work? Or do we leverage the pre-trained label-to-face model?

justinpinkney commented 2 years ago

Just train pix2pix from real image to toonified image. No need for any segmentations or labels: the original StyleGAN generations should be the input, and the fine-tuned StyleGAN generations (with the same latent code) should be the output.
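
With the stock NVIDIA pix2pixHD repo, that means putting the StyleGAN generations in `train_A`, the toonified targets in `train_B`, and training with no label or instance maps (`--label_nc 0 --no_instance`, per the pix2pixHD README; the run name and dataroot below are arbitrary):

```
# train_A = original StyleGAN faces, train_B = fine-tuned/toonified outputs
python train.py --name toonify --dataroot ./datasets/toonify --label_nc 0 --no_instance
```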

youjin-c commented 2 years ago

Thank you @justinpinkney ! Let me try training. I am so excited to see the result!

youjin-c commented 1 year ago

Hello @justinpinkney, I also found that you and Doron Adler contributed to pix2style2pix. May I ask which one I should use if I want to reproduce this with my own dataset?

goldwater668 commented 1 year ago

Hello, could you let me know what is used to generate the real face and cartoon face data pairs?
