oneThousand1000 / HairMapper

(CVPR 2022) HairMapper: Removing Hair from Portraits Using GANs.

What is D_noise for? #3

Closed rainsoulsrx closed 2 years ago

rainsoulsrx commented 2 years ago

Thank you for your excellent work. I wonder what D_0 and D_noise are for, and why you generate two sets of random images?

oneThousand1000 commented 2 years ago

Hi rainsoulsrx, please refer to Sec 3.2 in the paper.

The e4e encoder maps a real image into its latent code w+. A w+ code consists of a series of latent vectors w with low variance, each close to the distribution of StyleGAN's W latent space. In order to apply our method to the latent codes of real images produced by e4e, we sample two w+ latent code sets: D0 (codes in the W latent space) and Dnoise (codes with noise added to each layer, so that they remain close to the distribution of the W latent space).
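For illustration, here is a minimal sketch of how such a pair of latent-code sets could be sampled, assuming a StyleGAN2-style generator `G` with a mapping network and per-layer w+ codes. The function name, the `noise_std` value, and the `G.mapping(z, None)` call are assumptions for this sketch, not the released code:

```python
# Minimal sketch (not the authors' exact code) of sampling the two latent-code
# sets described above, assuming a StyleGAN2-style generator `G` whose mapping
# network returns w broadcast over `num_ws` style layers.
import torch

@torch.no_grad()
def sample_latent_sets(G, n_samples, noise_std=0.1, device="cuda"):
    z = torch.randn(n_samples, G.z_dim, device=device)
    w = G.mapping(z, None)          # shape: (n_samples, num_ws, w_dim)

    # D0: plain w+ codes, every layer holds the same w vector,
    # i.e. codes that lie exactly in the W latent space.
    D0 = w.clone()

    # Dnoise: perturb each layer independently with small Gaussian noise,
    # mimicking e4e outputs -- per-layer vectors with low variance that stay
    # close to the W distribution (noise_std is an assumed value).
    Dnoise = w + noise_std * torch.randn_like(w)
    return D0, Dnoise
```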

rainsoulsrx commented 2 years ago

Thanks for your quick reply. You mean you train on both D0 and Dnoise, in order to make the model robust on both StyleGAN-generated images and real images?

rainsoulsrx commented 2 years ago

Also, I ran the code and found some failures like the following:

(image: failure examples)

Is this right?

oneThousand1000 commented 2 years ago

Thanks for your quick reply. You mean you train on both D0 and Dnoise, in order to make the model robust on both StyleGAN-generated images and real images?

Yes, you are right.

oneThousand1000 commented 2 years ago

Also, I ran the code and found some failures like the following: (image: failure examples) Is this right?

The failures are acceptable. During training, the hair in some images (about 5%) cannot be removed completely by the hair separation boundary; that is also why we use paired latent codes to train a hair mapper (these failures do not affect the training of the hair mapper).

oneThousand1000 commented 2 years ago

There are also some images in my own training dataset that still have hair; they didn't affect the training of the hair mapper. (image: examples from the training dataset)

rainsoulsrx commented 2 years ago

There are also some images in my own training dataset that still have hair; they didn't affect the training of the hair mapper. (image: examples from the training dataset)

Got it!! Thank you for your kind and detailed reply~~

rainsoulsrx commented 2 years ago

Hi, I have another question. After you get the edited image, you optimize the result using an L1 loss and a VGG loss. When computing the L1 loss you multiply by the hair mask, which I think is reasonable. But when computing the VGG loss, you do not multiply by the mask. Why?

(image: the loss computation in question)
oneThousand1000 commented 2 years ago

Hi, I have another question. After you get the edited image, you optimize the result using an L1 loss and a VGG loss. When computing the L1 loss you multiply by the hair mask, which I think is reasonable. But when computing the VGG loss, you do not multiply by the mask. Why? (image: the loss computation in question)

We use the VGG loss (perceptual loss) to penalize high-level (geometric) feature differences between x and x_rec; we want x_rec to preserve the geometric features of the bald head and face in x (the edited image). Please refer to Sec 3.4 of Coarse-to-Fine: Facial Structure Editing of Portrait Images via Latent Space Classifications, where we explain the details of the diffusion step.
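As a rough illustration of the loss combination discussed above, here is a minimal sketch in which the L1 term is restricted by the hair mask while the perceptual term is computed on the full images. The function name, the loss weights, and the use of the `lpips` package as a stand-in VGG-based perceptual loss are assumptions, not the repository's actual implementation:

```python
# Sketch only: masked L1 plus unmasked perceptual loss, as discussed above.
# Assumes x and x_rec are tensors in [-1, 1] of shape (N, 3, H, W) and
# hair_mask is a (N, 1, H, W) tensor.
import torch
import torch.nn.functional as F
import lpips

percep = lpips.LPIPS(net="vgg")  # VGG-based perceptual distance

def diffusion_loss(x_rec, x, hair_mask, w_l1=1.0, w_vgg=0.1):
    # Masked L1: only penalize pixel differences inside the masked region.
    l1 = F.l1_loss(x_rec * hair_mask, x * hair_mask)
    # Unmasked perceptual loss: compare high-level features of the whole
    # images so x_rec keeps the geometry of the bald head and face in x.
    vgg = percep(x_rec, x).mean()
    return w_l1 * l1 + w_vgg * vgg
```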

rainsoulsrx commented 2 years ago

Oh oh oh, I get it, thanks!!!