AAnoosheh / ToDayGAN

http://arxiv.org/abs/1809.09767
BSD 2-Clause "Simplified" License

confused about the code in the discriminator #17

Closed. HaoL20 closed this issue 4 years ago

HaoL20 commented 4 years ago

Hi @AAnoosheh, at https://github.com/AAnoosheh/ToDayGAN/blob/master/models/networks.py#L265 the output channel count is ndf * nf_mult + 1. I can't understand why out_channels should be increased by 1.

AAnoosheh commented 4 years ago

I custom-modified the discriminator to give me a classification decision at each layer (a "multi-scale discriminator," as I call it in the paper). The +1 is an extra channel that does not go into the next layer, but into the loss function instead.
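The idea described above can be sketched roughly as follows. This is only an illustration under assumptions, not the repository's code: the names `MultiScaleBlock` and `run_blocks`, the channel widths, the kernel size, and the choice to split the channel before the activation are all hypothetical.

```python
import torch
import torch.nn as nn


class MultiScaleBlock(nn.Module):
    """Hypothetical conv block: `out_ch` feature channels plus 1 prediction channel."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        # out_ch + 1: the extra channel is a per-layer real/fake prediction map.
        self.conv = nn.Conv2d(in_ch, out_ch + 1, kernel_size=4, stride=2, padding=1)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):
        y = self.conv(x)
        # Split off the last channel; it never reaches the next layer.
        features, prediction = y[:, :-1], y[:, -1:]
        return self.act(features), prediction


def run_blocks(blocks, x):
    """Pass x through the blocks and collect one prediction map per scale."""
    predictions = []
    for block in blocks:
        x, pred = block(x)
        predictions.append(pred)
    return x, predictions


blocks = nn.ModuleList([MultiScaleBlock(3, 64), MultiScaleBlock(64, 128)])
features, predictions = run_blocks(blocks, torch.randn(2, 3, 64, 64))
```

Each entry of `predictions` is a single-channel map at a different resolution, and a loss function could consume any subset of them while `features` carries on through the network.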

HaoL20 commented 4 years ago

Thank you for your patient answer. I got it! But I still have two questions:

  1. In line 255 and line 458, it seems that the last channel of the first conv layer is used not only for prediction but also in the training loss function. Shouldn't every prediction passed to the loss function be a single channel?

  2. I tested the gkern_2d code and found that the generated Gaussian kernel does not seem to be a standard Gaussian kernel, so I tried modifying the code like this:

         truncate = (((size - 1) / 2) - 0.5) / sigma
         mask = gaussian_filter(dirac, sigma, truncate=truncate)

     Maybe it will work well!
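The arithmetic behind the proposed truncate value can be checked by hand. SciPy's gaussian_filter uses a kernel radius of int(truncate * sigma + 0.5) with truncate defaulting to 4.0, so for a small Dirac image the default kernel is wider than the image and the boundary handling folds the tails back in. The sketch below uses only the standard library; the function names and the example values size=5, sigma=3 are assumptions chosen for illustration.

```python
import math


def kernel_radius(sigma, truncate=4.0):
    """Kernel radius, following SciPy's rule int(truncate * sigma + 0.5)."""
    return int(truncate * sigma + 0.5)


def proposed_truncate(size, sigma):
    """The truncate value suggested in the comment above."""
    return (((size - 1) / 2) - 0.5) / sigma


def gaussian_weights(sigma, radius):
    """Normalized 1D Gaussian weights sampled on [-radius, radius]."""
    raw = [math.exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]


size, sigma = 5, 3  # assumed example values
default_radius = kernel_radius(sigma)  # much wider than a 5x5 image
fitted_radius = kernel_radius(sigma, proposed_truncate(size, sigma))
weights = gaussian_weights(sigma, fitted_radius)
```

With these values the default radius is 12 while the fitted radius is 2, so the fitted kernel spans exactly the 5 samples of the image and no tail is reflected at the border.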

AAnoosheh commented 4 years ago

Ah yes, I just remembered: I don't use the output of the first conv, since it's probably not well-processed enough to make a "decision" about anything.

Regarding the Gaussian, I guess it doesn't really matter, since the RGB discriminator doesn't rely on any intrinsic properties of a true Gaussian. It just needs to see a much lower-detail version of the image, produced in pretty much any differentiable way possible.
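As a rough illustration of "a lower-detail version of the image in a differentiable way", a fixed blur kernel can be applied per channel with a depthwise convolution. This is a sketch under assumptions, not the repository's gkern_2d: the names `gaussian_kernel_2d` and `blur_rgb` and the values size=5, sigma=3.0 are hypothetical.

```python
import torch
import torch.nn.functional as F


def gaussian_kernel_2d(size=5, sigma=3.0):
    """Normalized 2D Gaussian kernel of shape (size, size)."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-0.5 * (coords / sigma) ** 2)
    g = g / g.sum()
    return g[:, None] * g[None, :]


def blur_rgb(img, size=5, sigma=3.0):
    """Blur each RGB channel independently with a depthwise convolution."""
    kernel = gaussian_kernel_2d(size, sigma).expand(3, 1, size, size).contiguous()
    # groups=3 applies one copy of the kernel to each channel separately.
    return F.conv2d(img, kernel, padding=size // 2, groups=3)


img = torch.rand(1, 3, 32, 32, requires_grad=True)
blurred = blur_rgb(img)
blurred.sum().backward()  # gradients reach the input image
```

Any other smoothing kernel would serve the same purpose here, since the only requirements stated above are that detail is reduced and that gradients can flow back through the operation.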

HaoL20 commented 4 years ago

Thank you very much!