Open miranthajayatilake opened 6 years ago
I'm trying to train the pix2pix model on the 'edges2handbags' dataset. Even before the first epoch finishes, the discriminator reaches 100% accuracy and the generated images become completely white, as shown below.

[Epoch 0/1] [Batch 2000/35321] [D loss: 0.010770, acc: 100%] [G loss: 10.253220] time: 0:11:27.193102

As I understand it, the discriminator becomes saturated and further training will have no effect. Can anyone suggest what to do?

Thanks!
I have a similar problem with a custom dataset (~100,000 training examples): the discriminator maxes out during the first epoch. The paper states: "As suggested in the original GAN paper, rather than training G to minimize log(1 − D(x, G(x, z))), we instead train to maximize log D(x, G(x, z)) [24]. In addition, we divide the objective by 2 while optimizing D, which slows down the rate at which D learns relative to G."

I'm a relative beginner with Keras and GANs, but the code doesn't look like it quite reproduces the above. (I believe the code is set up for clarity and educational value rather than for maximum performance.) In particular, I can't find where the 'dividing the objective by 2' is implemented (it seems it isn't), but it looks like we need it to slow the discriminator down. I might try writing a custom loss function that returns 0.5 * mse rather than mse (see the sketch below), though I'm not sure this is what the authors mean.

Edit: I found a comment by one of the authors on the CycleGAN repo that somewhat explains this factor of 0.5: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/issues/242

I'm not sure if there are other valid ways to weaken the discriminator, such as adding dropout. Any experts able to chip in? Many thanks. Amazing repo btw. Thanks so much Erik.
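For concreteness, here is a minimal sketch of the 0.5 * mse idea, assuming an mse discriminator loss and Adam(0.0002, 0.5) as in this repo; `halved_mse` and the `discriminator` variable are my own illustration, not code from the repo:

```python
import keras.backend as K
from keras.optimizers import Adam

def halved_mse(y_true, y_pred):
    # 0.5 * MSE: the paper's "divide the objective by 2 while optimizing D".
    # Scaling the loss scales D's gradients by the same constant factor.
    return 0.5 * K.mean(K.square(y_pred - y_true), axis=-1)

# Hypothetical usage; `discriminator` stands in for the repo's discriminator model.
discriminator.compile(loss=halved_mse,
                      optimizer=Adam(0.0002, 0.5),
                      metrics=['accuracy'])
```

An equivalent one-liner would be to keep loss='mse' and pass loss_weights=[0.5] to compile. One caveat: Adam's adaptive step size roughly cancels a constant rescaling of a single loss, so the factor of 0.5 on its own may change little in practice; lowering D's learning rate is a more direct way to slow it down.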