GANs-in-Action / gans-in-action

Companion repository to GANs in Action: Deep learning with Generative Adversarial Networks

Chapter 4 DCGAN with tf.keras could not reproduce the same results #20

Open. Nevermetyou65 opened this issue 3 years ago

Nevermetyou65 commented 3 years ago

Hi, I am reading chapter 4 of this book and there seems to be a problem. The code in the book uses standalone Keras, but I prefer tf.keras, which should not make a difference. When I implemented the chapter 4 code with tf.keras, I got strange results: the discriminator and generator losses approached 0, the accuracy went to 1, and the image grid was just noise. But when I removed the BatchNormalization layers from both the generator and the discriminator, I got fine fake digit images. Any idea why?

this is the colab containing the code https://colab.research.google.com/drive/1TF-nkPPkj0HAzKceb3UL_AzSdb-0DjKD?usp=sharing
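One tf.keras-specific detail that may be relevant here (an observation, not a confirmed diagnosis of this issue): since TensorFlow 2.0, setting trainable = False on a BatchNormalization layer also makes it run in inference mode, so a discriminator frozen inside the combined model behaves differently than under some older standalone-Keras versions. A minimal sketch of the compile pattern where this matters, assuming the chapter's build_generator and build_discriminator and its MNIST setup:

    import tensorflow as tf

    # Assumed chapter 4 setup: 28x28x1 MNIST digits, 100-dim latent vectors.
    generator = build_generator((28, 28, 1), 100)
    discriminator = build_discriminator((28, 28, 1))
    discriminator.compile(loss='binary_crossentropy', optimizer='adam',
                          metrics=['accuracy'])

    # In tf.keras this freezes the weights AND switches the discriminator's
    # BatchNormalization layers to inference mode (moving statistics are
    # used, and no longer updated, while training the combined model).
    discriminator.trainable = False
    gan = tf.keras.Sequential([generator, discriminator])
    gan.compile(loss='binary_crossentropy', optimizer='adam')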

mjzalewski commented 2 years ago

I also had the same problem. When I removed all the BatchNormalization layers from both the discriminator and the generator, the problem got even worse (the generator loss was high, and the images were just blobs).

I had success when I removed BatchNormalization from the discriminator only.

I suspect the discriminator is training too quickly relative to the generator. Removing BatchNormalization from the discriminator slows it down, bringing it closer to the generator's training rate.
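If that hypothesis is right, an alternative to deleting layers is to slow the discriminator directly by giving it a smaller learning rate than the generator. A minimal sketch, assuming the chapter's generator and discriminator models and illustrative (not book-provided) learning rates:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.optimizers import Adam

    # Assumed values for illustration: the discriminator learns 4x more
    # slowly than the generator so it cannot race ahead of it.
    discriminator.compile(loss='binary_crossentropy',
                          optimizer=Adam(learning_rate=1e-4, beta_1=0.5),
                          metrics=['accuracy'])

    discriminator.trainable = False   # freeze D inside the combined model
    gan = Sequential([generator, discriminator])
    gan.compile(loss='binary_crossentropy',
                optimizer=Adam(learning_rate=4e-4, beta_1=0.5))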

bladebump commented 2 years ago

I also had the same problem. I changed the model to use max pooling and removed BatchNormalization. It works, but not well.
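bladebump posted no code, so this is only a hypothetical reconstruction of that change: downsampling with MaxPooling2D instead of strided convolutions, with no BatchNormalization anywhere.

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import (Conv2D, MaxPooling2D, LeakyReLU,
                                         Flatten, Dense)

    def build_discriminator_maxpool(img_shape):
        model = Sequential()
        model.add(Conv2D(32, kernel_size=3, padding='same',
                         input_shape=img_shape))
        model.add(LeakyReLU(alpha=0.01))
        model.add(MaxPooling2D(pool_size=2))   # 28x28 -> 14x14

        model.add(Conv2D(64, kernel_size=3, padding='same'))
        model.add(LeakyReLU(alpha=0.01))
        model.add(MaxPooling2D(pool_size=2))   # 14x14 -> 7x7

        model.add(Flatten())
        model.add(Dense(1, activation='sigmoid'))
        return model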

marckolak commented 1 year ago

I had the same problem. I removed the BatchNormalization layers and the tanh output activation, and I added a Dropout layer in the discriminator to avoid overfitting. Here are the modified models for reference:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Dense, Reshape, Flatten, Dropout,
                                     Conv2D, Conv2DTranspose, LeakyReLU)

def build_generator(img_shape, z_dim):
    model = Sequential()
    model.add(Input(shape=(z_dim,)))

    # Project the latent vector and reshape it into a 7x7x256 feature map
    model.add(Dense(256 * 7 * 7))
    model.add(Reshape((7, 7, 256)))

    # Upsample 7x7 -> 14x14
    model.add(Conv2DTranspose(128, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    # Refine features at 14x14
    model.add(Conv2DTranspose(64, kernel_size=3, strides=1, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    # Upsample 14x14 -> 28x28; no tanh here (removed, as described above)
    model.add(Conv2DTranspose(1, kernel_size=3, strides=2, padding='same'))

    return model

def build_discriminator(img_shape):
    model = Sequential()
    model.add(Input(shape=img_shape))

    # Downsample 28x28 -> 14x14
    model.add(Conv2D(32, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    # Downsample 14x14 -> 7x7
    model.add(Conv2D(64, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    # Downsample 7x7 -> 4x4, with Dropout to curb overfitting
    model.add(Conv2D(128, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))
    model.add(Dropout(0.4))

    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    return model

The results are much better.
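A quick sanity check of these builders under the chapter's assumed MNIST setup (28x28x1 images, z_dim = 100); the summary calls just verify that the output shapes line up:

    # Assumed chapter 4 setup: 28x28 grayscale digits, 100-dim latent space.
    img_shape = (28, 28, 1)
    z_dim = 100

    generator = build_generator(img_shape, z_dim)
    discriminator = build_discriminator(img_shape)

    generator.summary()       # last layer should report (None, 28, 28, 1)
    discriminator.summary()   # last layer should report (None, 1)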

Chiuchiyin commented 2 weeks ago

It kind of worked after the changes marckolak suggested. I probably trained the model for too long and ended up with mode collapse.

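A generic way to catch this earlier (a sketch, not from the book or this thread): save a grid of generated samples every few hundred iterations and stop training once the grid stops showing diverse digits.

    import numpy as np
    import matplotlib.pyplot as plt

    def sample_images(generator, z_dim, iteration, grid=4):
        # Save a grid of generated digits so mode collapse (many
        # near-identical samples) becomes visible early in training.
        z = np.random.normal(0, 1, (grid * grid, z_dim))
        gen_imgs = generator.predict(z, verbose=0)
        fig, axes = plt.subplots(grid, grid, figsize=(4, 4))
        for ax, img in zip(axes.flat, gen_imgs):
            ax.imshow(img.squeeze(), cmap='gray')
            ax.axis('off')
        fig.savefig(f'samples_{iteration}.png')
        plt.close(fig)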