otepencelik / GAN-Artwork-Generation


Function and difference of self.linear in cGAN model compared to CAN self.discriminate #3

Open sendeniz opened 3 years ago

sendeniz commented 3 years ago

Dear otepencelik,

Regarding your modelcGAN.py, I was curious in particular about the self.linear part of the Discriminator class. Specifically, I am wondering about the function of the self.linear sequential container:

```python
self.linear = nn.Sequential(
    nn.Linear(ndf*16*4*2, ndf*16),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Linear(ndf*16, 1),
    nn.Sigmoid())
```

From the forward pass:

```python
def forward(self, input, labels):
    disc_out = self.main(input)
    linear_input = torch.cat((self.label_emb(labels), disc_out), 1)
    linear_output = self.linear(linear_input.squeeze())
    return linear_output.unsqueeze(2).unsqueeze(3)
```

I see that the output of the discriminator and the class embeddings are passed through self.linear. Does it return the class-conditioned discriminations from the discriminator using a linear function? I was wondering about this because of the naming of this container: in can_model.py, for example, a similar layer can be found that is named self.discriminate:

```python
self.discriminate = nn.Sequential(
    nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
    nn.Sigmoid())
```

This layer, however, uses a convolutional layer instead of a linear one. Could you comment on this difference in addition to my questions regarding self.linear? I would be happy to hear back from you and learn.
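For context, here is a minimal, hedged sketch of how a linear head like self.linear can condition the discriminator output on the class label. The values of ndf, num_classes, the embedding size, and the assumption that the conv features arrive flattened are all illustrative guesses, not the repository's exact configuration:

```python
import torch
import torch.nn as nn

ndf, num_classes, batch = 4, 10, 8  # small illustrative sizes, not the repo's values

# Class embedding sized so that features + embedding match the linear head's input
label_emb = nn.Embedding(num_classes, ndf * 16 * 4)
linear = nn.Sequential(
    nn.Linear(ndf * 16 * 4 * 2, ndf * 16),  # 2x: conv features concatenated with embedding
    nn.LeakyReLU(0.2, inplace=True),
    nn.Linear(ndf * 16, 1),
    nn.Sigmoid())

# Stand-in for the flattened output of the convolutional trunk (self.main)
disc_out = torch.randn(batch, ndf * 16 * 4)
labels = torch.randint(0, num_classes, (batch,))

# Concatenate conv features with the label embedding, then score with the MLP
x = torch.cat((label_emb(labels), disc_out), 1)       # (batch, ndf*16*4*2)
out = linear(x).unsqueeze(2).unsqueeze(3)             # (batch, 1, 1, 1)
print(out.shape)  # torch.Size([8, 1, 1, 1])
```

So the linear container acts as a small classifier head: it sees both what the image looks like (conv features) and which class it is supposed to be (embedding), and outputs a single real/fake probability conditioned on that class.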

Kind regards

otepencelik commented 3 years ago

Hello sendeniz,

I uploaded our project paper to the repository, in which you can find more detailed explanations of the architectures. To answer your question, from the project paper:

"Initially, we implemented CGAN architecture using fully-connected layers and did not use any convolutional layers. Using fully-connected layers, our model suffered from the mode collapse problem that is common in GANs. Mode collapse occurs when the generator produces the same output or a small set of outputs repeatedly. To counteract this, we introduced a convolutional network architecture with batch normalization. Using a convolutional network architecture improved the quality of our images and we were able to start creating images in specific styles."
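To illustrate the contrast described in the quote, here is a minimal sketch of the two discriminator styles: a fully-connected one (the initial CGAN attempt) and a convolutional one with batch normalization (the fix for mode collapse). All layer sizes here are illustrative assumptions, not the project's exact architecture:

```python
import torch
import torch.nn as nn

# Fully-connected discriminator: flattens the image, discarding spatial structure
# (illustrative sizes, not the project's actual architecture)
fc_disc = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 512), nn.LeakyReLU(0.2, inplace=True),
    nn.Linear(512, 1), nn.Sigmoid())

# Convolutional discriminator with batch normalization, DCGAN-style
conv_disc = nn.Sequential(
    nn.Conv2d(3, 64, 4, 2, 1, bias=False),                        # 64x64 -> 32x32
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, 4, 2, 1, bias=False), nn.BatchNorm2d(128),  # 32x32 -> 16x16
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(128, 1, 16, 1, 0, bias=False), nn.Sigmoid())        # 16x16 -> 1x1

x = torch.randn(2, 3, 64, 64)
print(fc_disc(x).shape)    # torch.Size([2, 1])
print(conv_disc(x).shape)  # torch.Size([2, 1, 1, 1])
```

The convolutional version preserves local spatial structure and, with batch normalization, tends to stabilize training, which is consistent with the paper's account of why it reduced mode collapse.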

Hope this helps!