ariel415el opened this issue 1 year ago
Actually the same. Since we wrote our paper from the vanilla GAN objective, where the output of the discriminator is a probability in [0, 1], we add the -0.5 in the paper for consistency. In the implementation we follow StyleGAN-ADA, where the discriminator outputs logits, which can be negative, hence we use the sign of the logits directly, without the 0.5 shift.
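Concretely, the logit-based version looks like this (a minimal sketch; `d_real_logits` is an illustrative name, not taken from the repo):

```python
import torch

# d_real_logits: raw (unbounded) discriminator outputs on diffused real
# samples, as produced by a StyleGAN-ADA-style discriminator head.
d_real_logits = torch.randn(64)  # illustrative stand-in

# With logits, a positive output already means "classified as real",
# so the sign is taken directly, without the 0.5 shift from the paper.
r_d = d_real_logits.detach().sign().mean()
```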
@Zhendong-Wang Hello again, I have a few questions here:
(1) Suppose I use `r_d` with the LSGAN loss, i.e., the discriminator D learns to regress toward 0.0 ~ 1.0 without a final activation layer (raw logits):
```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.mynet = MyNetwork()  # has NO final activation layer such as sigmoid

    def forward(self, img, t):
        batch_size, H, W = img.size(0), img.size(-2), img.size(-1)
        # Tile the scalar timestep into a constant plane the size of the image,
        # on the same device as the input.
        t = torch.ones(batch_size, 1, H, W, device=img.device) * t.view(-1, 1, 1, 1)
        x = torch.cat((img, t), dim=1)  # Is this correct?
        x = self.mynet(x)
        return x  # raw logit, without a final activation layer


loss = nn.MSELoss()  # LSGAN objective: regress D outputs toward 0/1 targets
# ... extra code omitted
real_diffused, t = diffusion(real)
d_real = discriminator(real_diffused, t)
# ... extra code omitted
d_loss_real = loss(d_real, torch.ones_like(d_real))
d_loss_fake = loss(d_fake, torch.zeros_like(d_fake))
r_d = (d_real.detach() - 0.5).sign().mean()
```
In this case, is `r_d = (d_real.detach() - 0.5).sign().mean()` correct?
(2) In the above code block, is introducing the timestep `t` by concatenation, just like conditional concatenation, okay? Or would a trainable structure, such as a fully-connected linear layer or a single convolutional layer, be required?
Thank you !! :-)
(1) If you use sigmoid as the final activation function to rescale the output to [0, 1], then you should use `r_d = (d_real.detach() - 0.5).sign().mean()`.
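The two conventions pick out the same samples, because the sigmoid is strictly increasing with sigmoid(0) = 0.5, so sign(sigmoid(x) - 0.5) = sign(x) for any logit x. A quick illustrative check:

```python
import torch

logits = torch.randn(1000)   # raw discriminator outputs
probs = torch.sigmoid(logits)  # rescaled to (0, 1)

# Thresholding probabilities at 0.5 is the same as thresholding logits at 0,
# since sigmoid is strictly increasing with sigmoid(0) = 0.5.
print(torch.equal((probs - 0.5).sign(), logits.sign()))  # True (up to float rounding)
```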
(2) We use concat and it works in our case, but I think a better architecture design could benefit more here; see the sketch below.
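For what it's worth, here is one hedged sketch of such an alternative: embedding `t` through a small trainable layer before concatenation. This is an assumption about what a "better architecture design" could look like, not the repo's code; `MyNetwork` is the placeholder backbone from the question above.

```python
import torch
import torch.nn as nn

class DiscriminatorWithTimeEmbed(nn.Module):
    """Illustrative variant: t is passed through a trainable linear layer
    instead of being tiled as a raw constant plane. Not from the repo."""

    def __init__(self, t_embed_dim: int = 8):
        super().__init__()
        self.t_embed = nn.Linear(1, t_embed_dim)  # trainable embedding of t
        self.mynet = MyNetwork()  # placeholder backbone; must accept the extra channels

    def forward(self, img, t):
        batch_size, H, W = img.size(0), img.size(-2), img.size(-1)
        # Embed the scalar timestep, then broadcast it to a spatial map.
        t_feat = self.t_embed(t.float().view(-1, 1))                    # (B, t_embed_dim)
        t_map = t_feat.view(batch_size, -1, 1, 1).expand(-1, -1, H, W)  # (B, t_embed_dim, H, W)
        x = torch.cat((img, t_map), dim=1)
        return self.mynet(x)  # raw logits
```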
Hi, in the paper you use the following formula for the adjusting variable $r_d$:

$$r_d = \mathbb{E}\big[\operatorname{sign}\big(D(y, t) - 0.5\big)\big]$$

But in the StyleGAN-ADA paper you're referring to, the formula is

$$r_t = \mathbb{E}\big[\operatorname{sign}(D_{\text{train}})\big]$$
Did you indeed refer to $r_t$ in StyleGAN-ADA?
Subtracting 0.5 from the discriminator outputs usually makes no sense, as the values are not in [0, 1]; as shown in the StyleGAN-ADA paper, they are simply symmetric around 0.
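A toy check of this point (illustrative numbers only):

```python
import torch

logits = torch.randn(10_000)  # roughly symmetric around 0, like raw D outputs

print(logits.sign().mean())          # ~0.0, as StyleGAN-ADA's r_t expects
print((logits - 0.5).sign().mean())  # ~-0.38: shifting logits by 0.5 skews the estimate
```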
Am I missing something?