Closed bcpenggh closed 6 years ago
Sorry, I don't know how to attach the file. Let me paste the relevant part of the code below:
```python
noise.resize_(batch_size, nz, 1, 1).normal_(0, 1)
noisev = Variable(noise)
fake = netG(noisev)
labelv = Variable(label.fill_(fake_label))
output = netD(fake.detach())  # Line 241
errD_fake = criterion(output, labelv)
errD_fake.backward()          # Line 243
D_G_z1 = output.data.mean()
errD = errD_real + errD_fake
optimizerD.step()
```
In other words, if the discriminator is "detached" while using fake photos, and the gradient is calculated using real photos in line 233, why do we still call "backward" in line 243?
```python
netD.zero_grad()
real_cpu, _ = data
batch_size = real_cpu.size(0)
if opt.cuda:
    real_cpu = real_cpu.cuda()
input.resize_as_(real_cpu).copy_(real_cpu)
label.resize_(batch_size).fill_(real_label)
inputv = Variable(input)
labelv = Variable(label)
output = netD(inputv)
errD_real = criterion(output, labelv)
errD_real.backward()  # Line 233
D_x = output.data.mean()
```
Hi @bcpenggh ,
`fake.detach()` means you remove the generator from the computation graph, not the discriminator. That is, the discriminator still gets gradients from the `backward()` call.
I am not sure if I got your point; does that answer your question?
Thanks
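To make this concrete, here is a minimal sketch (using tiny stand-in `nn.Linear` networks rather than the real DCGAN models, and the current PyTorch API, where tensors no longer need the `Variable` wrapper) showing that `detach()` cuts the graph on the generator side only:

```python
import torch
import torch.nn as nn

# Hypothetical tiny stand-ins for netG and netD, just to show what detach() does.
netG = nn.Linear(4, 8)
netD = nn.Linear(8, 1)

noise = torch.randn(2, 4)
fake = netG(noise)

# detach() severs the graph between netG and the loss; netD stays in the graph.
out = netD(fake.detach())
out.sum().backward()

print(netD.weight.grad is not None)  # True: the discriminator still gets gradients
print(netG.weight.grad is None)      # True: the generator gets none
```

So `backward()` in line 243 is still needed: it is what produces the discriminator's gradients for the fake batch.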
Dear Hui-Po, do you mean that line 241, "output = netD(fake.detach())", actually detaches the model "netG", which generates "fake" in line 239 ("fake = netG(noisev)"), rather than "netD", which uses "fake" to produce "output"?
Yes.
Thanks for your help
Dear TA, following up on this question: why don't we detach the discriminator when training the generator?
```python
############################
# (2) Update G network: maximize log(D(G(z)))
############################
netG.zero_grad()
labelv = Variable(label.fill_(real_label))  # fake labels are real for generator cost
output = netD(fake)  # no detach
errG = criterion(output, labelv)
errG.backward()
D_G_z2 = output.data.mean()
optimizerG.step()
```
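One way to see why no detach is needed here: `errG.backward()` does deposit gradients in `netD` as well, but `optimizerG` was constructed with only `netG.parameters()`, so its `step()` never touches the discriminator, and `netD.zero_grad()` at the start of the next discriminator update discards the stale gradients. A minimal sketch with hypothetical tiny networks:

```python
import torch
import torch.nn as nn

# Hypothetical tiny stand-ins for netG and netD.
netG = nn.Linear(4, 8)
netD = nn.Linear(8, 1)
optimizerG = torch.optim.SGD(netG.parameters(), lr=0.1)  # only netG's parameters

fake = netG(torch.randn(2, 4))
out = netD(fake)   # no detach: gradients flow through netD back into netG
out.sum().backward()

print(netD.weight.grad is not None)  # True: netD accumulates (unused) gradients
d_weight_before = netD.weight.clone()
optimizerG.step()                    # updates only netG's parameters
print(torch.equal(netD.weight, d_weight_before))  # True: netD is unchanged
netD.zero_grad()                     # stale gradients cleared before the next D update
```

Detaching `netD` here would actually break the generator update, since the gradient must flow *through* the discriminator to reach `netG`.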
Dear TA,
In the example code, "detach" is used when calling the discriminator on fake photos; see line 241 in the attached file. As far as I know, this disables back-propagation through the discriminator. If that is correct, why do we still call "backward" in line 243? And why don't we train the discriminator using fake photos? Thanks for your reply.
Regards, BC