2017-fall-DL-training-program / VAE-GAN-and-VAE-GAN

An assignment to learn how to implement three different kinds of generative models

What is the effect of "detach"? #19

Closed bcpenggh closed 6 years ago

bcpenggh commented 6 years ago

Dear TA,

In the example code, "detach" is used when the discriminator is called on fake photos; see line 241 in the attached file. As I understand it, this disables backpropagation through the discriminator. If that is correct, why do we still call "backward" in line 243? And why don't we train the discriminator with fake photos? Thanks for your reply.

Regards, BC

bcpenggh commented 6 years ago

Sorry, I don't know how to attach the file. Let me paste the relevant part of the code below.

train with fake

    noise.resize_(batch_size, nz, 1, 1).normal_(0, 1)
    noisev = Variable(noise)
    fake = netG(noisev)
    labelv = Variable(label.fill_(fake_label))
    output = netD(fake.detach()) ########### Line 241
    errD_fake = criterion(output, labelv)
    errD_fake.backward() ################# Line 243
    D_G_z1 = output.data.mean()
    errD = errD_real + errD_fake
    optimizerD.step()
bcpenggh commented 6 years ago

In other words, if the discriminator is "detached" while using fake photos, and the gradient is already calculated using real photos in line 233, why do we still call "backward" in line 243?

train with real

    netD.zero_grad()
    real_cpu, _ = data
    batch_size = real_cpu.size(0)
    if opt.cuda:
        real_cpu = real_cpu.cuda()
    input.resize_as_(real_cpu).copy_(real_cpu)
    label.resize_(batch_size).fill_(real_label)
    inputv = Variable(input)
    labelv = Variable(label)

    output = netD(inputv)
    errD_real = criterion(output, labelv)
    errD_real.backward() ############### Line 233
    D_x = output.data.mean()
a514514772 commented 6 years ago

Hi @bcpenggh ,

fake.detach() means you remove the generator from the computation graph, not the discriminator. That is, the discriminator still gets gradients from backward(); detach only stops those gradients from flowing further back into netG.
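
For illustration, here is a minimal sketch of that behaviour (toy linear layers stand in for netG and netD, and a recent PyTorch API without Variable is assumed; this is not the assignment code, just the same detach pattern):

    import torch
    import torch.nn as nn

    # Toy stand-ins for the generator and discriminator
    netG = nn.Linear(4, 8)
    netD = nn.Linear(8, 1)

    fake = netG(torch.randn(2, 4))

    # Same pattern as line 241: backward() stops at the detached tensor,
    # so no gradient ever reaches netG's parameters
    output = netD(fake.detach())
    output.sum().backward()

    print(netD.weight.grad is not None)  # True  -> D still gets gradients
    print(netG.weight.grad)              # None  -> G is cut off by detach

So in line 243, backward() still computes gradients for every netD parameter; detaching simply avoids wasting time backpropagating through netG during the discriminator update.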

I am not sure if I got your point; does this answer your question?

Thanks

bcpenggh commented 6 years ago

Dear Hui-Po, do you mean that line 241, "output = netD(fake.detach())", actually detaches the model "netG", which generated "fake" in line 239 ("fake = netG(noisev)"), rather than "netD", which uses "fake" to produce "output"?

a514514772 commented 6 years ago

Yes.

bcpenggh commented 6 years ago

Thanks for your help

pandasfang commented 6 years ago

Dear TA: Following up on this question, why don't we detach the discriminator when we train the generator?

    ############################
    # (2) Update G network: maximize log(D(G(z)))
    ###########################
    netG.zero_grad()
    labelv = Variable(label.fill_(real_label))  # fake labels are real for generator cost
    output = netD(fake)  # no detach here
    errG = criterion(output, labelv)
    errG.backward()
    D_G_z2 = output.data.mean()
    optimizerG.step()
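
For completeness, a minimal sketch (again with hypothetical toy layers in place of netG and netD) of why detaching here would break the update: errG.backward() must flow through the discriminator to reach the generator's parameters.

    import torch
    import torch.nn as nn

    # Toy stand-ins for the generator and discriminator
    netG = nn.Linear(4, 8)
    netD = nn.Linear(8, 1)

    fake = netG(torch.randn(2, 4))
    errG = netD(fake).sum()  # no detach: the graph runs G -> D
    errG.backward()

    # Gradients reach BOTH networks, because D is the only path back to G
    print(netG.weight.grad is not None)  # True -> G can be updated
    print(netD.weight.grad is not None)  # True, but harmless: only
                                         # optimizerG.step() runs here, and
                                         # netD.zero_grad() clears D's grads
                                         # before the next D update

So the discriminator is not detached here because cutting the graph at its input would leave netG with no gradients at all; netD simply is not stepped during the generator update.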