SKTBrain / DiscoGAN

Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

Unnecessary computation in backward pass #2

Open ppwwyyxx opened 7 years ago

ppwwyyxx commented 7 years ago

In the WassersteinGAN code, they have these lines:

        for p in netD.parameters():
            p.requires_grad = False # to avoid computation

I think it means that when you train G, by default you also compute gradients for D (without updating them), and vice versa. Setting the flag to False to skip that computation should speed up training a lot. I found that my TensorFlow implementation runs much faster than this code, and this is probably the reason.
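For reference, a minimal sketch of the pattern being suggested, assuming a standard PyTorch setup (the netD / netG / optimizerG names follow the WGAN reference code; the generator step below is illustrative, not DiscoGAN's actual training loop):

    import torch

    def set_requires_grad(net, flag):
        # Toggle gradient computation for all parameters of a network.
        for p in net.parameters():
            p.requires_grad = flag

    def generator_step(netG, netD, optimizerG, noise):
        # Freeze D so backward() does not compute gradients for its parameters.
        set_requires_grad(netD, False)
        optimizerG.zero_grad()
        fake = netG(noise)
        loss_G = -netD(fake).mean()   # illustrative WGAN-style generator loss
        loss_G.backward()             # gradients are accumulated for netG only
        optimizerG.step()
        # Re-enable D's gradients before the next discriminator update.
        set_requires_grad(netD, True)

The same toggle is applied symmetrically before the discriminator step (freeze G, update D), so each backward pass only computes gradients for the network actually being optimized.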

eriche2016 commented 7 years ago

It is still very slow even with this flag turned off; I do not know the reason.

eriche2016 commented 7 years ago

I find that training this net on GPU for one iteration, with batch size 64 on CelebA, costs me nearly 30 seconds.

ppwwyyxx commented 7 years ago

...That's probably a problem with your setup. It should be on the order of 0.4 seconds per iteration on a good GPU.

eriche2016 commented 7 years ago

Hi, I use the default parameter setup, and my GPU has 12 GB of memory. I do not know why it is so slow.

jazzsaxmafia commented 7 years ago

Thank you for the tip! We were not aware of that.