GAN example doesn't converge

MalchuL commented 4 years ago

Describe the bug: On master (corresponding to catalyst==20.4.1, commit https://github.com/catalyst-team/catalyst/commit/abfb121640e6934c99210e9c0f402af5338200c8) the GAN example does not converge. I ran the following command: catalyst-dl run -C examples/mnist/configs/vanilla_gan.yml. After 100 epochs I got these loss values: loss_d=0.5493 | loss_d_fake=1.0986 | loss_d_real=1.526e-06 | loss_g=0.4054
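A quick sanity check on the reported numbers, as a minimal sketch. It assumes (this is not confirmed by the config or the thread) that the example uses standard binary cross-entropy GAN losses, i.e. loss_d_fake = -log(1 - D(G(z))), loss_g = -log(D(G(z))), loss_d_real = -log(D(x)), and that loss_d is the mean of the two discriminator terms. The variable names below are illustrative only.

```python
import math

# Loss values reported above after 100 epochs.
loss_d = 0.5493
loss_d_fake = 1.0986
loss_d_real = 1.526e-06
loss_g = 0.4054

# ASSUMPTION: plain BCE losses (not verified against the catalyst config).
# Discriminator output on fake samples, recovered two independent ways.
p_fake_from_d = 1.0 - math.exp(-loss_d_fake)  # from loss_d_fake = -log(1 - D(G(z)))
p_fake_from_g = math.exp(-loss_g)             # from loss_g = -log(D(G(z)))

# Discriminator output on real samples, from loss_d_real = -log(D(x)).
p_real = math.exp(-loss_d_real)

# ASSUMPTION: loss_d is the average of the real and fake terms.
mean_d = 0.5 * (loss_d_fake + loss_d_real)

print(f"D(G(z)) from loss_d_fake: {p_fake_from_d:.4f}")
print(f"D(G(z)) from loss_g:      {p_fake_from_g:.4f}")
print(f"D(x) from loss_d_real:    {p_real:.6f}")
print(f"mean of D terms:          {mean_d:.4f} (reported loss_d={loss_d})")
```

Under those assumptions the four numbers are mutually consistent: both routes give a discriminator output of roughly 0.667 on fake samples, and the discriminator is almost perfectly confident on real samples. That pattern suggests the losses are stuck at a fixed point rather than oscillating, though it does not by itself identify the cause.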

Expected behavior: On the old version (20.2), using this repo (https://github.com/catalyst-team/gan), I got the following losses: loss_d_fake < 0.5 | loss_d_real < 0.5

Additional context: I found that this behavior starts with catalyst version 20.3.

Screenshots: After training on catalyst==20.4.1 I got these images in TensorBoard: [Screenshot from 2020-04-19 20-44-51]

asmekal commented 4 years ago

Hi, thank you for reporting. This is indeed a bug, as none of the models in the simple example are actually converging... I will try to investigate and fix that in the next few days.

MalchuL commented 4 years ago

Any updates?

catalyst-team / gan

GAN example doesn't converge #2