AntixK / PyTorch-VAE

A Collection of Variational Autoencoders (VAE) in PyTorch.

Training VQVAE does not converge. #71

Open henanjun opened 1 year ago

henanjun commented 1 year ago

blade-prayer commented 1 year ago

I found that in vq_vae.yaml, scheduler_gamma is set to 0.0. This parameter is the multiplicative decay factor passed to torch.optim.lr_scheduler.ExponentialLR, so the learning rate becomes 0 after epoch 0. Do you think this is the reason?
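
For reference, here is a minimal standalone sketch (not the repo's training code) of why gamma = 0.0 stalls training: ExponentialLR multiplies the learning rate by gamma after every epoch, so a value of 0.0 zeroes it after the first epoch.

```python
import torch
from torch import nn, optim

# Toy model and optimizer purely to show the scheduler behaviour;
# 0.005 mirrors the LR mentioned in this thread, the rest is illustrative.
model = nn.Linear(4, 4)
optimizer = optim.Adam(model.parameters(), lr=0.005)
scheduler = optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.0)

for epoch in range(3):
    loss = model(torch.randn(8, 4)).pow(2).mean()  # stand-in for the VQ-VAE loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(epoch, optimizer.param_groups[0]["lr"])  # 0.005, then 0.0, then 0.0
    scheduler.step()  # lr *= gamma, so the LR collapses to 0 after the first epoch
```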

imskull commented 1 year ago

Changing "LR"(learning rate" from 0.005 to 0.001 helps.

xjtupanda commented 1 year ago

Changing "LR"(learning rate" from 0.005 to 0.001 helps.

Met the same problem. I found the loss always unreasonably high (~1.0e+6) and it might cause a gradient explosion. This one helps, thanks a lot.

ohhh-yang commented 5 months ago

> Changing "LR" (the learning rate) from 0.005 to 0.001 helps.

In my training the loss was even more unreasonable (up to 1.0e+26!) and just fluctuated like a pendulum. This one really works, you are my god!!!