WhiteFu opened this issue 5 years ago
I am facing the same issue. Please let me know if you have resolved it.
Actually, this is a common occurrence when dealing with a variational autoencoder. There are two ways to resolve it: 1) restart training from a checkpoint saved 3 or 4 steps back (not the most recent one). Be prepared for the loss to explode again after running for a while; if it does, repeat the process. 2) In the file train.py, on line 133, change the value of the loss. @adimukewar @WhiteFu
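A minimal sketch of how the "restart from an older checkpoint" advice could be automated. The names here (`should_restore_checkpoint`, `MAX_LOSS`) are illustrative assumptions, not the repo's actual code; `MAX_LOSS` plays the role of the threshold the comment suggests editing on line 133 of train.py.

```python
import math

# Assumed upper limit for the loss; lower it if the loss still explodes.
MAX_LOSS = 100.0

def should_restore_checkpoint(loss: float, max_loss: float = MAX_LOSS) -> bool:
    """Return True when training should roll back to an older saved checkpoint.

    Triggers on NaN (the VAE failure mode discussed below) or on a loss
    value above the configured upper limit.
    """
    return math.isnan(loss) or loss > max_loss
```

In a training loop you would check this after each step and, when it fires, restore the checkpoint from a few saves back rather than the most recent one.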
Thanks for your reply, I will try it immediately!
I am facing the same issue. Please let me know if you have resolved it.
Sorry, I didn't reply to you in time. I have been working on other things recently, so I haven't solved this problem.
@WhiteFu if you are using this code, use a large (more than 50 hours), expressive dataset like Blizzard to get a decent result.
Hi, I have the same problem. I tried modifying some hparams, but it still doesn't work. Please let me know if you have solved this. Thanks 😄
The loss is not stable, so you can modify the upper limit of the parameter in the file train.py, on line 133.
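Besides capping the loss itself, a related standard remedy for unstable training is clipping gradients by their global norm. The sketch below is a plain-Python illustration of that technique (assumed names; the repo's train.py may already do this differently via its framework):

```python
import math

def clip_by_global_norm(grads, clip_norm=1.0):
    """Scale a list of gradient values so their global L2 norm is at most clip_norm.

    If the norm is already within the limit, the gradients pass through unchanged;
    otherwise every gradient is scaled down by the same factor.
    """
    global_norm = math.sqrt(sum(g * g for g in grads))
    if global_norm <= clip_norm:
        return list(grads)
    scale = clip_norm / global_norm
    return [g * scale for g in grads]
```

Applied each step before the optimizer update, this keeps a single bad batch from blowing up the parameters and, in turn, the loss.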
Hi, but it seems my loss becomes NaN (every time at the same step during training). I tried modifying the batch size and the learning rate, but it still doesn't work.
@MisakaMikoto96 If you see a NaN loss, it means your variational autoencoder (VAE) is unable to learn the latent representation. This is a common problem when dealing with a variational autoencoder, but the sad part is that there is no simple solution for it.
One thing you can try is to go to that line and manipulate w1 and w2.
But before that, make sure you have an adequate quantity of expressive voice data. Also, after getting the error, restarting training from a checkpoint saved 2 steps back worked fine for me; if you get the error again at the same checkpoint, restart from a checkpoint saved 3 steps back, and so on. If you keep getting NaN at the same step count, try the solution above.
You can also read the variational autoencoder paper for more understanding; otherwise, feel free to ask here.
I get a "loss exploded" error during the training stage! I did not modify the original hyperparameters, and I want to know how to solve the problem.