Closed by Some-random 3 years ago
Same issue.. Have you figured out why this is happening? Also, my sync loss is stuck between 3 and 5 in evaluation, so the syncnet weight never gets applied..
I loaded the pre-trained weights (without GAN) as an initializer, so the sync loss falls below 0.75 quickly. The problems I'm having now are:
1. The discriminator loss climbs to 27 within a few epochs and never falls back, which suggests the GAN training broke down at some point.
2. Even when I evaluate checkpoints from before the discriminator loss blew up, I see no quality improvement.
If the discriminator loss or real/fake loss shoots up, stop the training and finetune from the last checkpoint where the model was stable. That gave me a sync loss near 1.00, and I didn't wait for it to drop below 0.75: I manually changed sync_wt to 0.03 as suggested. Now the training is going well. Hope it helps you too. I don't know the exact reason behind this, but it worked for me. I still need to evaluate my models, but I'm waiting for the losses to decrease a little more.
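For context, the trick described above (raising the sync weight by hand once the sync loss looks healthy) amounts to a weighted sum of the three generator losses. A minimal sketch, assuming Wav2Lip-style names (`syncnet_wt`, `disc_wt`) and the 0.03 weight suggested here; the exact names and default weights in your checkout may differ:

```python
def generator_loss(l1_loss, sync_loss, percep_loss,
                   syncnet_wt=0.03, disc_wt=0.07):
    """Weighted generator objective, Wav2Lip-style (a sketch).

    syncnet_wt is typically kept at 0.0 early in training and only
    raised (e.g. to 0.03, as suggested above) once the evaluation
    sync loss has dropped to a sane range; disc_wt here is an
    assumed value, not necessarily the repo's default.
    """
    return (syncnet_wt * sync_loss
            + disc_wt * percep_loss
            + (1.0 - syncnet_wt - disc_wt) * l1_loss)

# With syncnet_wt = 0.0 the sync term contributes nothing to the
# gradient, which is why a stuck sync loss means the syncnet weight
# is effectively not being used.
```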
@onzone I have the same issue. Did you get good reproduction results?
I am training on my own machine, so training is slow, but after 40 epochs the result looks like this. It seems to be going in the right direction. Current losses: L1: 0.030704378257730636, Sync: 0.34330415797721053, Percep: 0.7454155471909888 | Fake: 0.67623973247287, Real: 0.6748037663977385. @prajwalkr can you please let me know what good values for these losses look like, so I know when my model has converged?
https://user-images.githubusercontent.com/7061779/119449018-b1eb8380-bd4f-11eb-8e6b-63dfd3f7f098.mp4
@onzone Hello, I wonder about your training results!! Could you share the results?
result_synced_merged_bazigar_part1_with_gan_resize_1_custom.mp4
Hi @onzone, I think the perceptual loss (visual quality loss) is still high (0.7), yet the output video you posted has good quality. How is that possible? I am also training on LRS2 and am at 100k steps. My eval sync loss is ~0.25, but the visual quality is not that good, and the discriminator and perceptual losses have been ~0.69-0.7 since the first step, so I can't tell whether visual quality is improving. Can you share your insights on this, tell me what to infer, and whether my experiment is going fine?
I followed the instructions in the README for training on the LRS2 dataset, but found that the fake/real loss stays at 0.69 for a long time (definitely more than 40 epochs). Has anyone else experienced this before? If not, at which epoch did the fake/real loss start to look normal?
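One thing worth noting about the 0.69 plateau that keeps coming up in this thread: for a binary cross-entropy discriminator, 0.69 ≈ ln 2 is exactly the loss you get when the discriminator outputs ~0.5 for every sample, i.e. it cannot tell real from fake at all. That can be a healthy GAN equilibrium or a discriminator that never learned anything, so watch the actual visual quality rather than the raw number. A quick check of the arithmetic (plain Python, no framework assumed):

```python
import math

def bce(pred, target):
    # Binary cross-entropy for a single scalar prediction in (0, 1).
    eps = 1e-12  # guard against log(0)
    return -(target * math.log(pred + eps)
             + (1 - target) * math.log(1 - pred + eps))

# A discriminator that outputs ~0.5 for both real and fake samples
# pins both losses at -ln(0.5) = ln 2 ≈ 0.693, matching the
# "stuck at 0.69" numbers reported above.
print(round(bce(0.5, 1.0), 3))  # 0.693 (real loss)
print(round(bce(0.5, 0.0), 3))  # 0.693 (fake loss)
```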