Closed: maevdokimov closed this issue 2 years ago
Is there any reason you are tracking the training loss rather than the validation loss? Do you run your experiments for enough epochs that validation happens at least once (see the sketch below)? Can you also try with pytorch-lightning==1.5.1?
Can you upload lightning_logs.txt from your experiment directory?
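For reference, a minimal sketch of how the monitored metric and validation frequency interact with checkpointing. The `val_loss` key is an assumption; it has to match whatever the LightningModule actually logs via `self.log(...)`.

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# "val_loss" is an assumed metric name; it must match the key logged in
# validation_step, e.g. self.log("val_loss", loss).
checkpoint_callback = ModelCheckpoint(
    monitor="val_loss",
    mode="min",
    save_top_k=3,
)

trainer = pl.Trainer(
    max_epochs=100,
    # Validation has to run at least once, otherwise the monitored
    # metric never exists and no "best" checkpoint can be saved.
    check_val_every_n_epoch=1,
    callbacks=[checkpoint_callback],
)
```

If the monitored metric is never logged, the callback has nothing to compare against, which can look like checkpointing silently not happening.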
Hi! I'm trying to train Tacotron 2 from scratch.
Running the current main branch with pytorch-lightning==1.5.0 results in training that proceeds correctly but saves no checkpoints. Downgrading to pytorch-lightning==1.4.2 fixes this issue.