Closed creed2415 closed 10 months ago
I would look at loss_mel, loss_dur, loss_kl. loss_mel is reconstructed wav's mel and real mel difference. loss_dur is aligner and dur_pred difference. loss_kl is the distance between p(z_mel | mel) and p(z_mel | text)
based on these metrics([loss_disc, loss_gen, loss_fm, loss_mel, loss_dur, loss_kl,global_step, lr]) in the log