Closed hdmjdp closed 1 year ago
true duration should not be detached
Why not? It is not supposed to carry any gradients back to the main net_g model. Also, I believe logw itself doesn't have any gradients, since it is calculated in the MAS attention step. So detaching it or not should make no difference. Am I missing something?
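To illustrate the point above, here is a minimal, hypothetical sketch (the names `logw_true` and `logw_pred` are stand-ins, not the repo's actual variables): a tensor produced under `torch.no_grad()` — as the MAS alignment is — has no gradient history, so calling `.detach()` on it is a no-op, and no gradient flows through it when it is used as the target of the duration loss.

```python
import torch
import torch.nn.functional as F

# Stand-in for a hidden representation that does require gradients.
x = torch.randn(4, requires_grad=True)

# MAS-style computation: run without tracking gradients, like the
# monotonic alignment search that produces the "true" durations.
with torch.no_grad():
    logw_true = F.softplus(x)  # hypothetical "true" log-durations

# The result carries no gradient history, and detach() changes nothing.
assert logw_true.requires_grad is False
assert logw_true.detach().requires_grad is False

# Toy "predicted" durations that do depend on the model parameters.
logw_pred = x * 0.5

# Duration loss: gradients flow only through logw_pred.
loss = torch.sum((logw_pred - logw_true) ** 2)
loss.backward()
assert x.grad is not None  # gradient arrived via logw_pred only
```

Whether `logw_true` is detached or not, `loss.backward()` produces the same gradients, since the target side of the loss is already outside the autograd graph.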
As you said, detaching the true label has no meaning.
https://github.com/p0p4k/vits2_pytorch/blob/cf513a71e07aed48448e582eaecfaef2a2b8d6b6/train_ms.py#L265