I noticed that the decoder prenet also applies dropout at inference. A comment in the code says "# use dropout also in inference for positional encoding relevance". I tried disabling dropout at inference, but the generated audio was a mess. Is there a more detailed explanation of why dropout is needed here?
Thank you!
https://github.com/as-ideas/TransformerTTS/blob/e4ded5bf5a488aab98ce6aee981e3ac0946f4ddc/model/layers.py#L397
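
To make the question concrete, here is a minimal sketch of the two behaviors being compared. It is not the repository's code: `dropout`, `prenet_output`, and the `always_dropout` flag are hypothetical names used only for illustration, and the 0.5 rate is an assumed value. In Keras terms, the "forced" case would correspond to calling a dropout layer with `training=True` regardless of mode.

```python
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each element with probability `rate`
    and scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def prenet_output(values, rate, training, always_dropout, rng):
    """Hypothetical stand-in for the last step of a decoder prenet.

    With always_dropout=True the mask is applied even when training=False,
    which is the behavior the quoted comment describes.
    """
    if training or always_dropout:
        return dropout(values, rate, rng)
    return list(values)

rng = random.Random(0)   # fixed seed so the run is repeatable
x = [1.0] * 8            # toy prenet activations
rate = 0.5               # assumed dropout rate, for illustration only

# Standard convention: dropout is a no-op at inference.
standard = prenet_output(x, rate, training=False, always_dropout=False, rng=rng)

# Behavior in question: dropout stays active at inference.
forced = prenet_output(x, rate, training=False, always_dropout=True, rng=rng)

print("standard:", standard)
print("forced:  ", forced)
```

With the standard convention the inference-time input passes through unchanged, while in the forced case each element is either zeroed or rescaled, so the decoder sees the same kind of noisy prenet output it saw during training.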