karrdy89 closed this issue 2 years ago
@karrdy89 you need to train tacotron2 first so you can extract durations for fs2. Around 50k-80k steps is good enough for extracting durations.
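For anyone wondering why tacotron2 itself does not need durations: it learns an attention alignment between input tokens and mel frames during training, and the durations for fs2 are read off that alignment afterwards. Below is a minimal sketch of the idea, not the repo's actual extraction script. The function name `durations_from_alignment` and the toy alignment matrix are hypothetical, and it assumes the alignment is shaped `[decoder_frames, encoder_tokens]`.

```python
import numpy as np


def durations_from_alignment(alignment: np.ndarray) -> np.ndarray:
    """Turn an attention alignment into per-token durations (in mel frames).

    alignment: array of shape [decoder_frames, encoder_tokens] (assumed layout).
    """
    num_tokens = alignment.shape[1]
    # For each mel frame, pick the input token with the highest attention weight.
    best_token = np.argmax(alignment, axis=1)
    # Count how many frames were assigned to each token.
    return np.bincount(best_token, minlength=num_tokens)


# Toy alignment (made up for illustration): 6 mel frames attending over 3 tokens.
alignment = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.7, 0.2],
    [0.0, 0.2, 0.8],
    [0.0, 0.1, 0.9],
    [0.0, 0.1, 0.9],
])

durations = durations_from_alignment(alignment)
print(durations)  # [2 1 3]
```

The durations always sum to the number of mel frames, which is what fs2's length regulator expects. If the alignment is still blurry (tacotron2 trained for too few steps), the argmax jumps around and the durations come out noisy.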
I get it. Thanks for the reply :)
If there is only a small amount of data from an unseen speaker, how can the duration info be extracted?
Hello. Thanks for the great repo. I am trying to train a pre-trained fastspeech2 model (kss). It seems that durations are needed for training, and that they can be extracted by training a tacotron model. I have a question here: aren't durations also necessary for training tacotron2? Also, how many epochs are required for accurate extraction?