Closed t-dan closed 2 years ago
Enable GTA (ground-truth-aligned) mode in Tacotron.
OK, thank you. But where can I find it? I can't find such a switch in the Tacotron2 code.
@t-dan the code from extract_duration.py is all you need: https://github.com/TensorSpeech/TensorFlowTTS/blob/master/examples/tacotron2/extract_duration.py#L159-L166. Here we use teacher forcing.
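To illustrate why teacher forcing helps here, below is a minimal toy sketch in NumPy. It is not the TensorFlowTTS code: `toy_decoder_step`, `decode_teacher_forced`, and the random `weight` matrix are hypothetical stand-ins for a real decoder. The point is only that when the decoder is fed the ground-truth frames as inputs, it runs for exactly as many steps as there are ground-truth frames, so the predicted mel has the same number of frames as the real one.

```python
import numpy as np


def toy_decoder_step(prev_frame, weight):
    """Hypothetical one-step decoder: maps the previous frame to the next one."""
    return np.tanh(prev_frame @ weight)


def decode_teacher_forced(gt_mel, weight):
    """Run the toy decoder with ground-truth frames as inputs (teacher forcing).

    The input at step t is ground-truth frame t-1 (an all-zero "go" frame at
    t=0), so exactly one frame is predicted per ground-truth frame.
    """
    n_frames, n_mels = gt_mel.shape
    go_frame = np.zeros((1, n_mels))
    inputs = np.vstack([go_frame, gt_mel[:-1]])
    return np.stack([toy_decoder_step(x, weight) for x in inputs])


rng = np.random.default_rng(0)
gt_mel = rng.normal(size=(120, 80))            # 120 frames, 80 mel bins (assumed sizes)
weight = rng.normal(scale=0.1, size=(80, 80))  # stand-in for trained parameters
gta_mel = decode_teacher_forced(gt_mel, weight)
print(gta_mel.shape == gt_mel.shape)
```

In free-running inference the decoder instead consumes its own previous output and stops on a predicted stop token, so the frame count is no longer tied to the recording.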
Hello, in the default setting, the vocoders are trained on mel-spectrograms computed from real speech signals. When they are fed Tacotron-generated spectrograms instead, the quality is a bit lower.
I would like to try to fine-tune (or train from scratch, it does not matter) a vocoder on synthesized (i.e. Tacotron-generated) mel-spectrograms. However, there is a problem: while the real spectrograms are aligned with the original speech (#frames*hop_size = #samples), this naturally does not hold for the synthesized data.
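The alignment constraint can be sketched as a small check in NumPy. This is only an illustration under stated assumptions: `align_audio_to_mel` is a hypothetical helper, and `hop_size = 256` and the array sizes are example values. Trimming or zero-padding like this is reasonable only when the mel frames already follow the timing of the recording (off by a few samples at the end); it does not repair a freely synthesized mel whose durations differ from the original speech.

```python
import numpy as np


def align_audio_to_mel(audio, mel, hop_size):
    """Trim or zero-pad audio so that len(audio) == n_frames * hop_size."""
    target_len = mel.shape[0] * hop_size
    if len(audio) >= target_len:
        return audio[:target_len]
    return np.pad(audio, (0, target_len - len(audio)), mode="constant")


hop_size = 256                 # assumed example value
mel = np.zeros((100, 80))      # 100 frames, 80 mel bins
audio = np.zeros(25700)        # slightly longer than 100 * 256 samples
aligned = align_audio_to_mel(audio, mel, hop_size)
print(len(aligned) == mel.shape[0] * hop_size)
```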
Has anyone tried experimenting with this?
Thank you, DT