Closed by kwoot 5 years ago
You need to generate utterance structures using a TTS front-end; these can then be converted into label files (.lab).
This script should get you started: https://github.com/CSTR-Edinburgh/merlin/blob/master/egs/fls_blizzard2017/s1/scripts/prepare_labels_from_txt.sh
See this for more context on usage: https://github.com/CSTR-Edinburgh/merlin/blob/master/egs/fls_blizzard2017/s1/merlin_synthesis.sh
That's not technically true. You need to generate label files; the utt files are an intermediate step that isn't required. They just happen to be the easiest way to generate lab files using Festival. That said, the links you provide should help somebody generate label files consistent with those used for training, assuming the user used the provided training scripts.
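For anyone starting from a plain string, the first step before any label generation is usually getting the text onto disk. Below is a minimal sketch in Python, assuming the label-preparation script reads one plain-text file per utterance from a text directory. The helper name `write_utterance_files`, the `utt_0001.txt` naming scheme, and the directory layout are all hypothetical, so check the linked scripts and your config for what they actually expect.

```python
from pathlib import Path


def write_utterance_files(sentences, txt_dir, prefix="utt"):
    """Write one plain-text file per sentence and return the created paths.

    Hypothetical helper: the one-file-per-utterance layout and the file
    naming are assumptions, not something the thread confirms.
    """
    out_dir = Path(txt_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, sentence in enumerate(sentences, start=1):
        # Collapse newlines and repeated spaces so each file holds one line.
        text = " ".join(sentence.split())
        if not text:
            continue  # skip empty or whitespace-only strings
        path = out_dir / f"{prefix}_{index:04d}.txt"
        path.write_text(text + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `write_utterance_files(["Hello world."], "some/txt/dir")` would create `some/txt/dir/utt_0001.txt`. From there, the label and synthesis steps are whatever the two linked scripts do with that directory; this sketch does not run them.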
Hi, I am making a podcast comparing different TTS programs on Linux. I can build the full voices using run_full_voice.sh, but generating a WAV file afterwards from my own simple text string seems impossible. How difficult is it, once training is done, to generate a new WAV file from a string? Kind regards, Jeroen Baten