What is the inference time of the Tacotron model used in mimic2 on low-footprint devices such as a Raspberry Pi, for, say, 10 seconds of audio? There is another TTS system, TTS-Cube, which seems much simpler:
https://tiberiu44.github.io/TTS-Cube/ Can we explore this for Mycroft?