nii-yamagishilab / multi-speaker-tacotron

VCTK multi-speaker tacotron for ICASSP 2020
BSD 3-Clause "New" or "Revised" License

inference speed #10

Open Adibian opened 2 years ago

Adibian commented 2 years ago

Hi. I am looking into the training and inference speed of different multi-speaker TTS models on a single CPU or a single GPU. I would appreciate any details on this for the current model or for other multi-speaker TTS models.

ecooper7 commented 2 years ago

Hi, on VCTK data, training a multi-speaker model from scratch took four days on one GPU, and warm-starting a multi-speaker model from a pre-trained single-speaker model took one day of training on the same data. Hope that helps.