koala7580 closed this issue 3 years ago
Hi @koala7580, sorry for the late reply. Unfortunately, as with the pre-trained models, I do not have permission to share samples generated by them publicly.
I used these datasets to show how to handle raw data (e.g., video, noisy formats) for (supervised) non-autoregressive TTS. For your reference, though, you may want to use RAVDESS or another clean dataset to get crystal-clear audio, since the datasets in this project were not designed for TTS. Alternatively, you can keep training on the provided datasets and run a speech enhancement model over the output audio to denoise it (I confirmed that this method works).
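For anyone who wants to try the denoising route, here is a minimal sketch of post-hoc denoising using classical spectral subtraction with NumPy. This is a simple stand-in, not the speech enhancement model mentioned above, and it will not match a learned model's quality. The helper name `spectral_subtract`, its parameters, the 22050 Hz sample rate, and the assumption that the first 0.25 s of the clip contains no speech are all hypothetical choices made for illustration.

```python
import numpy as np


def spectral_subtract(audio, sr=22050, noise_seconds=0.25,
                      frame_len=1024, hop=256, over=1.5, floor=0.05):
    """Reduce stationary noise by spectral subtraction (illustrative helper).

    Assumes the first `noise_seconds` of `audio` contain noise only and
    that `audio` is longer than `frame_len` samples.
    """
    audio = np.asarray(audio, dtype=np.float64)
    window = np.hanning(frame_len)

    # Reflect-pad so every original sample is covered by several frames.
    x = np.pad(audio, frame_len, mode="reflect")
    n_frames = 1 + (len(x) - frame_len) // hop

    # Short-time Fourier transform of windowed, overlapping frames.
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Average magnitude of the leading frames serves as the noise profile.
    n_noise = max(1, int(noise_seconds * sr) // hop)
    noise_mag = mag[:n_noise].mean(axis=0)

    # Subtract the scaled noise profile, keeping a small spectral floor.
    clean_mag = np.maximum(mag - over * noise_mag, floor * mag)
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)

    # Weighted overlap-add, normalised by the summed squared window.
    out = np.zeros(len(x))
    norm = np.zeros(len(x))
    for i in range(n_frames):
        s = i * hop
        out[s:s + frame_len] += clean[i] * window
        norm[s:s + frame_len] += window ** 2
    out /= np.maximum(norm, 1e-8)

    # Drop the padding so the output has the same length as the input.
    return out[frame_len:frame_len + len(audio)]


# Synthetic demo: half a second of noise, then a noisy 440 Hz tone.
rng = np.random.default_rng(0)
sr = 22050
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
tone[:sr // 2] = 0.0
noisy = tone + 0.05 * rng.standard_normal(sr)
denoised = spectral_subtract(noisy, sr=sr, noise_seconds=0.25)
```

To apply this to real synthesized output, load the waveform as a float array, pass its actual sample rate as `sr`, and pick `noise_seconds` to match a stretch at the start that really is speech-free. Spectral subtraction only handles roughly stationary noise and can leave "musical" artifacts, so a trained enhancement model is likely the better choice when one is available.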
Any example from anyone?
The synthesized speech of IEMOCAP data is very noisy.
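Same here: the synthesized audio is noisy and has a robotic, electronic sound. And this is already the result of training after applying speech enhancement to the original dataset with FullNet. Have you solved this?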
Can you share your synthesized audio? Thanks.