Closed erogol closed 5 years ago
@erogol Wow, that sounds great - congrats!
Sure, let's put it in here. I had a quick look at your repo and it seems most of the functionality is the same besides the mixture of logistics and some of the dsp preprocessing hparams? Anything else I'd need to do?
Also, if you don't mind - how many steps did you train that model for? And did you use the batched generation for that clip?
@fatchord thanks!
I guess we need to add a raw-bit training mode to the branch as well. It's supposed to be set by `mode` in config.json.
It was trained for a long time. Around 1m steps. But I don't know when it started to work. I've not checked all the checkpoints.
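To make the `mode` switch mentioned above concrete, here is a minimal sketch of how a config value could pick the size of the vocoder's output layer. It is not taken from either repo. The mode names `"raw"` and `"mol"`, the function `output_dim`, and the defaults `bits=9` and `n_mixtures=10` are all assumptions for illustration. The only fixed facts are that raw-bit training predicts one class per quantisation level (`2 ** bits`) and that a mixture of logistics over a scalar sample needs three parameters per component (mixture logit, mean, log-scale).

```python
import json


def output_dim(mode: str, bits: int = 9, n_mixtures: int = 10) -> int:
    """Return the number of output units for a given training mode.

    Hypothetical helper: the mode names and defaults are assumptions,
    not the actual keys used in the branch's config.json.
    """
    if mode == "raw":
        # Categorical output: one class per quantisation level.
        return 2 ** bits
    if mode == "mol":
        # Mixture of logistics: logit, mean and log-scale per component.
        return 3 * n_mixtures
    raise ValueError(f"unknown mode: {mode!r}")


# Stand-in for reading config.json; the key name "mode" comes from the thread.
config = json.loads('{"mode": "mol"}')
print(output_dim(config["mode"]))
```

With a switch like this, the rest of the model could stay the same and only the final layer and the loss would change between the two modes.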
@erogol Sorry for the delay getting back to you. I haven't forgotten - I'm just going to finalise a couple of things with the vanilla Tacotron 1 model and then start training the vocoder on MOL.
@erogol @fatchord
Glad to see you guys working together! I tried @fatchord's vanilla TTS with the Quick Start; the samples are impressive, with good quality and fast synthesis speed (~12 kHz on a GTX 1080 Ti). The part that sounds unnatural to me is the coherence between words; maybe you could try Location Sensitive Attention, as @erogol did.
Hope to see the TTS + WaveRNN work both fast and low-computation :-p
@mazzzystar I just uploaded new pretrained models and the sound quality is a bit better.
I listened to the SoundCloud example and it's pretty amazing, congratulations! Is the voice model up for sale? I have a project that will have TTS implemented in it and I would love to use that voice. Also, I'm going to use the Microsoft SAPI5 TTS engine, so would it be compatible with that? (I need to use Microsoft SAPI because I'm using the Python pyttsx3 module to generate the voice, and it relies on SAPI.)
Example result: https://soundcloud.com/user-565970875/ljspeech-logistic-wavernn
Here is the [branch](https://github.com/erogol/WaveRNN/tree/mold) if you'd like to try it. The model was trained with TTS spectrograms on the LJSpeech dataset. Models will be released soon.
@fatchord would you prefer to have the trained WaveRNN model here, or better to have a new repository for this?