shock-wave007 closed this issue 4 years ago
You can cd to the examples/ directory to try each example.
For the text to speech task, you can try deepvoice3, transformerTTS and fastspeech. Just follow the README.
I tried wavenet and I faced 2 problems
Wavenet is a vocoder, which turns a spectrogram (more specifically, a mel spectrogram) into audio. So wavenet does not directly transform text into audio. You have to use a separate text to spectrogram model to produce its input.
As wavenet is an autoregressive model, it generates sample points one by one, sequentially. An audio file with a sample rate of 22050 has 22050 sample points per second. As this is a simple implementation of wavenet without special optimization, in our practice it takes 4 to 5 hours to synthesize 10 seconds of audio. So the slow speed is expected. If you want a fast vocoder, you can try clarinet and waveflow, which are comparable in quality but much faster.
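To put those numbers in perspective, here is a small back-of-the-envelope calculation. It only uses the figures quoted above (22050 samples per second, 10 seconds of audio, 4 to 5 hours of synthesis); the function names are made up for illustration and are not part of the library.

```python
# Rough throughput estimate for sample-by-sample (autoregressive) synthesis.
# All inputs are the figures quoted in the comment above; the helper names
# are hypothetical and exist only for this illustration.

SAMPLE_RATE = 22050      # sample points per second of audio
AUDIO_SECONDS = 10       # length of the synthesized clip


def total_samples(sample_rate, audio_seconds):
    """Number of sample points the vocoder has to generate."""
    return sample_rate * audio_seconds


def samples_per_wall_second(n_samples, hours):
    """Average number of sample points generated per second of wall-clock time."""
    return n_samples / (hours * 3600)


def slowdown_vs_real_time(hours, audio_seconds):
    """How many times slower than real-time playback the synthesis is."""
    return (hours * 3600) / audio_seconds


if __name__ == "__main__":
    n = total_samples(SAMPLE_RATE, AUDIO_SECONDS)
    print("sample points to generate:", n)
    for hours in (4, 5):
        print(
            hours, "hours:",
            samples_per_wall_second(n, hours), "samples/s,",
            slowdown_vs_real_time(hours, AUDIO_SECONDS), "x slower than real time",
        )
```

In other words, 10 seconds of audio means 220500 sequential generation steps, which works out to roughly 12 to 15 sample points per second of wall-clock time at the quoted speed.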
Thank you very much for your help and knowledge.
I can't find how to input/set the TEXT to be converted to voice in the clarinet or waveflow documentation
clarinet and waveflow are also vocoders.
We have implemented 3 TTS models: deepvoice3, transformerTTS and fastspeech. Each has a text to spectrogram part and a vocoder part. (See the README and synthesize.py in the corresponding folder in examples. For example, the synthesize.py in deepvoice3 can take a text file as a parameter, with one sentence per line to synthesize.) But for simplicity, and to focus on the text to spectrogram part, some of them currently use only a simple, non-neural-network vocoder, griffinlim.
We are now working on integrating our text to spectrogram models with neural vocoders. When it is done, we will release new examples.
Create a text file with one sentence per line, and pass that file as the input to synthesize.py. You can run python synthesize.py --help to see detailed usage.
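If you are new to Python, the input file can be prepared with a few lines like the following. This is only a sketch of the "one sentence per line" format described above: the file name sentences.txt, the example sentences and the helper write_sentences are made up for illustration. How to pass the file to synthesize.py depends on each example, so check python synthesize.py --help for the actual arguments.

```python
from pathlib import Path


def write_sentences(path, sentences):
    """Write the sentences to a text file, one sentence per line.

    Blank entries are skipped and surrounding whitespace is stripped,
    so every line of the file holds exactly one sentence.
    """
    cleaned = [s.strip() for s in sentences if s.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


if __name__ == "__main__":
    # Hypothetical file name and example sentences.
    count = write_sentences(
        "sentences.txt",
        [
            "Hello, this is a test sentence.",
            "Each line of the file is synthesized separately.",
        ],
    )
    print("wrote", count, "sentences to sentences.txt")
```

You can of course also create the same file by hand in any text editor; the only requirement stated above is one sentence per line.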
examples/deepvoice3, examples/transformertts and fastspeech all include a synthesize.py. If you have any problems using them, please let us know. Thank you.
Thanks for your help, I was able to test deepvoice3. AI/ML/DL stuff is hard (for me and my PC).
It's nice but hard. I think I will leave this to EXPERT people like you. :)
Can anyone tell me how to use this library to convert TEXT to speech?
- I have cloned the library, the pre-trained data and the LJSpeech dataset.
Q0. Can I use this library to convert text to speech? Q1. If yes, can you give me the step-by-step commands?
I am new to Python and AI/ML/DL, but I have a basic understanding of programming.
Thanks in advance