Open marshonhuckleberry opened 4 years ago
Afterwards, could you explain how to use it? I'm putting together a collection of vocoders, old and new, to evaluate them, and your code might be useful to developers.
My vocoder takes the spectrogram of the audio as input, so you need to generate it somehow (e.g. train a neural network to predict a spectrogram from text). After that, it's easy to follow the guide to generate audio:
# generate spectrogram of audio
$ python gen_spec.py -i sample.wav -o out.npz
# synthesize audio from the spectrogram
$ python synthesis.py --model_path path/to/checkpoint \
--spec_path out.npz \
--out_path out.wav
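For readers unsure what "a spectrogram of audio" means in practice, here is a minimal sketch of computing a log-magnitude spectrogram with NumPy and saving it as an `.npz` file. This is only an illustration, not what `gen_spec.py` does: the function name `log_magnitude_spectrogram`, the FFT size, the hop length, the file name `example_spec.npz`, and the `spec` array key are all assumptions, and the features this vocoder expects may differ (use `gen_spec.py` for real input).

```python
import numpy as np


def log_magnitude_spectrogram(wave, n_fft=1024, hop=256, eps=1e-5):
    """Return a (frames, n_fft // 2 + 1) log-magnitude spectrogram.

    Hypothetical helper for illustration; parameters are assumed values,
    not the settings used by this repository.
    """
    wave = np.asarray(wave, dtype=np.float64)
    # Pad very short signals so at least one full frame exists.
    if len(wave) < n_fft:
        wave = np.pad(wave, (0, n_fft - len(wave)))
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [wave[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Magnitude of the one-sided FFT of each frame, then log compression.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)


if __name__ == "__main__":
    # One second of a 440 Hz sine at 16 kHz stands in for real audio.
    sample_rate = 16000
    t = np.arange(sample_rate) / sample_rate
    wave = 0.5 * np.sin(2 * np.pi * 440.0 * t)
    spec = log_magnitude_spectrogram(wave)
    # The key name "spec" is an assumption, not the repo's file layout.
    np.savez("example_spec.npz", spec=spec)
```

Each row of the result is one time frame and each column one frequency bin, which is the kind of audio feature a vocoder turns back into a waveform.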
Hi @marshonhuckleberry, thanks for your interest in this work. This is just a vocoder, which converts audio features into sound, not a full text-to-speech system. I worked on this repo around 2018. At that time, vocoders (e.g. WaveNet) were too slow at generating sound. It's just a hobby project, and I'm no longer working on it. If you are interested in TTS, please use other repos like mozilla/tts or espnet,... Thanks.