Implementation with own dataset

TensorSpeech / TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

https://tensorspeech.github.io/TensorFlowTTS/

Apache License 2.0


Implementation with own dataset #264

Closed: GavinStein1 closed this issue 3 years ago

GavinStein1 commented 4 years ago

Hi there,

I was wondering whether this repo supports using our own dataset, for the purpose of cloning our own voices. Currently the project supports only a limited number of publicly available datasets.

If the answer is no, what changes would I need to make to implement it myself?

Massive thanks for your work.

Edit: Additionally, does implementing your own dataset only pose issues during preprocessing, or during training as well?

ZDisket commented 4 years ago

@GavinStein1 If your dataset is in a language already supported here (I'll use English as an example), you can do it the lazy way: format your dataset like one that's already supported (I format my English ones like LJSpeech) and use the preprocessor that's already integrated. As for implementing your own dataset, there are instructions. As far as I am aware, custom datasets only pose a problem during preprocessing, not training. Just make sure the config .yaml for the model you're training has the dataset field set to the correct value.
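For anyone taking the "lazy way" described above, here is a minimal sketch of writing a transcript file in the LJSpeech layout: a `metadata.csv` of pipe-separated `id|text|normalized text` lines next to a `wavs/` folder holding `<id>.wav` files. The function name `write_ljspeech_metadata` and the `transcripts` dict are hypothetical, made up for illustration. The sketch also assumes your text is already normalized, so it repeats the same string in both text columns. Check the result against what the repo's preprocessor actually expects before relying on it.

```python
from pathlib import Path


def write_ljspeech_metadata(transcripts, dataset_dir):
    """Write metadata.csv in the LJSpeech layout (id|text|normalized text).

    transcripts: dict mapping utterance id (wav file name without ".wav")
                 to its transcription. Hypothetical input format.
    dataset_dir: root folder that will hold metadata.csv and wavs/.
    """
    dataset_dir = Path(dataset_dir)
    # LJSpeech keeps the audio in a "wavs" subfolder, one file per id.
    (dataset_dir / "wavs").mkdir(parents=True, exist_ok=True)

    metadata_path = dataset_dir / "metadata.csv"
    with metadata_path.open("w", encoding="utf-8", newline="") as f:
        for utt_id, text in sorted(transcripts.items()):
            # The file is plain pipe-delimited text with no quoting,
            # so a pipe or newline inside a transcript would corrupt it.
            if "|" in text or "\n" in text:
                raise ValueError(f"unsupported character in transcript {utt_id!r}")
            # Assumption: text is already normalized, so both columns match.
            f.write(f"{utt_id}|{text}|{text}\n")
    return metadata_path
```

Calling `write_ljspeech_metadata({"myvoice_0001": "Hello there."}, "my_dataset")` creates `my_dataset/metadata.csv` and an empty `my_dataset/wavs/` folder, where you would then copy `myvoice_0001.wav`.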