Multi-language Support?

Rayhane-mamah / Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation

MIT License


Multi-language Support? #15

Closed lucasjinreal closed 6 years ago

lucasjinreal commented 6 years ago

Is there any multi-language support planned, such as Chinese? Or are there any instructions for preparing a dataset for training?

Rayhane-mamah commented 6 years ago

Hello @jinfagang, thanks for reaching out.

I suppose you're asking for something similar to this issue.

As proposed in that issue, I suppose that updating the symbols in symbols.py (i.e., adding Chinese characters to the _characters variable) should be a good starting point.

As for the preprocessing, for the moment I only support data stored in the same folder format as LJSpeech (wavs folder + metadata.csv). If you have a dataset with the same structure, preprocessing should work without any changes.
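A quick way to check whether a dataset follows that layout is sketched below. It assumes the LJSpeech conventions: a pipe-separated metadata.csv whose first field is the utterance id and whose last field is the transcript, with audio stored as wavs/<id>.wav. The function name `check_ljspeech_layout` is made up for this example and is not part of the repo.

```python
from pathlib import Path


def check_ljspeech_layout(root):
    """Return (utterance_id, wav_path, text) tuples, or raise if the layout is off.

    Assumes LJSpeech conventions: `metadata.csv` is pipe-separated with the
    utterance id first and the transcript last, and audio is `wavs/<id>.wav`.
    """
    root = Path(root)
    metadata = root / "metadata.csv"
    wav_dir = root / "wavs"
    if not metadata.is_file():
        raise FileNotFoundError(f"missing {metadata}")
    if not wav_dir.is_dir():
        raise FileNotFoundError(f"missing {wav_dir}")

    entries = []
    with metadata.open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # tolerate blank lines
            parts = line.split("|")
            if len(parts) < 2:
                raise ValueError(f"line {line_no}: expected 'id|text', got {line!r}")
            utt_id, text = parts[0], parts[-1]
            wav_path = wav_dir / f"{utt_id}.wav"
            if not wav_path.is_file():
                raise FileNotFoundError(f"line {line_no}: missing {wav_path}")
            entries.append((utt_id, wav_path, text))
    return entries
```

Calling `check_ljspeech_layout("path/to/dataset")` before running preprocessing should surface missing audio files or malformed metadata lines early.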

If the dataset structure is different, feel free to share the dataset with us; I will add support for it if it isn't much work :)

Thank you very much for your contribution!

lucasjinreal commented 6 years ago

I will work on that after I finish my graduate paper :). I wanna build a robot that can produce speech.

Rayhane-mamah commented 6 years ago

That's an awesome project, good luck!

I would love to see how it goes.



Rayhane-mamah commented 6 years ago

UPDATE: You can refer to this issue for an example of my repo adapted to work with Mandarin Chinese.

@begeekmyfriend's Mandarin branch can be found here.

lucasjinreal commented 6 years ago

Thanks Ray. Will check it out.