carpedm20 / multi-speaker-tacotron-tensorflow

Multi-speaker Tacotron in TensorFlow.
http://carpedm20.github.io/tacotron

Multi-Speaker for Large number of speakers #44

Closed: vijaysumaravi closed this issue 6 years ago

vijaysumaravi commented 6 years ago

How do I train the model if I want to train it on a large number of speakers (~100) from the VCTK Corpus?

Currently I run:

python3 train.py --data_path=/data/home_ext/vijay/vctk_database/wav22/*

Where wav22 is a directory consisting of 100 different speakers' audio files.

I have modified my train.py as follows:

(-) config.data_paths = config.data_paths.split(",")
(+) config.data_paths = glob.glob(config.data_paths)

I get the following error:

Traceback (most recent call last):
  File "train.py", line 339, in <module>
    main()
  File "train.py", line 314, in main
    prepare_dirs(config, hparams)
  File "/home/vijay/speech_tts/multitacotron/utils/__init__.py", line 53, in prepare_dirs
    os.makedirs(path)
  File "/usr/lib/python3.5/os.py", line 241, in makedirs
    mkdir(name, mode)
OSError: [Errno 36] File name too long: 'logs/data+p305+p313+p262+p244+p236+p303+p281+p318+p326+p374+p251+p329+p293+p284+p263+p300+p314+p333+p258+p282+p339+p287+p243+p255+p294+p343+p270+p253+p299+p227+p279+p312+p254+p285+p267+p231+p316+p308+p245+p240+p302+p266+p261+p273+p268+p304+p276+p298+p347+p260+p247+p277+p228+p306+p233+p288+p250+p360+p311+p280+p335+p341+p361+p257+p246+p376+p271+p345+p283+p364+p330+p301+p295+p248+p269+p323+p256+p317+p249+p334+p286+p238+p297+p292+p336+p278+p234+p274+p226+p363+p307+p252+p259+p237+p351+p264+p225+p275+p232+p340+p230+p310+p229+p241+p239+p265+dirs.txt+p272+p362_2018-07-24_15-53-43'
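The failing name above joins every matched path with "+", so its length grows with the number of speakers, and most Linux filesystems limit a single path component to 255 bytes. The name also contains dirs.txt, which suggests the glob matched a non-directory file as well. A minimal sketch of one way around both problems follows. The helpers list_speaker_dirs and short_run_name are hypothetical and are not part of the repository; the naming scheme is only an assumption about what a shorter name could look like.

```python
import glob
import hashlib
import os


def list_speaker_dirs(pattern):
    """Expand a glob pattern and keep only directories (skips files such as dirs.txt)."""
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def short_run_name(data_paths, timestamp, max_listed=3):
    """Build a run name whose length does not grow with the number of speakers.

    Hypothetical helper: lists the speakers when there are few of them,
    otherwise uses a speaker count plus a short hash of the speaker list.
    """
    names = [os.path.basename(os.path.normpath(p)) for p in data_paths]
    if len(names) <= max_listed:
        return "+".join(names) + "_" + timestamp
    # Hash the sorted list so that different speaker sets get different names.
    joined = "+".join(sorted(names)).encode("utf-8")
    digest = hashlib.sha1(joined).hexdigest()[:8]
    return "{}speakers-{}_{}".format(len(names), digest, timestamp)
```

With 100 speaker directories this yields a name of the form 100speakers-<8 hex characters>_<timestamp>, which stays well under the limit however many speakers are added.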

vijaysumaravi commented 6 years ago

I was able to fix this by changing how the model name is generated, but now I get a different error.

Starting new training run at commit: None
Generated 8 batches of size 216 in 0.000 sec
Traceback (most recent call last):
  File "/home/vijay/speech_tts/multitacotron/datasets/datafeeder.py", line 204, in run
    self._enqueue_next_group()
  File "/home/vijay/speech_tts/multitacotron/datasets/datafeeder.py", line 229, in _enqueue_next_group
    for _ in range(int(n * self._batches_per_group // len(self.data_dirs)))]
  File "/home/vijay/speech_tts/multitacotron/datasets/datafeeder.py", line 229, in <listcomp>
    for _ in range(int(n * self._batches_per_group // len(self.data_dirs)))]
  File "/home/vijay/speech_tts/multitacotron/datasets/datafeeder.py", line 257, in _get_next_example
    data_path = data_paths[self._offset[data_dir]]
IndexError: list index out of range

vijaysumaravi commented 6 years ago

This happened because min_num_tokens in hparams was 50 and most of my files had num_tokens < 40. Lowering min_num_tokens in hparams fixes the issue.
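To illustrate why a high threshold can end in "list index out of range", here is a minimal sketch. It assumes the data feeder drops every example with fewer than min_num_tokens tokens, so a speaker whose files are all too short ends up with an empty list that is then indexed. The function names, the toy token counts, and the exact comparison (>=) are assumptions for illustration and are not taken from the repository.

```python
def filter_by_min_tokens(token_counts, min_num_tokens):
    """Keep only examples with at least min_num_tokens tokens (assumed filter rule)."""
    return {path: n for path, n in token_counts.items() if n >= min_num_tokens}


def speakers_without_examples(per_speaker_counts, min_num_tokens):
    """Return the speakers that have no usable examples left after filtering."""
    return [
        speaker
        for speaker, counts in sorted(per_speaker_counts.items())
        if not filter_by_min_tokens(counts, min_num_tokens)
    ]


# Toy token counts per file; real values would come from the preprocessed data.
counts = {
    "p225": {"p225_001": 32, "p225_002": 38},
    "p226": {"p226_001": 55, "p226_002": 21},
}

empty_at_50 = speakers_without_examples(counts, 50)  # p225 loses every file
empty_at_10 = speakers_without_examples(counts, 10)  # every speaker keeps files
```

Running a check like this before training shows which speakers would be left empty at a given threshold, which makes it easier to pick a value that keeps enough data per speaker.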

crayonyak commented 6 years ago

Hi. Thanks for your solution. I had the same problem as you (list index out of range).

So I changed min_num_tokens as you did (I set it to 10 just as a test), and it works!

However, I'm confused: what exactly does min_num_tokens mean, and what is the best value for it?

If you don't mind, I'd like to ask about these.

Thanks.

faaip commented 5 years ago

Hi @vijaysumaravi. I'm attempting multi-speaker training with VCTK as well. Can you share the alignment.json file? Thanks.