Kyubyong / tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Apache License 2.0
1.83k stars, 436 forks

How do I use my own training data? #75

Open thuoctran opened 7 years ago

thuoctran commented 7 years ago

Hi everyone,

I've tried to reduce the training dataset to just one folder, in my case Genesis. I also changed the csv file so that it only contains the texts for the Genesis folder. But when I run train.py, I get this error:

Genesis/Genesis_1-1
Traceback (most recent call last):
  File "train.py", line 128, in <module>
    main()
  File "train.py", line 108, in main
    g = Graph(); print("Training Graph loaded")
  File "train.py", line 35, in __init__
    self.x, self.y, self.z, self.num_batch = get_batch()
  File "/Users/thuoctran/Desktop/Projects/Tacotron/tacotron/data_load.py", line 125, in get_batch
    texts, sound_files = load_train_data() # byte, string
  File "/Users/thuoctran/Desktop/Projects/Tacotron/tacotron/prepro.py", line 45, in load_train_data
    texts, sound_files = create_train_data()
  File "/Users/thuoctran/Desktop/Projects/Tacotron/tacotron/prepro.py", line 33, in create_train_data
    sound_fname, text, duration = row
ValueError: need more than 1 value to unpack

I'm using Python 2.7. Can anyone help me solve it?

Thank you in advance :)

zuoxiang95 commented 7 years ago

Did you set batch_size to 1?

thuoctran commented 7 years ago

Do you mean I should set batch_size = 1 in the hyperparams.py file?

zuoxiang95 commented 7 years ago

Yes.

thuoctran commented 7 years ago

Well, I've tried using my own samples. They're test_1 to test_3, and I also replaced the contents of text.csv with my own texts. I changed batch_size to 3 in hyperparams.py, following your advice. The files in my logdir_s folder look strange: they have names such as mode_epoch_40_gs_0.index, and the checkpoint file is in plain-text format (in Kyubyong's pretrained files, the checkpoint shows up as a Unix executable). There's no error when I run train.py, but when I run eval.py, an error occurs (Floating point exception: 8).

zuoxiang95 commented 7 years ago

You should adjust "min_len" and "max_len" in hyperparams.py to make sure that your data can be fed into the training process.
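
A minimal sketch of the check this advice implies, assuming the data loader drops transcripts whose length falls outside min_len..max_len and only trains on full batches (neither assumption is verified against the repository here). The function name usable_texts and all values are hypothetical; read the real min_len, max_len and batch_size from your own hyperparams.py.

```python
def usable_texts(texts, min_len, max_len):
    """Keep only transcripts whose character count lies in [min_len, max_len]."""
    return [t for t in texts if min_len <= len(t) <= max_len]


# Illustrative values only; they are not the repository's defaults.
min_len, max_len, batch_size = 10, 100, 3
texts = ["short", "a transcript of reasonable length", "another usable transcript"]

kept = usable_texts(texts, min_len, max_len)
# Assumption: the loader only uses full batches, so leftovers are dropped.
num_batch = len(kept) // batch_size
print("%d of %d transcripts kept, %d full batch(es)" % (len(kept), len(texts), num_batch))
```

If the number of full batches comes out as zero for your data, that would be worth ruling out before looking elsewhere.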

thuoctran commented 7 years ago

The length of my texts ranges from 25 to 75, so I think I don't need to change them. But why is my global step 0 (gs = 0) according to the files saved in logdir?

yisiliu commented 7 years ago

You need to store your text in a csv file in which each line contains the filename (without .wav), the transcript, and the duration (in milliseconds). The error is raised at the line sound_fname, text, duration = row, which means a row in your csv did not split into these three fields.
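
To find the offending rows before training, something like the following could help. It is only a sketch: it assumes the file is comma-delimited and read with Python's standard csv module, it is written for Python 3 rather than the 2.7 used above, and find_bad_rows and the sample data are hypothetical, not part of the repository.

```python
import csv
import io


def find_bad_rows(csv_file, expected_fields=3):
    """Return (row_number, row) pairs whose field count is not expected_fields."""
    bad = []
    for row_number, row in enumerate(csv.reader(csv_file), start=1):
        if len(row) != expected_fields:
            bad.append((row_number, row))
    return bad


# Illustrative data only: the second row uses spaces instead of commas,
# so it parses as a single field and would fail the three-way unpack.
sample = io.StringIO(
    "test_1,hello world,1500\n"
    "test_2 good morning 2100\n"
    'test_3,"see you, later",1800\n'
)
bad_rows = find_bad_rows(sample)
for row_number, row in bad_rows:
    print("row %d has %d field(s): %r" % (row_number, len(row), row))
```

To check a real file, pass an open file object instead of the sample, for example open("text.csv", newline=""). Note that a transcript containing a comma needs to be quoted, as in the third sample row, or it will split into extra fields.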

RobinsonSir commented 6 years ago

Has anyone used Chinese speech data? How do I use it? http://www.openslr.org/18/