what should the development set's content be in a speech dataset and g2p?

albluc24 commented 5 years ago

This is my first GitHub issue, so please forgive any mistakes. My problem is that I don't understand what the development folder of a training set should contain. Here is what I've done so far: I downloaded the M-AILABS Italian training set and split the CSV into txt files, one for each wav file, and that is my training set. My question is: what should I put in the other folder? The readme says there should be no more than 5 files in there, but when I start training with an empty dev folder I get an error about a lab file that was not found. I have the same doubt about g2p, but since I'm not going to use that feature it is secondary for me, as is adding custom content to the lab files, which I have not done.
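For readers following along, the CSV-splitting step described above could look roughly like the sketch below. It assumes the metadata file is pipe-separated with the clip name in the first column and the transcript in the second (the usual M-AILABS layout, but worth verifying on your copy); the function name `split_metadata` and its parameters are made up for this illustration and are not part of TTS-Cube.

```python
import csv
from pathlib import Path


def split_metadata(csv_path, out_dir, text_column=1):
    """Write one <clip name>.txt per metadata row and return how many were written.

    Assumption: rows look like  clip_name|original text|normalized text
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    with open(csv_path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quotation marks inside transcripts untouched.
        reader = csv.reader(handle, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) <= text_column or not row[0].strip():
                continue  # skip blank or malformed rows
            name = row[0].strip()
            text = row[text_column].strip()
            (out_dir / f"{name}.txt").write_text(text + "\n", encoding="utf-8")
            written += 1
    return written
```

Calling `split_metadata("metadata.csv", "/home/riccardo/")` would then place the txt files next to the wav files, provided the clip names in the CSV match the wav file names.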

tiberiu44 commented 5 years ago

Hi @albluc24 ,

I don't think you need to worry about the g2p model for Italian. Next, you should move at least a couple of files (2-5) from the training folder to the development folder. And, in case you missed it in the documentation, you must not add files to the project's data folder directly. You need to run the import step, which automatically generates resampled wav files, .lab files and a whole bunch of other files (including rendered spectrograms in .png format).
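Moving a few files by hand works fine; for anyone who prefers to script it, here is a minimal sketch. It assumes each clip is a `.wav` file with a `.txt` transcript of the same base name in the same folder. The helper `move_pairs_to_dev` is hypothetical and not part of TTS-Cube.

```python
import shutil
from pathlib import Path


def move_pairs_to_dev(train_dir, dev_dir, count=3):
    """Move up to `count` wav/txt pairs from train_dir to dev_dir.

    Returns the base names of the clips that were moved.
    """
    train_dir, dev_dir = Path(train_dir), Path(dev_dir)
    dev_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for wav in sorted(train_dir.glob("*.wav")):
        if len(moved) >= count:
            break
        txt = wav.with_suffix(".txt")
        if not txt.exists():
            continue  # only move complete pairs
        shutil.move(str(wav), str(dev_dir / wav.name))
        shutil.move(str(txt), str(dev_dir / txt.name))
        moved.append(wav.stem)
    return moved
```

With the folders from this thread, that would be something like `move_pairs_to_dev("/home/riccardo/", "/home/riccardodev/", count=3)`, run before the import step.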

Also, before starting training, you should check a couple of files to see if they have the correct content. Also, you can use the available vocoders (in order to avoid training your own, which takes months on low-end hardware), but you need to specify --target-sample-rate=16000 for each step.
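One check that does not depend on looking at the spectrogram images is to read the header of a resampled wav file and confirm its sample rate. The sketch below uses only Python's standard `wave` module, which reads uncompressed PCM wav files; it assumes the import step writes such files, which you should confirm on your own output. The helper name `wav_info` is made up for this example.

```python
import wave


def wav_info(path):
    """Return sample rate, channel count and duration (seconds) of a PCM wav file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "seconds": frames / rate,
        }
```

If the import was run with --target-sample-rate=16000, `wav_info(...)["sample_rate"]` on a processed file should come back as 16000, and the duration should roughly match the original recording.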

Let me know if you need anything else. Also, feel free to post a fragment of your lab file if you want me to check it for correctness.

Best, Tibi

albluc24 commented 5 years ago

Hi, and thanks for the reply! As I said, I'm not using any lab files in the dataset, which is why I don't understand why the program asks me for one. About the development folder, I assume I should put 2 pairs of txt and wav files there, so that it stays under the recommended maximum. From what I grasped, the development set's purpose is to provide material to synthesize after a number of steps, is that right? If by checking the files you meant the pngs, I'm afraid I can't do that because I am completely blind. I could ask someone to describe the image for me, but I have no idea how an audio spectrum is composed and so couldn't tell that person what a correct output should look like. Thanks again for your reply, have a good day!

tiberiu44 commented 5 years ago

Can you paste the command you used for the import step?

albluc24 commented 5 years ago

Sure. The following is the input (my command) and the output that the application gives. I'll try to copy it as accurately as possible, since I have to use a rather tricky method, intercepting the text from the screen reader. Anyway, here goes. Note: the /home/riccardodev folder, the development one, is empty. I'll try to put something in there later; I'm preparing that right now.

input:
python3 cube/trainer.py --phase=1 --train-folder=/home/riccardo/ --dev-folder=/home/riccardodev/

output:
Scanning training files... found 18134 valid training files
Scanning development files... found 0 valid development files
processing file 1/18134
Traceback (most recent call last):
  File "cube/trainer.py", line 399, in <module>
    phase_1_prepare_corpus(params)
  File "cube/trainer.py", line 253, in phase_1_prepare_corpus
    join('data/processed/train', tgt_lab_name), speaker_name=params.speaker, g2p=g2p)
  File "cube/trainer.py", line 107, in create_lab_file
    fout = open(lab_file, 'w')
FileNotFoundError: [Errno 2] No such file or directory: 'data/processed/train/mattiapascal_18_pirandello_f000070.lab'
I already tried removing the directories that the readme says to delete in order to remove a model, to see if that was the problem, but the result is the same.

tiberiu44 commented 5 years ago

I think you have some folders missing. In the TTS-Cube folder do:

mkdir -p data/processed/train
mkdir -p data/processed/dev
mkdir -p data/output
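The error in the traceback is consistent with this: Python's `open(path, 'w')` creates the file but not its missing parent directories, so writing the .lab file fails if `data/processed/train` does not exist. A small Python equivalent of the commands above is sketched here; the helper `ensure_dirs` is hypothetical and not part of TTS-Cube, and the directory list is taken from this comment only.

```python
import os

# Directories named in this thread; other versions of the project may need more.
REQUIRED_DIRS = ("data/processed/train", "data/processed/dev", "data/output")


def ensure_dirs(root="."):
    """Create any missing required directories under root and return those created."""
    created = []
    for rel in REQUIRED_DIRS:
        path = os.path.join(root, rel)
        if not os.path.isdir(path):
            os.makedirs(path, exist_ok=True)  # same effect as mkdir -p
            created.append(rel)
    return created
```

Running `ensure_dirs()` from inside the TTS-Cube folder before the import step has the same effect as the three `mkdir -p` commands, and it is safe to run more than once.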
