enhuiz / vall-e

An unofficial PyTorch implementation of the audio LM VALL-E
MIT License

ValueError: No valid path is found for training. #93

Open enorrmann opened 1 year ago

enorrmann commented 1 year ago

I followed the instructions for the "test" folder, but I get this error when I try to run

python -m vall_e.train yaml=config/test/ar.yml

Running on Linux. My files are:

.
├── config
│   ├── LibriTTS
│   │   ├── ar-quarter.yml
│   │   ├── ar.yml
│   │   ├── nar-quarter.yml
│   │   └── nar.yml
│   └── test
│       ├── ar.yml
│       └── nar.yml
├── data
│   └── test
│       ├── test.normalized.txt
│       ├── test.phn.txt
│       ├── test.qnt.pt
│       └── test.wav

my ar.yml is

data_dirs: [data/test]

model: ar-quarter
batch_size: 1
eval_batch_size: 1
save_ckpt_every: 500
eval_every: 500
max_iter: 1000

MSDNAndi commented 1 year ago

I ran into the same problem. I needed to have two different pairs of files (.wav and .txt) to make it work.

enorrmann commented 1 year ago

> I ran into the same problem. I needed to have two different pairs of files (.wav and .txt) to make it work.

Thank you, that did the trick.

constan1 commented 1 year ago

Hello. What do you mean by two different pairs of files? I have the .qnt.pt, phon.txt, normalized.txt, and .wav files in my data/librosa directory.

The config files are in config/librosa, including ar.yml.

> I ran into the same problem. I needed to have two different pairs of files (.wav and .txt) to make it work.

Annoytanor commented 1 year ago

Your phoneme files need to contain between 10 and 50 phonemes each. Try using shorter audio clips; even 10-second clips can be too long.
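To check this quickly, here is a minimal sketch that counts the phonemes in each file and lists the ones outside the 10 to 50 range mentioned above. It assumes each `*.phn.txt` file holds whitespace-separated phoneme tokens; that format is an assumption, and the helper names `count_phonemes` and `find_out_of_range` are made up for this example and are not part of vall-e. The trainer itself may count slightly differently (for example, if it adds start or end markers).

```python
from pathlib import Path

MIN_PHONES = 10  # lower bound quoted in the comment above
MAX_PHONES = 50  # upper bound quoted in the comment above


def count_phonemes(path):
    """Count whitespace-separated tokens in one phoneme file (assumed format)."""
    return len(Path(path).read_text(encoding="utf-8").split())


def find_out_of_range(data_dir, lo=MIN_PHONES, hi=MAX_PHONES):
    """Return {filename: count} for every *.phn.txt whose count is outside [lo, hi]."""
    bad = {}
    for phn in sorted(Path(data_dir).glob("*.phn.txt")):
        n = count_phonemes(phn)
        if not lo <= n <= hi:
            bad[phn.name] = n
    return bad


if __name__ == "__main__":
    # "data/test" is the directory from the original post; adjust as needed.
    for name, n in find_out_of_range("data/test").items():
        print(f"{name}: {n} phonemes (outside {MIN_PHONES}-{MAX_PHONES})")
```

If every file is reported, the clips are probably too long for the default limits and either need to be split or the limits need to be raised in the config.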

constan1 commented 1 year ago

> Your phoneme files need to contain between 10 and 50 phonemes each. Try using shorter audio clips; even 10-second clips can be too long.

My training samples are usually longer. I increased max_phon to 5000. Would this reduce performance?

kaaancan commented 1 year ago

I figured out the issue after some debugging with the help of ChatGPT. It's actually super stupid and simple 🤦

You need to provide two audio files and two normalized.txt files.
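A rough way to verify this before training is to count how many complete samples a data directory holds. The sketch below assumes a sample is complete when all four file types shown in the original post exist for the same base name (`.wav`, `.normalized.txt`, `.phn.txt`, `.qnt.pt`). The helper `complete_samples` is hypothetical and not part of vall-e.

```python
from collections import defaultdict
from pathlib import Path

# File suffixes seen in the original post's data/test directory.
REQUIRED = (".wav", ".normalized.txt", ".phn.txt", ".qnt.pt")


def complete_samples(data_dir):
    """Return the sorted base names that have every suffix in REQUIRED."""
    found = defaultdict(set)
    for f in Path(data_dir).iterdir():
        for suffix in REQUIRED:
            if f.name.endswith(suffix):
                # Strip the suffix to get the shared base name, e.g. "test".
                found[f.name[: -len(suffix)]].add(suffix)
    return sorted(stem for stem, seen in found.items() if seen == set(REQUIRED))


if __name__ == "__main__":
    # "data/test" is the directory from the original post; adjust as needed.
    samples = complete_samples("data/test")
    print(f"{len(samples)} complete sample(s): {samples}")
    if len(samples) < 2:
        print("Fewer than two complete samples; this thread suggests training will fail.")
```

Note that this only checks that the files exist. A sample can still be rejected for other reasons, such as the phoneme-count limits discussed earlier in the thread.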

RinLovesYou commented 1 year ago

I have provided two audio files and two normalized.txt files, but I still get "No valid path is found for training".