pgm-n117 closed this issue 2 years ago
Yup, this is a problem with your dataset; check that there is nothing wrong with it!
Hi, I have the same error; the dataset was downloaded here. Have you solved it? Could you please suggest how to build the dataset directory, or say whether the code needs a manual list of the folders containing the .wav files?
Thanks in advance.
This error means that the lmdb database is empty, i.e. no audio has been preprocessed and loaded into it! You can try the resample utility provided with RAVE (you might want to re-run pip install -r requirements.txt first, though).
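Before re-running preprocessing, it can help to confirm that the folder passed to --wav actually contains audio files. The following is a minimal standard-library sketch; the helper name find_wav_files and the recursive, case-insensitive search are assumptions made for illustration and are not part of RAVE:

```python
from pathlib import Path


def find_wav_files(root):
    """Return every .wav file under `root` (searched recursively), sorted by path."""
    root = Path(root)
    # Compare the extension case-insensitively so FILE.WAV is also found.
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".wav"
    )
```

Calling len(find_wav_files("./dataset/1")) should return a non-zero count; if it returns 0, the path is probably wrong or the files use another extension.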
I got past the original issue with the dataset, but now I'm stuck on this error (I updated the repo to the latest commit):
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
------------------------------------------------------
0 | pqmf | CachedPQMF | 16.7 K
1 | loudness | Loudness | 0
2 | encoder | Encoder | 4.8 M
3 | decoder | Generator | 12.8 M
4 | discriminator | StackDiscriminators | 16.9 M
------------------------------------------------------
34.6 M Trainable params
0 Non-trainable params
34.6 M Total params
138.202 Total estimated model params size (MB)
Validation sanity check: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 4.03it/s]
Traceback (most recent call last):
File "/home/pablo/RAVE/train_rave.py", line 154, in <module>
trainer.fit(model, train, val)
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
self._run(model)
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 922, in _run
self._dispatch()
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 990, in _dispatch
self.accelerator.start_training(self)
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
self.training_type_plugin.start_training(trainer)
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in start_training
self._results = trainer.run_stage()
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1000, in run_stage
return self._run_train()
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1035, in _run_train
self._run_sanity_check(self.lightning_module)
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1122, in _run_sanity_check
self._evaluation_loop.run()
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 118, in run
output = self.on_run_end()
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 133, in on_run_end
self.evaluation_epoch_end(outputs)
File "/home/pablo/.local/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 243, in evaluation_epoch_end
model.validation_epoch_end(outputs)
File "/home/pablo/RAVE/rave/model.py", line 661, in validation_epoch_end
pca = PCA(z.shape[-1]).fit(z.cpu().numpy())
File "/home/pablo/.local/lib/python3.9/site-packages/sklearn/decomposition/_pca.py", line 382, in fit
self._fit(X)
File "/home/pablo/.local/lib/python3.9/site-packages/sklearn/decomposition/_pca.py", line 457, in _fit
return self._fit_full(X, n_components)
File "/home/pablo/.local/lib/python3.9/site-packages/sklearn/decomposition/_pca.py", line 475, in _fit_full
raise ValueError(
ValueError: n_components=128 must be between 0 and min(n_samples, n_features)=32 with svd_solver='full'
Any idea on this?
How large is your dataset?
I used 28 wav files, 25 MB in total. It's not a large dataset, but I used it for testing.
I'm not sure how you solved your previous problem, but your dataset is so small that I think the validation set is still empty!
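The ValueError in the traceback comes from a constraint that the message itself states: with svd_solver='full', n_components must not exceed min(n_samples, n_features). The failing call fits PCA(z.shape[-1]) on z, so 128 components were requested while only 32 rows reached the PCA. The sketch below reproduces that check with the standard library only; the function names are hypothetical, and the (32, 128) shape is inferred from the error message under the assumption that z is two-dimensional. It is not code from RAVE or scikit-learn:

```python
def max_pca_components(n_samples, n_features):
    """Largest n_components a full-SVD PCA can fit on an (n_samples, n_features) matrix."""
    return min(n_samples, n_features)


def check_pca_request(n_components, n_samples, n_features):
    """Raise ValueError when more components are requested than the data supports."""
    limit = max_pca_components(n_samples, n_features)
    if not 0 <= n_components <= limit:
        raise ValueError(
            f"n_components={n_components} must be between 0 and "
            f"min(n_samples, n_features)={limit}"
        )
    return n_components


# Inferred shape (an assumption): 32 latent vectors of dimension 128.
try:
    check_pca_request(128, n_samples=32, n_features=128)
except ValueError as err:
    print(err)
```

Under this reading, the validation pass needs at least 128 latent vectors before the PCA can be fitted, which is why a larger dataset makes the error go away.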
I'll try a larger one then. Is there a minimum size? I only have an RTX 2060 on hand for training.
No matter what GPU you have, a larger dataset will always produce better results. It really isn't the number of epochs that counts in this case, but rather the number of training steps.
I know I'll get better results with a larger dataset, but at the moment I don't have access to better hardware and wanted to try training on mine; of course, with a lower-end GPU and less memory, it will take longer. Anyway, I'll keep trying with a larger one and hope it works. Thank you!
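To make the epochs-versus-steps point concrete: one epoch over a small dataset contributes only a few optimizer steps. The following is a back-of-the-envelope sketch; the chunk count, batch size, and step target are hypothetical, and it assumes one step per batch with the last partial batch kept:

```python
import math


def steps_per_epoch(n_examples, batch_size):
    """Number of optimizer steps in one pass over the data (last partial batch kept)."""
    return math.ceil(n_examples / batch_size)


def epochs_needed(target_steps, n_examples, batch_size):
    """Epochs required to reach at least `target_steps` training steps."""
    return math.ceil(target_steps / steps_per_epoch(n_examples, batch_size))


# Hypothetical numbers: 500 training chunks, batch size 8.
print(steps_per_epoch(500, 8))           # 63
print(epochs_needed(1_000_000, 500, 8))  # 15874
```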
@caillonantoine thanks for the help and sorry about that lingering issue. I figured it out :). I wasn't using a big enough sample size for the model (I think this is a similar issue here). Anyway, I am currently training two models with different capacities. The smaller one (the default) is at around 850K steps, and the larger one just about 250K steps. I heard stopping around one million is a good heuristic, although I am tempted to let it run until the reconstructions have minimal distortion. Can you share any insights?
Thanks for your time.
What sample size were you using? The default is 2^16, which is approximately 1.5 s at 44.1 kHz! You should look at the distance loss in tensorboard; when it starts to plateau, you should either halve the learning rate or switch to phase 2! :)
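The duration figure follows directly from dividing the chunk length by the sampling rate. A short check of the arithmetic (the function name is only for illustration):

```python
def chunk_duration_seconds(n_samples, sample_rate_hz):
    """Duration in seconds of an audio chunk of `n_samples` samples."""
    return n_samples / sample_rate_hz


n = 2 ** 16
print(n)                                            # 65536
print(round(chunk_duration_seconds(n, 44_100), 3))  # 1.486
```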
I was using the default. I did not see a learning rate option in the train_rave.py script; where do I change this? Also, what is phase 2 exactly? Is that exporting RAVE and training the prior?
You're gonna have to do it manually inside rave/model.py. For any question on the model itself, I suggest that you read the article https://arxiv.org/abs/2111.05011
Thanks for all the help! Will give that a try :). P.S. the paper is super helpful.
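Since the change is manual, one way to decide when the distance loss has plateaued is to compare the average of the most recent logged values against the average of the values just before them. This is a generic sketch, not code from RAVE: the function names, the window size, and the 1% tolerance are all hypothetical choices, and the loss values would have to be read off tensorboard or the logs by hand.

```python
def has_plateaued(losses, window=3, rel_tol=0.01):
    """True when the mean of the last `window` values improved by less than
    `rel_tol` (relative) over the mean of the `window` values before them.
    A loss that got worse also counts as a plateau."""
    if len(losses) < 2 * window:
        return False  # not enough history to compare
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous == 0:
        return True  # nothing left to improve on
    return (previous - recent) / abs(previous) < rel_tol


def next_learning_rate(lr, losses, window=3, rel_tol=0.01):
    """Halve `lr` when the loss has plateaued, otherwise keep it unchanged."""
    return lr / 2 if has_plateaued(losses, window, rel_tol) else lr
```

For example, next_learning_rate(1e-4, history) keeps 1e-4 while the values in history are still dropping and returns 5e-5 once they flatten out; the new value would then be written into rave/model.py by hand.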
Hi, I was trying to launch train_rave.py with a dataset for testing. I am using 310 .wav files with cli_helper.py, which returned an error like this:
$ python train_rave.py --name training1 --wav ./dataset/1 --preprocessed /tmp/rave/training1/rave
...
5_38_26.wav: 99%|███████████████████████████▊| 308/310 [00:46<00:00, 6.69it/s]/home/pablo/.local/lib/python3.9/site-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
warnings.warn("PySoundFile failed. Trying audioread instead.")
2_38_13.wav: 100%|███████████████████████████▉| 309/310 [00:46<00:00, 6.60it/s]/home/pablo/.local/lib/python3.9/site-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
warnings.warn("PySoundFile failed. Trying audioread instead.")
2_38_13.wav: 100%|████████████████████████████| 310/310 [00:46<00:00, 6.61it/s]
Traceback (most recent call last):
File "/home/pablo/RAVE/train_rave.py", line 77, in <module>
dataset = SimpleDataset(
File "/home/pablo/.local/lib/python3.9/site-packages/udls/simple_dataset.py", line 83, in __init__
raise Exception("No data found !")
Exception: No data found !
Have you seen this error before? I am trying to use it with a GPU, but launching only with CPU throws the same error. Maybe there is a problem with the dataset?
Thanks!