[Closed] wwerkk closed this issue 1 year ago
My bad, if I went through the README properly I would've read this part :) https://github.com/crlandsc/tiny-audio-diffusion#3-define-environment-variables
Which brings me to the question: is there a way to train/fine-tune a model without using wandb?
I'm still having some issues with the checkpoint not being saved. Wandb claims it has been saved and gives a path, though the directories included in that path do not actually exist.
Hi @wwerkk - Thanks for bringing this to my attention. I added functionality to train without using wandb. You will have to pull the repo to update the module file to get this functionality. You can also use the new drum_diffusion_no_wandb.yaml to train without wandb (it is just drum_diffusion.yaml with the loggers and audio_samples_logger sections deleted).
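Conceptually, the delta between the two configs looks something like the sketch below. The two removed keys are the ones named above; everything else here is a hypothetical placeholder, so check the actual files in the repo for the real structure.

```yaml
# drum_diffusion.yaml (sketch) - sections that drum_diffusion_no_wandb.yaml removes.
# All values are placeholders, not the repo's real settings.
loggers:                      # removed in the no_wandb variant
  wandb: { ... }              # wandb logger settings
audio_samples_logger: { ... } # also removed - generated audio previews for wandb
```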
On a related note, wandb doesn't actually save the checkpoints; they are saved via PyTorch Lightning under the logs/ckpts folder on your local computer. Wandb is implemented mainly to keep track of metrics and to generate audio outputs so you can hear how your model is performing.
Let me know if this fixes the problem!
My problem turned out to be related to the length of training - the checkpoint file does indeed get saved to the proper path after validation :)
Just tested the drum_diffusion_no_wandb.yaml config, and it seems that it still requires a valid wandb username and API key, and the wandb run gets created as normal. Do I understand correctly that the difference between the configs is that with no_wandb the training history does not get synchronized with the cloud?
Glad you figured out the issue! You can change how often the checkpoints are logged with line 16 in the exp/drum_diffusion.yaml or exp/drum_diffusion_no_wandb.yaml files:

val_log_every_n_steps: 1000 # Logging interval (validation and audio generation every n steps)
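To make the step-based gating concrete, here is a minimal Python sketch (not the repo's actual code) of how an every-n-steps interval behaves, and why a short fine-tuning run may never reach its first checkpoint:

```python
def should_validate(step: int, val_log_every_n_steps: int) -> bool:
    """Return True on the steps where validation (and, in this setup,
    checkpointing and audio generation) would fire."""
    return step > 0 and step % val_log_every_n_steps == 0

# With val_log_every_n_steps: 1000, the first checkpoint appears at step 1000,
# so a fine-tuning run shorter than 1000 steps never saves one.
fired = [s for s in range(1, 3001) if should_validate(s, 1000)]
print(fired)  # [1000, 2000, 3000]
```

Lowering the interval in the config trades more frequent checkpoints for more time spent in validation.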
Yes, that is correct about the configs. The new exp/drum_diffusion_no_wandb.yaml config basically just takes out anything related to wandb logging and keeps everything local.

I am puzzled why it is still requiring wandb credentials. Did you pull the repo to update the main/diffusion_module.py file before you trained again? I had to make a small tweak in that file to fix it for the exp/drum_diffusion_no_wandb.yaml config.
Seems like another newbie mistake: the command I was running repeatedly had the exp argument doubled, with the second one still pointing to the drum_diffusion.yaml config, so the first one was overwritten.
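The overwriting described above is the usual last-one-wins behavior of key=value command-line overrides. A toy illustration (not Hydra itself, just the general pattern):

```python
def parse_overrides(args):
    """Toy last-one-wins parsing of key=value CLI overrides."""
    overrides = {}
    for arg in args:
        key, _, value = arg.partition("=")
        overrides[key] = value  # a repeated key silently replaces the earlier value
    return overrides

# Doubling the exp argument means only the last one takes effect:
cmd = ["exp=drum_diffusion_no_wandb.yaml", "exp=drum_diffusion.yaml"]
print(parse_overrides(cmd))  # {'exp': 'drum_diffusion.yaml'} - the no_wandb config is lost
```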
> You can change how often the checkpoints are logged with line 16 in the exp/drum_diffusion.yaml or exp/drum_diffusion_no_wandb.yaml files.

That's what I was thinking - that saving occurs on validation. When fine-tuning on small amounts of data it would take a while for the validation to take place, so that makes sense!
Speaking of dataset sizes - approximately how large were the datasets the kicks/snares models were trained on? Just wondering how much data would be the lower bound for training/tuning.
Thank you so much for the responses, I'm very much looking forward to some finely-tuned generation soon :)
I actually realized that I had the exp argument doubled in the instructions in the README when you first opened this issue, and I fixed it then - so that one is on me!
The kicks/snares/hi-hats datasets were pretty small, as I was just working with some open-source samples that I had gathered (150-200 samples). So there is definitely a lot of room for fine-tuning! The Percussion model, however, was trained on over 1,000 samples, which is a more appropriate data size. These were just the first models for proof-of-concept and hopefully I can train some better, more diverse models in the future when I collect more data.
Please share any models that you train, I'm incredibly interested to hear what you come up with!
If everything is up and running now, I will close this issue. Have fun training!
When running the training script:
I get the following error:
From what I understand, the problem might be about setting the logs directory, but I'm not sure how to go about it, since I do not have much experience with Hydra.
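If the error really is about the output/logs directory, Hydra's run directory can normally be pointed somewhere writable either on the command line (hydra.run.dir=...) or in the config. A sketch using standard Hydra keys - not verified against this repo's configs:

```yaml
# Standard Hydra pattern for a per-run output folder; adjust to match the
# repo's expected logs/ layout.
hydra:
  run:
    dir: logs/${now:%Y-%m-%d_%H-%M-%S}
```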