haoheliu / AudioLDM-training-finetuning

AudioLDM training, finetuning, evaluation and inference.
https://audioldm.github.io/audioldm2/
MIT License
216 stars 42 forks

Requirements #48

Open mateusztobiasz opened 1 week ago

mateusztobiasz commented 1 week ago

Hello!

I am a beginner and wanted to try your finetuning script, so I decided to run it on Google Colab. Just to test it out, I prepared train, test and eval datasets which contained only one audio clip with a caption. When I ran the script, it consumed about 15 GB of GPU VRAM and the process was killed. Is this normal behaviour? If so, do you know how much VRAM I need to finetune this model on a more reasonable dataset (about 1000 rows)?

Thank you in advance for your reply!

mateusztobiasz commented 1 week ago

I would also like to add that I haven't changed any hyperparameters and used the audioldm_train/config/2023_08_23_reproduce_audioldm/audioldm_original_medium.yaml config. Also, here is a screenshot of the moment when the process was killed:

[image: screenshot of the moment the process was killed]
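
In case exact numbers are more useful than a screenshot, here is a minimal sketch of how the GPU memory reading could be captured from a Colab cell while the script runs. It assumes `nvidia-smi` is available on the machine; the helper names `parse_gpu_memory` and `read_gpu_memory` are made up for illustration and are not part of this repository.

```python
import subprocess

# nvidia-smi query that prints one "used, total" line per GPU, in MiB,
# without a header row or unit suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn 'used, total' lines into a list of (used_mib, total_mib) tuples."""
    readings = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def read_gpu_memory():
    """Run nvidia-smi (assumed to be installed) and parse its output."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling `read_gpu_memory()` before starting training and again just before the crash would show how close the run gets to the card's total memory. It only reports GPU memory, so it would not show whether system RAM was the limit instead.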