facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

What compute resources are required to fine-tune MusicGen? #215

Closed sbrother closed 1 year ago

sbrother commented 1 year ago

I just attempted to fine-tune a MusicGen model with a custom dataset using

dora run solver=musicgen/musicgen_base_32khz model/lm/model_scale=small continue_from=//pretrained/facebook/musicgen-small conditioner=text2music dset=audio/my-dataset

Unfortunately it ran into torch.cuda.OutOfMemoryError: CUDA out of memory on a single H100 instance. As I attempt to procure a larger cluster, it would be really helpful to know how much memory and compute time are typical for training the small, medium, and large MusicGen models. Thanks!

sbrother commented 1 year ago

This was because the default solver config has a batch size configured for 32 GPUs :) I changed that and am running into other issues, but I'll close this.
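For anyone hitting the same error: the fix above can be sketched as a Hydra-style override appended to the original command. The key name `dataset.batch_size` and the value `4` here are assumptions based on typical Audiocraft solver configs (the stock value targets 32 GPUs), so verify both against the solver YAML for your setup before relying on them:

```shell
# Same command as in the original report, with the batch size overridden
# to fit on a single GPU. In Audiocraft solver configs, dataset.batch_size
# is the total batch size across all GPUs (assumed key; check
# config/solver/musicgen/musicgen_base_32khz.yaml in your checkout).
dora run solver=musicgen/musicgen_base_32khz \
  model/lm/model_scale=small \
  continue_from=//pretrained/facebook/musicgen-small \
  conditioner=text2music \
  dset=audio/my-dataset \
  dataset.batch_size=4
```

If memory is still tight after lowering the batch size, shrinking `dataset.segment_duration` (another standard knob in these configs) or using gradient checkpointing, where the solver supports it, are the usual next steps.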