sashasubbbb opened 1 year ago
I have the same problem, and there's no example of how to use the CPU to generate audio.
I run it on 8GB VRAM with the following changes to `generation.py`: add `import gc` at the top (for example, right under `import requests`), and modify the `load_model` function to unload unnecessary models from memory when switching between `text`, `coarse`, and `fine`:
```python
def load_model(ckpt_path=None, use_gpu=True, force_reload=False, model_type="text"):
    _load_model_f = funcy.partial(_load_model, model_type=model_type)
    if model_type not in ("text", "coarse", "fine"):
        raise NotImplementedError()
    # these are the changes
    global models
    models.clear()
    gc.collect()
    torch.cuda.empty_cache()
    # /these are the changes
    if torch.cuda.device_count() == 0 or not use_gpu:
        device = "cpu"
    else:
        device = "cuda"
    model_key = str(device) + f"__{model_type}"
    if model_key not in models or force_reload:
        if ckpt_path is None:
            ckpt_path = _get_ckpt_path(model_type)
        clean_models(model_key=model_key)
        model = _load_model_f(ckpt_path, device)
        models[model_key] = model
    return models[model_key]
```
I'm sure there are better ways to optimise, but that was good enough to run the notebook without hitting `torch.cuda.OutOfMemoryError` errors.
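For anyone wondering why `models.clear()` plus `gc.collect()` helps: once the global cache drops its reference, the cached model object becomes unreachable and its memory (including any GPU tensors it holds) can be reclaimed before the next checkpoint loads. A minimal sketch of that pattern, using a dummy object and a `weakref` in place of a real Bark checkpoint:

```python
import gc
import weakref

class DummyModel:
    """Stand-in for a large model checkpoint held in the global cache."""

models = {"cuda__text": DummyModel()}
ref = weakref.ref(models["cuda__text"])  # watch the cached model's lifetime

models.clear()  # drop the cache's reference, as the patch does
gc.collect()    # force collection so the memory is reclaimed right away

print(ref() is None)  # → True: the old model is gone before the next load
```

In the real patch, `torch.cuda.empty_cache()` then returns the freed CUDA memory from PyTorch's caching allocator to the driver.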
Just great! This is exactly what people with 8GB VRAM GPUs need. THANK YOU, it works great!
For information: with 12GB VRAM, it seems no changes need to be made.
@94awuna things are moving fast, and I believe this has already been patched in the main repo. I also saw merges from the original repo into this one today, so it may already include changes to that effect. The code above is nothing more than an attempt; I've provided the entire block to replace in the file as a proof of concept only. I don't know which version you currently use, or whether my crude changes still work after the latest updates.
If you still want to go with it: open `generation.py`, add `import gc` at the top, copy the block of code I included above, find `def load_model(ckpt_path=None, use_gpu=True, force_reload=False, model_type="text"):` in the file, select down to `return models[model_key]`, and paste the block over your selection. These are the only changes I made.
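To see the intended effect of the patch in isolation, here is a toy sketch (a hypothetical `load_model` with placeholder objects, not Bark's real code) showing that clearing the global cache on every call keeps at most one model resident:

```python
# Toy version of the patched load_model: names and objects are placeholders,
# not Bark's actual implementation.
models = {}

def load_model(model_type="text", device="cpu"):
    models.clear()              # unload whatever was cached before
    key = f"{device}__{model_type}"
    if key not in models:
        models[key] = object()  # placeholder for a real checkpoint load
    return models[key]

for stage in ("text", "coarse", "fine"):
    load_model(model_type=stage)
    print(len(models))          # → 1 each time: one model resident at most
```

The trade-off is reload time: every stage switch re-reads a checkpoint from disk instead of reusing a cached copy, which is why the unpatched code keeps all three models in memory when VRAM allows.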
@froger-me I've used your code to run on my 8GB VRAM GPU, and it appears to have worked nicely. Thank you! Does the code have any impact on the final output 'quality'? I apologize, but I don't fully understand what's being done here; I'm just testing the technology.
@ricardojuerge735 it won't impact the resulting audio (though don't hold your breath if you want accurate voice cloning; my attempts, and many other people's, gave mixed-to-bad results, 8GB hack or not). It's just memory management, which could surely be done better than what I did; this is the first time I've edited Python code, so I'm not very familiar with how to improve it.
How much VRAM is needed to generate audio? I get CUDA OOM on the final step of generation. Is there any option to run this on 8GB VRAM, like here? https://github.com/suno-ai/bark/issues/29