Closed by Desuka-art 1 year ago
How much VRAM do you have?
24 GB of VRAM. I have a 3090. By the way, I have another issue. My CPU is taking the brunt of the training for So-vits, and I don't know how to optimize the speed.
My dataset is 10 minutes long. 64 files.
No, I have not. What number should I change the batch_size to?
That's odd; you shouldn't be running out of VRAM with a dataset that size, but you could try lowering the batch_size to 8. What OS are you using? Can you post logs/errors from before the OOM?
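For reference, lowering the batch size could look something like the sketch below. It assumes the training config is a JSON file with a `batch_size` key under `train`, which may not match your fork. The path `configs/config.json` and the helper name `lower_batch_size` are only illustrative, so adjust them to your setup.

```python
import json
from pathlib import Path


def lower_batch_size(config: dict, new_size: int = 8) -> dict:
    """Return a copy of config with train.batch_size capped at new_size."""
    # Round-trip through JSON as a cheap deep copy (config is plain JSON data).
    updated = json.loads(json.dumps(config))
    train = updated.setdefault("train", {})
    current = train.get("batch_size")
    # Only lower the value; leave it alone if it is already small enough.
    if current is None or current > new_size:
        train["batch_size"] = new_size
    return updated


if __name__ == "__main__":
    path = Path("configs/config.json")  # hypothetical location, adjust as needed
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
        path.write_text(
            json.dumps(lower_batch_size(config), indent=2), encoding="utf-8"
        )
```

A smaller batch means fewer samples held on the GPU at once, so it is usually the first thing to try against an OOM. It does not explain why a 24 GB card would run out in the first place.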
Also -- does your system have an integrated GPU?
I'm using Windows 10. My CPU is an i7-8700. I believe I have an integrated GPU, but I'm not sure: Intel UHD Graphics 630. What do you think? I have tried redirecting the directory, to no avail. I have no idea what I'm doing wrong.
OK, I thought you might be using the CPU-only build of PyTorch, but then it occurred to me that you probably wouldn't get a CUDA OOM if you were. I'm not sure what's going on here either.
I'm checking it using the Task Manager. It says specifically I'm using CUDA.
The maximum length is 10 seconds.
Ok.
nvidia-smi shows I have CUDA installed: version 12.0.
No, I am not.
Try running these commands in a terminal and see what each one prints:
python -c "import torch; print(torch.cuda.is_available())"
python -c "import torch; print(torch.version.cuda)"
python -c "import torch; print(torch.zeros(1).cuda())"
True
11.7
tensor([0.], device='cuda:0')
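If it helps, the same checks can be bundled into one script that also reports how much VRAM is free before training starts. This is only a sketch: the function name `cuda_report` is made up for illustration, and it assumes a PyTorch release recent enough to provide `torch.cuda.mem_get_info`.

```python
def cuda_report() -> dict:
    """Collect basic facts about the local PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this Python environment.
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        # CUDA version PyTorch was built with; None on CPU-only builds.
        "torch_cuda_version": torch.version.cuda,
    }
    if report["cuda_available"]:
        # mem_get_info returns (free, total) in bytes for the given device.
        free_bytes, total_bytes = torch.cuda.mem_get_info(0)
        report["device_name"] = torch.cuda.get_device_name(0)
        report["free_gib"] = round(free_bytes / 2**30, 2)
        report["total_gib"] = round(total_bytes / 2**30, 2)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If the free figure is already far below the card's total before training starts, something else is holding VRAM, and that would be worth ruling out first.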
Still OOMing? EDIT--I checked my nvcc --version and it's actually 11.4, sorry
So... do I just... change my cuda version then to 11.4? What would be the command for it?
What is your nvcc --version? You would have to uninstall CUDA (I think you can do this through the Control Panel) and replace it with the desired version.
Related question, do I remove the ddp line? Where do I do that? I only have one GPU, a 3090.
I only have one GPU as well and I do not have to make any changes to the code to train. If you run nvcc --version in the Command Prompt or PowerShell it should spit out some text about your CUDA version.
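A small sketch for reading the toolkit version from a script instead of by eye. It assumes `nvcc` is on the PATH and that its output contains a `release X.Y` string. The helper names are illustrative.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(text: str) -> Optional[str]:
    """Extract the 'release X.Y' version from nvcc --version output."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


def nvcc_release() -> Optional[str]:
    """Return the toolkit release reported by nvcc, or None if unavailable."""
    if shutil.which("nvcc") is None:
        # The CUDA toolkit is not installed or not on PATH.
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("nvcc release:", nvcc_release())
```

Note that the three numbers in this thread measure different things. nvidia-smi reports the highest CUDA version the driver supports, `torch.version.cuda` reports the CUDA runtime PyTorch was built with, and nvcc reports the locally installed toolkit. They do not have to match exactly, so a mismatch on its own may not be what is causing the OOM.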
I keep getting CUDA out of memory errors and I don't know what to do.