JarodMica / ai-voice-cloning

GNU General Public License v3.0

Low VRAM support for training #82

Open Fmstrat opened 2 months ago

Fmstrat commented 2 months ago

I can do a PR for this, I just need direction.

I've already updated the code so that WhisperX runs with the small model when low_vram is selected, and I'd like to include training in the same PR. I'm not sure that's possible, though: I'm new to torch and am getting CUDA memory errors during training.

Is this simply because dvae.pth is too large to train with 6GB of VRAM? If so, is there an alternative?

JarodMica commented 2 months ago

I don't know exactly how the low_vram option works, so if you've found out more since posting this, my info might be outdated. As far as I know, it loads each model when it's needed and unloads it when it's done. I'd assume it's possible to balance what's loaded into memory at any one time, but I haven't worked within those constraints, so I haven't tinkered much with the process.
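For illustration only, the "load when needed, unload when done" idea could be sketched roughly as below. This is not taken from the repo: `temporarily_loaded` and `load_fn` are hypothetical names, and the `torch` import is guarded so the sketch also runs on a machine without PyTorch installed.

```python
import gc
from contextlib import contextmanager

try:
    import torch  # optional: only used to release cached CUDA memory
except ImportError:
    torch = None


@contextmanager
def temporarily_loaded(load_fn):
    """Keep a model alive only for the duration of a `with` block.

    `load_fn` is a hypothetical zero-argument callable that builds the model
    (and moves it to the GPU, if one is used) and returns it.
    """
    holder = [load_fn()]
    try:
        yield holder[0]
    finally:
        holder.clear()  # drop this helper's own reference to the model
        gc.collect()    # collect anything that is now unreachable
        if torch is not None and torch.cuda.is_available():
            # Return cached, unused blocks to the driver. This cannot free
            # tensors that are still referenced somewhere else.
            torch.cuda.empty_cache()
```

The caller still has to drop its own reference (for example `del model` after the `with` block); otherwise the weights stay in memory no matter what the helper does.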

According to some users, though, 6GB cards have trained with this repo, and I think the minimum is 4GB. dvae.pth is essential to training because it converts the training inputs into discrete tokens that the GPT model can learn from.
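To give a rough sense of what "converting inputs into discrete tokens" means, here is a toy nearest-codebook lookup. It is only an illustration of the general idea: the real DVAE is a learned neural network, and `quantize`, `codebook`, and `frames` are made-up names and values, not anything from dvae.pth.

```python
def quantize(frames, codebook):
    """Map each frame (a list of floats) to the index of its nearest codebook vector."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two equal-length vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))
        for frame in frames
    ]


# Hypothetical 2-D codebook with three entries, and three input frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]

tokens = quantize(frames, codebook)
print(tokens)  # [0, 1, 2]
```

Each continuous frame is replaced by a single integer, and a sequence of such integers is the kind of input a GPT-style model can be trained on.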