Fmstrat opened 7 months ago
I don't know exactly how the `low_vram` option works, so if you've found out more since this post my info may be outdated, but from what I understand it loads each model when it's needed and unloads it when it's done. I'd assume it should be possible to balance what's loaded into memory, but I haven't worked within those constraints, so I haven't tinkered much with the process.
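For what it's worth, the load-when-needed/unload-when-done pattern I'm describing usually looks something like this in PyTorch (this is my own sketch of the general technique, not the repo's actual `low_vram` code; the function name is made up):

```python
import torch
import torch.nn as nn

def run_low_vram(model: nn.Module, batch: torch.Tensor, device: str = "cuda") -> torch.Tensor:
    """Move the model onto the GPU only for the duration of one call,
    then push it back to CPU and release the cached VRAM."""
    model.to(device)
    try:
        with torch.no_grad():
            out = model(batch.to(device))
    finally:
        model.to("cpu")
        if device == "cuda":
            # Return cached allocations to the driver so the next
            # model has room to load.
            torch.cuda.empty_cache()
    return out.cpu()
```

The trade-off is obvious: you pay the host-to-device transfer cost on every call, but at any moment only one model occupies VRAM.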
Some users have reported training on 6GB cards with this repo, though I think the minimum is 4GB. The `dvae.pth` is essential to training: it converts the training audio into discrete tokens that the GPT model can learn from.
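To illustrate what "discrete tokens" means here: a VQ-style encoder like a DVAE maps each frame of continuous features to the index of its nearest entry in a learned codebook, and that index sequence is what the GPT models. This is a generic sketch of the idea, not the actual DVAE from this repo:

```python
import torch

def quantize_to_tokens(features: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Map each feature vector to the index of its nearest codebook entry.

    features: (T, D) frame-level features; codebook: (K, D) learned codes.
    Returns (T,) integer token ids -- the discrete sequence a GPT can model.
    """
    # Pairwise distances between every frame and every codebook entry
    dists = torch.cdist(features, codebook)  # (T, K)
    return dists.argmin(dim=1)

# Toy usage: 5 frames of 8-dim features against a 16-entry codebook
tokens = quantize_to_tokens(torch.randn(5, 8), torch.randn(16, 8))
```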
I can do a PR for this, I just need direction.
I've already updated the code to run WhisperX with the `small` model when `low_vram` is selected, but I'd like to include training in that PR. However, I'm not sure if this is possible, as I'm new to torch and am getting CUDA memory errors on training. Is this simply because `dvae.pth` is too large for 6GB memory training? And if so, is there an alternative?
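The WhisperX change I made is essentially a flag-based model-size switch along these lines (the function name and the exact size/precision pairing are mine; the chosen values would then be passed to `whisperx.load_model`):

```python
def pick_whisper_config(low_vram: bool) -> tuple[str, str]:
    """Choose a WhisperX model size and compute type from the low_vram flag.

    'small' with int8 quantization fits comfortably on ~4-6GB cards;
    'large-v2' in float16 needs considerably more VRAM. The exact
    mapping here is illustrative.
    """
    return ("small", "int8") if low_vram else ("large-v2", "float16")

# e.g. model_name, compute_type = pick_whisper_config(low_vram=True)
#      whisperx.load_model(model_name, device="cuda", compute_type=compute_type)
```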