Open flesnuk opened 1 year ago
Since I only have 8GB of VRAM, I can't run it either way, even with the small model and --dev-batch-size 1. 😅
Thanks! I'm gonna look into it, but it may take a bit because I don't have a Windows machine. For training on small VRAM, using the tiny model or specifying --train-only-decoder might help.
Hi @jumon, no problem. I don't fully understand the issue, but setting num_workers=0 works as far as I can see.
For the VRAM, I see an easy solution: using bitsandbytes, for which I intend to open a PR today. With the 8-bit Adam optimizer I can train the small model with 8GB of VRAM on Windows. I will add a flag so that the 8-bit optimizer is optional.
For the VRAM, I see an easy solution: using bitsandbytes, for which I intend to open a PR today.
That would be nice! Thanks!
When running on Windows, I get this error:
TypeError: cannot pickle 'builtins.CoreBPE' object
I only found this relevant thread while googling. https://discuss.pytorch.org/t/pytorch-windows-eoferror-ran-out-of-input-when-num-workers-0/25918
Setting num_workers to 0 in the DataLoader seems to work. I don't know the implications of this, or whether there is another way to fix the error, but it may be useful for someone in the same situation.
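
A possible explanation, offered as an assumption rather than a confirmed diagnosis: on Windows, DataLoader worker processes are started with spawn, so the dataset object has to be pickled and sent to each worker. If the dataset holds a tokenizer backed by a native object that cannot be pickled, that step fails. With num_workers=0 everything runs in the main process, nothing is pickled, and the cost is that data loading no longer runs in parallel. The sketch below reproduces the pattern with the standard library only. `FakeTokenizer`, `EagerDataset`, `LazyDataset` and `can_pickle` are hypothetical names, not code from this repository, and a `threading.Lock` stands in for the unpicklable object.

```python
import pickle
import threading


class FakeTokenizer:
    """Hypothetical stand-in for a tokenizer backed by a native object."""

    def __init__(self):
        # A lock cannot be pickled, similar to the CoreBPE object in the error.
        self._native = threading.Lock()

    def encode(self, text):
        return [ord(ch) for ch in text]


class EagerDataset:
    """Holds the tokenizer directly, so pickling the dataset fails."""

    def __init__(self, texts):
        self.texts = texts
        self.tokenizer = FakeTokenizer()

    def __getitem__(self, i):
        return self.tokenizer.encode(self.texts[i])


class LazyDataset:
    """Builds the tokenizer on first use and drops it before pickling."""

    def __init__(self, texts):
        self.texts = texts
        self._tokenizer = None

    def __getstate__(self):
        state = self.__dict__.copy()
        state["_tokenizer"] = None  # each process rebuilds its own copy
        return state

    def __getitem__(self, i):
        if self._tokenizer is None:
            self._tokenizer = FakeTokenizer()
        return self._tokenizer.encode(self.texts[i])


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, which spawn relies on."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print("eager dataset picklable:", can_pickle(EagerDataset(["hi"])))
    print("lazy dataset picklable:", can_pickle(LazyDataset(["hi"])))
```

If the real dataset follows the same pattern, creating the tokenizer lazily inside each worker might allow num_workers > 0 on Windows, but that is untested here. num_workers=0 remains the simplest workaround.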