Closed: HadhamiRjiba closed this issue 3 years ago
Is there any way to know how big a model or a network my system can handle without running into this issue?
Your GPU ran out of memory. Try lowering the batch size.
Yes! I reduced the batch size to 1 and now train.py works. But now I get the same error when running eval.py: "RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 4.00 GiB total capacity; 2.49 GiB already allocated; 44.45 MiB free; 2.57 GiB reserved in total by PyTorch)". Do you have any idea?
Hmm. The batch size is already 1 by default in eval.py. You could try lowering seq_length. It looks like you only have 4 GB of GPU memory; I used a GPU with 12 GB.
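As a rough way to answer the earlier question about how big a model your system can handle, you can query how much GPU memory is free before training. This is a minimal sketch using PyTorch's `torch.cuda.mem_get_info`, which returns (free, total) bytes for a device; the `gib` helper is just for readable output:

```python
import torch

def gib(nbytes):
    """Convert a byte count to GiB."""
    return nbytes / 1024**3

# Check free vs. total GPU memory before launching training.
# If the free amount is small (e.g. ~1.4 GiB free on a 4 GiB card),
# a large batch size or seq_length will likely hit "CUDA out of memory".
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info(0)
    print(f"GPU 0: {gib(free):.2f} GiB free of {gib(total):.2f} GiB total")
else:
    print("No CUDA GPU detected")
```

There is no exact formula for the maximum model size, since memory use depends on parameters, activations, optimizer state, and batch size; in practice you lower batch size or sequence length until the allocation fits in the free memory reported above.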
@wmcnally it works, thank you!
Hello, when I run "python train.py", I get the message below:

"**training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 1.07 GiB (GPU 0; 4.00 GiB total capacity; 2.57 GiB already allocated; 84.45 MiB free; 2.59 GiB reserved in total by PyTorch)**"
Any idea what might cause this?