Naver-AI-Hackathon / cs492I

CUDA out of memory #26

Closed erjui closed 3 years ago

erjui commented 4 years ago

I got the following error.

RuntimeError: CUDA out of memory. Tried to allocate 126.00 MiB (GPU 0; 23.88 GiB total capacity; 23.02 GiB already allocated; 14.88 MiB free; 23.25 GiB reserved in total by PyTorch)

At first I thought this was simply caused by a large batch size or a large base model. However, the error is intermittent: one run finished without it, and a later run with the same image size, same batch size, and same base model architecture failed.
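To compare a failing run with a successful one, it may help to pull the figures out of the error message. Below is a minimal sketch using only the standard library; `parse_cuda_oom` is a hypothetical helper written for this post, not part of PyTorch or NSML, and it assumes the message has the same wording as the one quoted above.

```python
import re

# Hypothetical helper (not part of PyTorch or NSML): extracts the sizes
# reported in a PyTorch "CUDA out of memory" message, converted to MiB.
_UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# One pattern per figure in the message; each captures (value, unit).
_FIELDS = {
    "requested": r"Tried to allocate ([\d.]+) (KiB|MiB|GiB)",
    "total": r"([\d.]+) (KiB|MiB|GiB) total capacity",
    "allocated": r"([\d.]+) (KiB|MiB|GiB) already allocated",
    "free": r"([\d.]+) (KiB|MiB|GiB) free",
    "reserved": r"([\d.]+) (KiB|MiB|GiB) reserved",
}


def parse_cuda_oom(message):
    """Return a dict of the sizes found in the message, in MiB."""
    sizes = {}
    for name, pattern in _FIELDS.items():
        match = re.search(pattern, message)
        if match:
            value, unit = match.groups()
            sizes[name] = float(value) * _UNIT_TO_MIB[unit]
    return sizes


message = (
    "RuntimeError: CUDA out of memory. Tried to allocate 126.00 MiB "
    "(GPU 0; 23.88 GiB total capacity; 23.02 GiB already allocated; "
    "14.88 MiB free; 23.25 GiB reserved in total by PyTorch)"
)
sizes = parse_cuda_oom(message)
for name, mib in sizes.items():
    print(f"{name:>9}: {mib:10.2f} MiB")
```

In the message above, the requested 126.00 MiB is larger than the 14.88 MiB reported free, and the process had already allocated 23.02 GiB of the 23.88 GiB on the card. With so little headroom, small run-to-run differences may be enough to decide whether a run fits, though that is only a guess from these numbers.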

Thanks in advance.

erjui commented 4 years ago

OSError: [Errno 28] No space left on device

I also just got this error, which shouldn't appear.
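A quick way to see whether the disk is actually full is to print the free space from inside the session. This is a minimal sketch using the standard library's `shutil.disk_usage`; `free_space_gib` is a hypothetical helper name, and the path `"."` is an assumption (the directory being written to may be on a different filesystem).

```python
import shutil


def free_space_gib(path="."):
    """Return the free space, in GiB, on the filesystem containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 1024 ** 3


# "." is an assumption; point this at the directory the run writes to.
print(f"free space: {free_space_gib('.'):.2f} GiB")
```

Note that Errno 28 can also be raised when the filesystem runs out of inodes, in which case the free space shown here may still look fine.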

nsml-admin commented 4 years ago

Please let me know your session names.