daitao / SAN

Second-order Attention Network for Single Image Super-resolution (CVPR-2019)

CUDA out of memory (default setting) #6

Open jiahaooo opened 5 years ago

jiahaooo commented 5 years ago

"RuntimeError: CUDA out of memory. Tried to allocate 324.00 MiB (GPU 0; 10.73 GiB total capacity; 9.08 GiB already allocated; 290.31 MiB free; 612.32 MiB cached)"

I am training on a 2080 Ti GPU with the same settings as the training demo (--n_resgroups 20 --n_resblocks 10).

Please let me know how to deal with this, if possible.
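To read the numbers in a message like the one above, it can help to put them in one unit and compare the request with the free memory. The following is a minimal sketch in plain Python; the function names `to_mib` and `parse_oom` are made up for illustration, and the pattern only covers the message format quoted in this thread (other PyTorch versions may word the message differently).

```python
import re

# Size units as they appear in the out-of-memory message quoted above.
_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}

def to_mib(text):
    """Convert a string such as '9.08 GiB' to a float number of MiB."""
    value, unit = text.split()
    return float(value) * _UNITS[unit] / 2**20

def parse_oom(message):
    """Extract the sizes from a 'CUDA out of memory' message (values in MiB)."""
    pattern = (
        r"Tried to allocate (?P<requested>[\d.]+ \w+) "
        r"\(GPU \d+; (?P<total>[\d.]+ \w+) total capacity; "
        r"(?P<allocated>[\d.]+ \w+) already allocated; "
        r"(?P<free>[\d.]+ \w+) free; "
        r"(?P<cached>[\d.]+ \w+) cached\)"
    )
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("message does not match the expected format")
    sizes = {name: to_mib(text) for name, text in match.groupdict().items()}
    # How much the single request exceeds what is currently free.
    sizes["shortfall"] = sizes["requested"] - sizes["free"]
    return sizes

message = ("RuntimeError: CUDA out of memory. Tried to allocate 324.00 MiB "
           "(GPU 0; 10.73 GiB total capacity; 9.08 GiB already allocated; "
           "290.31 MiB free; 612.32 MiB cached)")
report = parse_oom(message)
print(f"requested {report['requested']:.2f} MiB, free {report['free']:.2f} MiB, "
      f"shortfall {report['shortfall']:.2f} MiB")
```

For this message the request exceeds the free memory by roughly 34 MiB, so the run is close to fitting and a modest reduction in memory use may be enough.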

Zysty commented 5 years ago

I encountered the same problem. It may be caused by batch_size: it is set to 16 in the code but 8 in the paper. I am not sure whether changing it will affect the results.
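If only a batch of 8 fits in memory, one common way to keep an effective batch size of 16 is gradient accumulation. The sketch below is not taken from the SAN code. It uses a toy one-parameter least-squares model in plain Python to show that averaging the gradients of two micro-batches of 8 gives the same gradient as one batch of 16. In PyTorch the same pattern would mean calling `loss.backward()` on each micro-batch, with the loss divided by the number of micro-batches, and calling `optimizer.step()` and `optimizer.zero_grad()` once per group. The equivalence assumes a mean-reduced loss and no layers whose behaviour depends on batch statistics.

```python
import random

def mean_grad(w, xs, ys):
    """Gradient with respect to w of the mean of 0.5 * (w*x - y)**2."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(16)]
ys = [3.0 * x + random.gauss(0.0, 0.1) for x in xs]
w = 0.5

# Gradient computed from the full batch of 16 samples.
full_grad = mean_grad(w, xs, ys)

# Same data split into micro-batches of 8: accumulate, then average.
micro_batch_size = 8
accum_steps = len(xs) // micro_batch_size
accumulated = 0.0
for i in range(accum_steps):
    lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
    # Dividing by accum_steps mirrors scaling the loss before backward().
    accumulated += mean_grad(w, xs[lo:hi], ys[lo:hi]) / accum_steps

# The two gradients agree up to floating-point rounding.
print(abs(full_grad - accumulated) < 1e-12)
```

Accumulation trades time for memory: each optimizer step needs two forward and backward passes, but peak activation memory is that of a batch of 8.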

qinhuangdaoStation commented 4 years ago

I also changed batch_size to 8 and n_resgroups to 5, but it still outputs the error:

RuntimeError: CUDA out of memory. Tried to allocate 5.96 GiB (GPU 0; 11.00 GiB total capacity; 6.17 GiB already allocated; 2.42 GiB free; 159.09 MiB cached)

My GPU is an NVIDIA 2080 Ti. What should I do?
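A single request of 5.96 GiB is far larger than the 324 MiB request in the first report, which suggests one very large tensor rather than many small ones. The arithmetic below is a rough sketch in plain Python: it converts the request into a float32 element count and then shows how quickly memory grows for a tensor that compares every spatial position with every other one. The `(batch, h*w, h*w)` shape and the patch sizes are assumptions chosen for illustration; the thread does not confirm which layer made the request.

```python
from math import prod

def tensor_gib(shape, bytes_per_element=4):
    """Memory of a dense tensor in GiB; 4 bytes per element means float32."""
    return prod(shape) * bytes_per_element / 2**30

# Number of float32 values in the 5.96 GiB request from the error message.
elements = 5.96 * 2**30 / 4
print(f"{elements:.3e} float32 elements")

# Hypothetical example: a pairwise matrix over all positions of an h x w
# feature map has shape (batch, h*w, h*w), so it grows with (h*w)**2.
for side in (48, 96, 192):
    shape = (8, side * side, side * side)
    print(side, f"{tensor_gib(shape):.2f} GiB")
```

Under this assumption, doubling the side of the input multiplies the memory by 16, so reducing the input patch size could matter more than reducing n_resgroups.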

Missdonghui commented 4 years ago

I used the same configuration and parameters as the author, and I also reduced batch_size and n_resgroups, but it still fails:

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/condabld/pytorch_1524584710464/work/aten/src/THC/generic/THCStorage.cu:58

My GPU is an NVIDIA 1080 Ti. Any advice would be appreciated, thanks a lot.

daitao commented 4 years ago

Does the error occur in the training stage or the test stage? The PyTorch version also matters.

-- Best regards!

Tao Dai (戴涛) Ph.D Candidate Department of Computer Science and Technology Tsinghua University, Shenzhen, China Email:daitao.edu@gmail.com
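Since the PyTorch version matters, a reply to the question above is easier to act on when it includes the exact versions in use. A minimal sketch, assuming only that PyTorch may or may not be importable in the current environment; the function name `environment_report` is made up for illustration.

```python
import platform

def environment_report():
    """Collect version details that are useful when reporting a CUDA OOM error."""
    report = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        report["torch"] = None  # PyTorch is not installed in this environment
        return report
    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
    return report

print(environment_report())
```

Posting this output together with the stage (training or test) and the full command line gives the maintainer what was asked for.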

urmagicsmine commented 4 years ago

Hello, I am using multiple GPUs, but I get the following error: RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 1 does not equal 0 (while checking arguments for cudnn_convolution)

I have the same problem. Have you solved it?
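The message says that a convolution received its input on GPU 1 while its weight lives on GPU 0. One workaround, which avoids the mismatch but does not fix its cause, is to expose a single GPU to the process through the `CUDA_VISIBLE_DEVICES` environment variable. A minimal sketch follows; the helper name `restrict_to_gpu` is hypothetical, and the variable has to be set before CUDA is initialized.

```python
import os

def restrict_to_gpu(index=0):
    """Expose only one GPU to this process (hypothetical helper name).

    Must run before the first CUDA call, so call it before importing torch.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]

visible = restrict_to_gpu(0)
print("CUDA_VISIBLE_DEVICES =", visible)
```

The same effect can be had by setting the variable in the shell before launching the training command. With only one GPU visible, the memory of the other cards is no longer available, so this does not help with the out-of-memory errors reported earlier in the thread.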