CSAILVision / semantic-segmentation-pytorch

Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset
http://sceneparsing.csail.mit.edu/
BSD 3-Clause "New" or "Revised" License

RuntimeError: CUDA error: out of memory #229

Open HLH13297997663 opened 4 years ago

HLH13297997663 commented 4 years ago

I am testing with just a single image, but I get the error below:

File "test.py", line 203, in main(cfg, args.gpu) File "test.py", line 129, in main test(segmentation_module, loader_test, gpu) File "test.py", line 79, in test scores = scores + pred_tmp / len(cfg.DATASET.imgSizes) RuntimeError: CUDA error: out of memory

GutlapalliNikhil commented 3 years ago

Hi @HLH13297997663, is your error solved? Could you please help me with this?

mdanner93 commented 3 years ago

I have the same problem and cannot manage to solve it (see the error below). Can anyone help with this? Many thanks in advance.

    samples: 1
      0%|          | 0/1 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "test.py", line 216, in <module>
        main(cfg, args.gpu)
      File "test.py", line 142, in main
        res = test(segmentation_module, loader_test, gpu, using_in_memory)
      File "test.py", line 70, in test
        scores = async_copy_to(scores, gpu)
      File "data_parallel.py", line 15, in async_copy_to
        v = obj.cuda(dev, non_blocking=True)
    RuntimeError: CUDA out of memory. Tried to allocate 13.41 GiB (GPU 0; 6.00 GiB total capacity; 255.42 MiB already allocated; 4.52 GiB free; 286.00 MiB reserved in total by PyTorch)
      0%|          | 0/1 [00:10<?, ?it/s]

Process finished with exit code 1
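For scale, a rough back-of-the-envelope (the image dimensions here are an assumption; only the 13.41 GiB figure comes from the log): scores is allocated as 1 × num_class × H × W in float32 at the original image resolution, so with the 150 ADE20K classes a 24-megapixel input such as 6000 × 4000 works out to 150 × 6000 × 4000 × 4 bytes ≈ 13.4 GiB, which is why even a single large test image can exceed a 6 GB GPU.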

GutlapalliNikhil commented 3 years ago

Hi @mdanner93 ,

In the config yaml file, what resolution did you set for testing?

Try reducing that resolution and running the same code again; it should work.
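For anyone looking for the concrete knobs, here is a minimal sketch of lowering the test resolution from the command line instead of editing the yaml. DATASET.imgSizes is the field named in the traceback above; DATASET.imgMaxSize and the trailing KEY VALUE override mechanism are assumptions based on the repo's default ADE20K configs and the pattern used in the script posted later in this thread, and path/to/your_image.jpg is a placeholder for your own test image.

    # Hedged sketch, not verified here: run a single, smaller test scale.
    # DATASET.imgSizes appears in the traceback; DATASET.imgMaxSize and the
    # trailing KEY VALUE overrides are assumed from the repo's default config.
    python3 -u test.py \
      --imgs path/to/your_image.jpg \
      --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml \
      DIR ade20k-resnet50dilated-ppm_deepsup \
      TEST.checkpoint epoch_20.pth \
      DATASET.imgSizes "(300,)" \
      DATASET.imgMaxSize 600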

mdanner93 commented 3 years ago

Hi @GutlapalliNikhil,

Thank you for your reply. That's a good hint; I will definitely try it out. I wanted to try something similar in the beginning, but couldn't believe that a single image would take up that much memory, since it already gets resized on its way through the pipeline. By the "resolution you mentioned for test" in the yaml file, I guess you are referring to the resolution set for the Dataset?

mdanner93 commented 3 years ago

OK, the problem is solved; I got it to work. Thanks for your help!

ghost commented 3 years ago

How did you get it to work?

GutlapalliNikhil commented 3 years ago

@amitk000, try reducing the batch size, the training resolution, and the test resolution.
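For the training side, a similarly hedged sketch; TRAIN.batch_size_per_gpu and the train.py arguments are assumed from the repo's default config and README rather than confirmed in this thread:

    # Hedged sketch: lower the per-GPU batch size (and optionally the image cap)
    # for training. Key names are assumptions based on the default ADE20K config.
    python3 train.py \
      --gpus 0 \
      --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml \
      TRAIN.batch_size_per_gpu 1 \
      DATASET.imgMaxSize 600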

avillalbacantero commented 3 years ago

@HLH13297997663 @GutlapalliNikhil @mdanner93 @amitk000 @quantombone

Hi guys,

I have had the same problem. If you don't want to reduce the resolution or the batch size, you can instead move scores and pred_tmp from the GPU to the CPU before the line scores = scores + pred_tmp / len(cfg.DATASET.imgSizes). Here is that part of the code:

    with torch.no_grad():
        scores = torch.zeros(1, cfg.DATASET.num_class, segSize[0], segSize[1])
        # scores = async_copy_to(scores, gpu)  # comment this out to avoid CUDA out of memory

        for img in img_resized_list:
            feed_dict = batch_data.copy()
            feed_dict['img_data'] = img
            del feed_dict['img_ori']
            del feed_dict['info']
            feed_dict = async_copy_to(feed_dict, gpu)

            # forward pass
            pred_tmp = segmentation_module(feed_dict, segSize=segSize)

            # -- add this to avoid CUDA out of memory
            pred_tmp = pred_tmp.cpu()
            # --

            scores = scores + pred_tmp / len(cfg.DATASET.imgSizes)

As the forward pass is still done with feed_dict on the GPU, I don't think there will be any speed issues with this.

Regards, AnaVC
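One caveat worth noting with this workaround: the full-resolution scores tensor now sits in host RAM instead of GPU memory, so for very large inputs (like the ~13 GiB case above) you still need enough CPU memory, or a smaller test resolution, for the accumulation; the extra per-scale cost is just the device-to-host copy of pred_tmp.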

linqinxin-11 commented 3 years ago

I ran into the same problem during testing. The following script solved it for me:

    #!/bin/bash

    # Image and model names
    MODEL_PATH=ade20k-resnet50dilated-ppm_deepsup
    RESULT_PATH=./111/
    TEST_IMG=./semantic_test_image/
    ENCODER=$MODEL_PATH/encoder_epoch_20.pth
    DECODER=$MODEL_PATH/decoder_epoch_20.pth

    # Download model weights and image
    if [ ! -e $MODEL_PATH ]; then
      mkdir $MODEL_PATH
    fi
    if [ ! -e $ENCODER ]; then
      wget -P $MODEL_PATH http://sceneparsing.csail.mit.edu/model/pytorch/$ENCODER
    fi
    if [ ! -e $DECODER ]; then
      wget -P $MODEL_PATH http://sceneparsing.csail.mit.edu/model/pytorch/$DECODER
    fi
    if [ ! -e $TEST_IMG ]; then
      wget -P $RESULT_PATH http://sceneparsing.csail.mit.edu/data/ADEChallengeData2016/images/validation/$TEST_IMG
    fi

    dir=`ls ./semantic_test_image/`
    # FDIR=./semantic_test_image/
    for i in $dir
    do
      echo "-------------------------------------------"
      echo $FDIR$i
      python3 -u test.py \
        --imgs 'semantic_test_image/'$FDIR$i \
        --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml \
        DIR $MODEL_PATH \
        TEST.result ./lsun_seg/ \
        TEST.checkpoint epoch_20.pth
    done

pranay-ar commented 2 years ago

Thank you @avillalbacantero. Works like a charm!