feihuzhang / GANet

GA-Net: Guided Aggregation Net for End-to-end Stereo Matching
MIT License

Just run a single SGA layer, the CUDA is out of memory #16

youmi-zym closed this issue 5 years ago

youmi-zym commented 5 years ago

My system info:

Python 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 21:41:56)
PyTorch version: 1.2.0a0+45b91bd
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 16.04.4 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
CMake version: version 3.12.2

Python version: 3.5
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti
GPU 2: GeForce GTX 1080 Ti

Nvidia driver version: 410.48
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.0.5

Versions of relevant libraries:
[pip3] numpy==1.15.2
[pip3] torch==1.2.0a0+45b91bd
[pip3] torchvision==0.3.0

Here is the error message:

Namespace(batchSize=1, crop_height=240, crop_width=528, cuda=1, data_path='/home/youmin/data/StereoMatching/SceneFlow/', kitti=0, kitti2015=0, left_right=0, lr=0.001, max_disp=192, nEpochs=11, resume='', save_path='/home/youmin/exps/GANet/clean-test', seed=123, shift=0, testBatchSize=1, threads=16, training_list='/home/youmin/data/annotations/SceneFlow/train.json', val_list='/home/youmin/data/annotations/SceneFlow/test.json')
===> Loading datasets
===> Building model
0.001
Traceback (most recent call last):
  File "train.py", line 184, in <module>
    train(epoch)
  File "train.py", line 105, in train
    disp0, disp1, disp2=model(input1,input2)
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/modules/module.py", line 525, in __call__
    result = self.forward(*input, **kwargs)
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/modules/module.py", line 525, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/youmin/projects/depth/GANet/models/GANet_deep.py", line 402, in forward
    disp0, disp1, disp2 = self.cost_agg(x, g)
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/modules/module.py", line 525, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/youmin/projects/depth/GANet/models/GANet_deep.py", line 321, in forward
    x = self.sga1(x, g['sg1'])
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/modules/module.py", line 525, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/youmin/projects/depth/GANet/models/GANet_deep.py", line 272, in forward
    x = self.conv_refine(x)
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/modules/module.py", line 525, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/youmin/projects/depth/GANet/models/GANet_deep.py", line 36, in forward
    x = self.conv(x)
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/modules/module.py", line 525, in __call__
    result = self.forward(*input, **kwargs)
  File "/node01/jobs/io/env/stereo_env/anaconda3/envs/GANet/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 483, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 2.95 GiB (GPU 0; 11.91 GiB total capacity; 8.74 GiB already allocated; 2.69 GiB free; 47.56 MiB cached)
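For anyone reading the numbers in that last line, here is a minimal sketch (plain Python, no PyTorch needed) that pulls the sizes out of the message and does the arithmetic. The helper name `parse_oom_message` and the regular expressions are my own illustration, written against the wording of this particular message; other PyTorch versions may phrase it differently.

```python
import re

# Error text copied from the RuntimeError in this issue.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 2.95 GiB (GPU 0; 11.91 GiB total "
    "capacity; 8.74 GiB already allocated; 2.69 GiB free; 47.56 MiB cached)"
)

# Conversion factors from the units used in the message to GiB.
_UNIT_TO_GIB = {"KiB": 1.0 / 1024 ** 2, "MiB": 1.0 / 1024, "GiB": 1.0}
_SIZE = r"([\d.]+) (KiB|MiB|GiB)"


def parse_oom_message(message):
    """Hypothetical helper: return the sizes named in the message, in GiB."""
    patterns = {
        "requested": r"Tried to allocate " + _SIZE,
        "total": _SIZE + r" total capacity",
        "allocated": _SIZE + r" already allocated",
        "free": _SIZE + r" free",
        "cached": _SIZE + r" cached",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError("could not find %r in message" % name)
        value, unit = match.groups()
        sizes[name] = float(value) * _UNIT_TO_GIB[unit]
    return sizes


sizes = parse_oom_message(OOM_MESSAGE)
# How much more memory the failing allocation needed than was free.
shortfall = sizes["requested"] - sizes["free"]
# What would be in use if the allocation had succeeded.
would_be_in_use = sizes["allocated"] + sizes["requested"]

print("requested: %.2f GiB, free: %.2f GiB" % (sizes["requested"], sizes["free"]))
print("shortfall: %.2f GiB" % shortfall)
print("allocated + requested: %.2f GiB of %.2f GiB total"
      % (would_be_in_use, sizes["total"]))
```

The request in `self.conv(x)` is larger than the free memory reported, and the already allocated memory plus the request comes close to the full capacity of the card, so there is very little headroom at this crop size on a GTX 1080 Ti.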

youmi-zym commented 5 years ago

Training requires 12G! My fault. Thanks.