realgump / MvMHAT

MvMHAT: Self-supervised Multi-view Multi-Human Association and Tracking (ACM MM 2021, Oral Paper)

CUDA out of memory on a RTX 2080 Ti #2

Closed by jpsml 3 years ago

jpsml commented 3 years ago

Hello, I tried to run your code on an RTX 2080 Ti GPU but got a CUDA out of memory error, shown below:

model: model loss: pairwise triplewise lr: 1e-05 network: resnet
  1%|█▊ | 2/180 [00:05<07:57, 2.68s/it]
Traceback (most recent call last):
  File "train.py", line 66, in <module>
    epoch_loss = train(epoch_i)
  File "train.py", line 22, in train
    feature = model(img.squeeze(0).cuda())
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torchvision/models/resnet.py", line 220, in forward
    return self._forward_impl(x)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torchvision/models/resnet.py", line 210, in _forward_impl
    x = self.layer3(x)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torchvision/models/resnet.py", line 116, in forward
    identity = self.downsample(x)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 136, in forward
    self.weight, self.bias, bn_training, exponential_average_factor, self.eps)
  File "/home/voxarlabs/anaconda3/envs/MVMHAT/lib/python3.6/site-packages/torch/nn/functional.py", line 2058, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.76 GiB total capacity; 9.11 GiB already allocated; 62.00 MiB free; 9.18 GiB reserved in total by PyTorch)

I am using Ubuntu 18.04. Do you have any suggestions for solving this problem? Best regards.
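For reference, the sizes quoted in the error message above can be pulled out and compared. This is only a rough sketch: the helper name `parse_oom_message` is hypothetical, the regular expressions assume the exact message wording seen in this traceback, and other PyTorch versions may word the message differently.

```python
import re

# The RuntimeError text copied from the traceback in this issue.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 10.76 GiB total capacity; "
    "9.11 GiB already allocated; 62.00 MiB free; 9.18 GiB reserved in total by PyTorch)"
)

_UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}


def parse_oom_message(message):
    """Return the sizes quoted in the message, converted to MiB (hypothetical helper)."""
    size = r"([\d.]+) (KiB|MiB|GiB)"
    patterns = {
        "requested": r"Tried to allocate " + size,
        "total": size + r" total capacity",
        "allocated": size + r" already allocated",
        "free": size + r" free",
        "reserved": size + r" reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * _UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom_message(OOM_MESSAGE)
allocated_share = sizes["allocated"] / sizes["total"]
print(f"allocated: {sizes['allocated']:.0f} MiB ({allocated_share:.0%} of the device)")
print(f"free: {sizes['free']:.0f} MiB, requested: {sizes['requested']:.0f} MiB")
```

The point of the comparison is simply that the card was already almost full when the failing allocation was attempted, so the input fed to the model in each step is the thing to shrink.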

jpsml commented 3 years ago

I was using 7 views. After switching to only 4 views, the error went away.
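In case it helps anyone else, the workaround is just to train on fewer views. A minimal sketch of one way to pick a subset, assuming the views are kept in a plain list of names: `select_views` and the `view1` ... `view7` names are made up for illustration and are not part of this repository, which may configure views differently.

```python
def select_views(view_names, max_views):
    """Pick at most max_views names, spread evenly over the original order.

    Hypothetical helper for illustration only; not part of MvMHAT.
    """
    if max_views <= 0:
        raise ValueError("max_views must be positive")
    if len(view_names) <= max_views:
        return list(view_names)
    step = len(view_names) / max_views
    return [view_names[int(i * step)] for i in range(max_views)]


# Placeholder view names; substitute whatever identifiers the dataset uses.
all_views = [f"view{i}" for i in range(1, 8)]
train_views = select_views(all_views, max_views=4)
print(train_views)
```

Which views are worth keeping depends on the dataset, so the even spacing here is only a placeholder choice.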