
    x[i] = self.branches[i](x[i])
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/ai/lixinagss/IIM/model/HR_Net/seg_hrnet.py", line 50, in forward
    out = self.bn1(out)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 136, in forward
    self.weight, self.bias, bn_training, exponential_average_factor, self.eps)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 2016, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1108) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.

I am training on NWPU. The batch size is 1 (config line: __C_NWPU.TRAIN_BATCH_SIZE = 12 #imgs) and I still get OOM on one 11G card.
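For anyone hitting the same message: the RuntimeError above points at DataLoader workers running out of shared memory rather than at GPU memory, so one thing worth checking is how much space /dev/shm has. The snippet below is only a rough sketch of that check using the standard library. The helper names `shm_free_gib` and `pick_num_workers` and the 1 GiB threshold are made up for illustration and are not part of IIM or PyTorch. The idea is to fall back to `num_workers=0` (loading in the main process, which PyTorch's `DataLoader` supports) when shared memory looks scarce.

```python
import os
import shutil


def shm_free_gib(path="/dev/shm"):
    """Return free space at `path` in GiB, or None if the directory does not exist."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free / 2**30


def pick_num_workers(requested, min_free_gib=1.0, path="/dev/shm"):
    """Hypothetical helper: use 0 workers when shared memory is missing or scarce.

    The 1 GiB default threshold is an arbitrary assumption, not a PyTorch rule.
    """
    free = shm_free_gib(path)
    if free is None or free < min_free_gib:
        return 0  # conservative fallback: load data in the main process
    return requested


if __name__ == "__main__":
    print("free shared memory (GiB):", shm_free_gib())
    print("suggested num_workers:", pick_num_workers(4))
```

The returned value could then be passed as `num_workers` when building the DataLoader. If training runs inside Docker, raising the container's shared memory (for example with `--shm-size`) is the other common route; whether either one resolves this particular issue is not confirmed here.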

taohan10200 / IIM

OOM #12