File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
input = module(input)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, *kwargs)
File "/ai/lixinagss/IIM/model/HR_Net/seg_hrnet.py", line 50, in forward
out = self.bn1(out)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(input, kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 136, in forward
self.weight, self.bias, bn_training, exponential_average_factor, self.eps)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 2016, in batch_norm
training, momentum, eps, torch.backends.cudnn.enabled
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1108) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.
I train the NWPU,the batch is 1, __C_NWPU.TRAIN_BATCH_SIZE = 12 #imgs
still OOM. on one 11G ..
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl result = self.forward(*input, kwargs) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward input = module(input) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl result = self.forward(*input, *kwargs) File "/ai/lixinagss/IIM/model/HR_Net/seg_hrnet.py", line 50, in forward out = self.bn1(out) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl result = self.forward(input, kwargs) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 136, in forward self.weight, self.bias, bn_training, exponential_average_factor, self.eps) File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 2016, in batch_norm training, momentum, eps, torch.backends.cudnn.enabled File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler _error_if_any_worker_fails() RuntimeError: DataLoader worker (pid 1108) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.
I train the NWPU,the batch is 1, __C_NWPU.TRAIN_BATCH_SIZE = 12 #imgs still OOM. on one 11G ..