LikeLy-Journey / SegmenTron

Supports PointRend, Fast_SCNN, HRNet, Deeplabv3_plus (xception, resnet, mobilenet), ContextNet, FPENet, DABNet, EdaNet, ENet, Espnetv2, RefineNet, UNet, DANet, HRNet, DFANet, HardNet, LedNet, OCNet, EncNet, DuNet, CGNet, CCNet, BiSeNet, PSPNet, ICNet, FCN, DeepLab
Apache License 2.0

DeepLabV3Plus flops: 458.505G input shape is [3, 1024, 2048], params: 47.737M #25

Closed · qiaoD closed this 4 years ago

qiaoD commented 4 years ago

Hi, thank you for your code. When I run it, training fails, but I don't know why. Thank you for your help:

2020-02-29 09:16:37,190 Segmentron INFO:
2020-02-29 09:16:51,159 Segmentron INFO: DeepLabV3Plus flops: 458.505G input shape is [3, 1024, 2048], params: 47.737M
2020-02-29 09:16:51,161 Segmentron INFO: Not use SyncBatchNorm!
2020-02-29 09:16:51,163 Segmentron INFO: Start training, Total Epochs: 400 = Total Iterations 74000
2020-02-29 09:16:57,141 Segmentron INFO: DeepLabV3Plus flops: 458.505G input shape is [3, 1024, 2048], params: 47.737M
2020-02-29 09:16:57,143 Segmentron INFO: Not use SyncBatchNorm!
2020-02-29 09:16:57,145 Segmentron INFO: Start training, Total Epochs: 400 = Total Iterations 74000
2020-02-29 09:17:02,040 Segmentron INFO: DeepLabV3Plus flops: 458.505G input shape is [3, 1024, 2048], params: 47.737M
2020-02-29 09:17:02,042 Segmentron INFO: Not use SyncBatchNorm!
2020-02-29 09:17:02,044 Segmentron INFO: Start training, Total Epochs: 400 = Total Iterations 74000
2020-02-29 09:17:02,214 Segmentron INFO: DeepLabV3Plus flops: 458.505G input shape is [3, 1024, 2048], params: 47.737M
2020-02-29 09:17:02,216 Segmentron INFO: Not use SyncBatchNorm!
2020-02-29 09:17:02,218 Segmentron INFO: Start training, Total Epochs: 400 = Total Iterations 74000
Traceback (most recent call last):
  File "./tools/train.py", line 221, in <module>
    trainer.train()
  File "./tools/train.py", line 133, in train
    outputs = self.model(images)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/seg/segmentron/models/deeplabv3_plus.py", line 38, in forward
    x = self.head(c4, c1)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/seg/segmentron/models/deeplabv3_plus.py", line 69, in forward
    x = self.aspp(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/seg/segmentron/modules/module.py", line 66, in forward
    x0 = self.aspp0(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 8737) is killed by signal: Killed.
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.5/dist-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/usr/local/lib/python3.5/dist-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python', '-u', './tools/train.py', '--local_rank=3', '--config-file', 'configs/cityscapes_deeplabv3_plus_resnet.yaml']' returned non-zero exit status -9

LikeLy-Journey commented 4 years ago

It looks like something is wrong with your dataloader. You can try training with a single GPU and setting num_workers=0 to see whether the error still occurs. https://github.com/LikeLy-Journey/SegmenTron/blob/master/tools/train.py#L55
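A note for readers who hit the same crash: an exit status of -9 together with `killed by signal: Killed` usually means the process was terminated from outside, most often by the Linux out-of-memory killer when system RAM or shared memory is exhausted by the DataLoader worker processes. The sketch below is not SegmenTron's actual configuration mechanism (the repository reads its worker count near the tools/train.py#L55 line linked above); it is only a minimal, self-contained illustration of the suggestion: with `num_workers=0` all batches are loaded in the main process, so a failure produces an ordinary traceback instead of a killed worker.

```python
# Minimal sketch of the num_workers=0 suggestion, using a dummy dataset
# (not SegmenTron's actual Cityscapes pipeline or config keys).
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 8 fake "images" and segmentation-style integer labels.
images = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 19, (8, 64, 64))
dataset = TensorDataset(images, labels)

# num_workers=0 disables worker subprocesses: loading happens in the main
# process, so errors are raised directly instead of surfacing as
# "DataLoader worker (pid ...) is killed by signal: Killed".
loader = DataLoader(dataset, batch_size=2, shuffle=True, num_workers=0)

for img_batch, lbl_batch in loader:
    print(img_batch.shape, lbl_batch.shape)
```

If a single-GPU run (for example invoking `./tools/train.py` directly rather than through `torch.distributed.launch`, assuming the script supports that) then trains without being killed, the original crash was most likely a memory limit hit by the four distributed processes, and reducing the batch size, crop size, or worker count is the usual remedy.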