Could you help me with the issue below? I'm not very familiar with detectron2.
While running an experiment on the COCO 2017 dataset with train_detr.py and the detr_256_6_6_regnetx_0.4g.yaml config file, an error occurred during initialization.
ERROR [03/08 14:10:37 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 494, in run_step
    self._trainer.run_step()
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 273, in run_step
    loss_dict = self.model(data)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yuxin/code/yolov7/yolov7/modeling/meta_arch/detr.py", line 165, in forward
    output = self.detr(images)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yuxin/code/yolov7/yolov7/modeling/meta_arch/detr.py", line 449, in forward
    features, pos = self.backbone(samples)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yuxin/code/yolov7/yolov7/modeling/backbone/detr_backbone.py", line 504, in forward
    xs = self[0](tensor_list)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yuxin/code/yolov7/yolov7/modeling/meta_arch/detr.py", line 356, in forward
    features = self.backbone(images.tensor)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/detectron2/modeling/backbone/regnet.py", line 315, in forward
    x = self.stem(x)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/detectron2/modeling/backbone/regnet.py", line 87, in forward
    x = layer(x)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 732, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 845, in get_world_size
    return _get_group_size(group)
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 306, in _get_group_size
    default_pg = _get_default_group()
  File "/home/yuxin/miniconda3/envs/yolov7/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 411, in _get_default_group
    "Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
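
Reading the last frames, the failure seems to come from a batch norm layer in the RegNet stem that calls torch.distributed.get_world_size, which is what torch.nn.SyncBatchNorm does in training mode. That would explain why it needs a process group that was never initialized. One workaround I am considering is to swap those layers for plain BatchNorm2d before training. The snippet below is only a rough sketch under that assumption: `revert_sync_batchnorm` is a helper name I made up (it is not part of detectron2, PyTorch, or this repo), and it assumes the layers see 4D image tensors.

```python
import torch
from torch import nn


def revert_sync_batchnorm(module: nn.Module) -> nn.Module:
    """Recursively replace nn.SyncBatchNorm layers with nn.BatchNorm2d.

    Hypothetical helper for illustration only. Assumes every SyncBatchNorm
    in the model receives (N, C, H, W) inputs, as in an image backbone.
    """
    converted = module
    if isinstance(module, nn.SyncBatchNorm):
        converted = nn.BatchNorm2d(
            module.num_features,
            eps=module.eps,
            momentum=module.momentum,
            affine=module.affine,
            track_running_stats=module.track_running_stats,
        )
        # Carry over learned parameters and running statistics.
        with torch.no_grad():
            if module.affine:
                converted.weight.copy_(module.weight)
                converted.bias.copy_(module.bias)
            if module.track_running_stats:
                converted.running_mean.copy_(module.running_mean)
                converted.running_var.copy_(module.running_var)
                converted.num_batches_tracked.copy_(module.num_batches_tracked)
        converted.train(module.training)

    # Recurse into children; list() avoids mutating the dict while iterating.
    for name, child in list(module.named_children()):
        setattr(converted, name, revert_sync_batchnorm(child))
    return converted


if __name__ == "__main__":
    # Toy model standing in for the real backbone.
    toy = nn.Sequential(nn.Conv2d(3, 8, 3), nn.SyncBatchNorm(8), nn.ReLU())
    toy = revert_sync_batchnorm(toy)
    print(toy)
```

If this reading is right, calling the helper on the built model before the first training step should avoid the get_world_size call. I have not confirmed whether the config offers a cleaner way to select the norm type.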
Hi jinfagang,
Thanks for your amazing contribution.
Could you give me a hint on how to fix this issue?
Thanks, Yuxin.