Thanks for your error report and we appreciate it a lot.
Checklist
I have searched related issues but cannot get the expected help.
The bug has not been fixed in the latest version.
Describe the bug
I'm trying to train MS-RCNN on COCO with the latest version.
Single-machine, multi-GPU training runs successfully, but validation fails with the error below.
You may add additional information that may be helpful for locating the problem, such as
I installed everything as per the install.md script
Error traceback
Traceback (most recent call last):
  File "tools/train.py", line 159, in <module>
    main()
  File "tools/train.py", line 155, in main
    meta=meta)
  File "/home/dksingh/sandbox/mmdetection/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 383, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 292, in train
    self.call_hook('after_train_epoch')
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 245, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/dksingh/sandbox/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 27, in after_train_epoch
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/home/dksingh/sandbox/mmdetection/mmdet/apis/test.py", line 48, in single_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 156, in forward
    return self.gather(outputs, self.output_device)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    res = gather_map(outputs)
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/dksingh/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
TypeError: expected sequence object with len >= 0 or a single integer
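A possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: the three nested `gather_map` frames suggest the per-GPU outputs are nested containers (for example a tuple holding a list) whose leaves are not tensors. If those leaves were NumPy arrays, `type(out)(...)` would call the array constructor with a `map` object where a shape is expected, which would be consistent with the message about a sequence or a single integer. The torch-free sketch below mimics that recursion. `ShapeFirstArray` and `can_gather` are hypothetical names invented for this illustration, and the tensor, `None`, and dict branches of the real function are left out.

```python
class ShapeFirstArray:
    """Hypothetical stand-in for a type such as numpy.ndarray (an assumption
    about what the model returned): the constructor's first argument is a
    shape, not an iterable of elements."""

    def __init__(self, shape):
        if isinstance(shape, int):
            shape = (shape,)
        try:
            len(shape)
        except TypeError:
            # Message copied from the traceback above for illustration.
            raise TypeError(
                "expected sequence object with len >= 0 or a single integer"
            ) from None
        self.shape = tuple(shape)
        self.items = [0.0] * (self.shape[0] if self.shape else 0)

    def __iter__(self):
        # Iterable like an array, so zip(*outputs) itself succeeds.
        return iter(self.items)


def gather_map(outputs):
    """Simplified paraphrase of the recursion shown in the traceback."""
    out = outputs[0]
    if isinstance(out, (int, float)):
        # Stand-in for the tensor branch: collect one leaf per GPU.
        return list(outputs)
    # Same expression as the gather_map line quoted in the traceback.
    return type(out)(map(gather_map, zip(*outputs)))


def can_gather(outputs):
    """Return True if gather_map succeeds, False if it raises TypeError."""
    try:
        gather_map(outputs)
    except TypeError:
        return False
    return True


# Two "GPUs", each returning a tuple of plain numbers: this gathers fine.
numeric_outputs = [(1, 2), (3, 4)]
# Two "GPUs", each returning tuple -> list -> array-like leaf.
nested_outputs = [([ShapeFirstArray(2)],), ([ShapeFirstArray(2)],)]

print(gather_map(numeric_outputs))  # ([1, 3], [2, 4])
print(can_gather(nested_outputs))   # False
```

If this reading is right, the failure would come from the gather step running on non-tensor results when `single_gpu_test` is handed a model wrapped for several GPUs. That would also explain why training itself succeeds. This is untested against the actual setup.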
Reproduction
Environment
2020-05-16 09:34:55,031 - mmdet - INFO - Environment info:
Extra Info once I start the training: