Reimplementation issues commonly fall into one of the following situations:
1. Reimplement a model in the model zoo using the provided configs
2. Reimplement a model in the model zoo on other datasets (e.g., custom datasets)
3. Reimplement a custom model where all the components are implemented in MMDetection3D
4. Reimplement a custom model with new modules implemented by yourself
What to do depends on the case.
For cases 1 and 3, please follow the steps in the following sections so that we can help identify the issue quickly.
For cases 2 and 4, please understand that we cannot help much, because we usually do not know the full code, and users are responsible for the code they write.
One suggestion for cases 2 and 4: first check whether the bug lies in the self-implemented code or in the original code. For example, first make sure that the same model runs well on supported datasets. If you still need help, please describe in the issue what you have done and what you obtained, follow the steps in the following sections, and be as clear as possible so that we can better help you.
Checklist
I have searched related issues but cannot get the expected help.
The issue has not been fixed in the latest version.
Describe the issue
2021-05-29 20:12:34,792 - mmdet - INFO - workflow: [('train', 1)], max: 80 epochs
Traceback (most recent call last):
File "tools/train.py", line 222, in <module>
main()
File "tools/train.py", line 218, in main
meta=meta)
File "/home/qf/mm/mmdetection3d/mmdet3d/apis/train.py", line 34, in train_model
meta=meta)
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/mmdet/apis/train.py", line 170, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
for i, data_batch in enumerate(self.data_loader):
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/mmdet/datasets/dataset_wrappers.py", line 151, in __getitem__
return self.dataset[idx % self._ori_len]
File "/home/qf/mm/mmdetection3d/mmdet3d/datasets/custom_3d.py", line 378, in __getitem__
data = self.prepare_train_data(idx)
File "/home/qf/mm/mmdetection3d/mmdet3d/datasets/custom_3d.py", line 153, in prepare_train_data
example = self.pipeline(input_dict)
File "/home/qf/.conda/envs/mmlab/lib/python3.7/site-packages/mmdet/datasets/pipelines/compose.py", line 40, in __call__
data = t(data)
File "/home/qf/mm/mmdetection3d/mmdet3d/datasets/pipelines/transforms_3d.py", line 231, in __call__
gt_bboxes_3d.tensor.numpy(), gt_labels_3d, img=None)
File "/home/qf/mm/mmdetection3d/mmdet3d/datasets/pipelines/dbsampler.py", line 228, in sample_all
avoid_coll_boxes)
File "/home/qf/mm/mmdetection3d/mmdet3d/datasets/pipelines/dbsampler.py", line 305, in sample_class_v2
coll_mat = data_augment_utils.box_collision_test(total_bv, total_bv)
TypeError: expected dtype object, got 'numpy.dtype[bool]'
Reproduction
Did you make any modifications to the code or config? Did you understand what you have modified?
What dataset did you use?
kitti
Environment
Please run python mmdet3d/utils/collect_env.py to collect the necessary environment information and paste it here.
fatal: Not a git repository (or any of the parent directories): .git
sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
GPU 0,1,2,3: GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.7.1
PyTorch compiling details: PyTorch built with:
GCC 7.3
C++ Version: 201402
Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
TorchVision: 0.8.2
OpenCV: 4.5.2
MMCV: 1.3.5
MMCV Compiler: GCC 5.4
MMCV CUDA Compiler: not available
MMDetection: 2.11.0
MMSegmentation: 0.13.0
MMDetection3D: 0.13.0+
You may add additional information that may be helpful for locating the problem, such as:
How you installed PyTorch [e.g., pip, conda, source]
Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)
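The traceback ends in a NumPy dtype error, yet the environment report above lists neither NumPy nor Numba. A minimal sketch for collecting those versions follows. The helper name `collect_versions` is hypothetical, and the choice of packages assumes that `box_collision_test` runs through Numba-compiled code, which the log itself does not confirm.

```python
import importlib
import platform


def collect_versions(package_names):
    """Return a dict mapping each package name to its version string.

    Packages that cannot be imported are reported as 'not installed';
    packages without a __version__ attribute are reported as 'unknown'.
    """
    versions = {"python": platform.python_version()}
    for name in package_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # The package list is an assumption about what is relevant to this traceback.
    for name, version in collect_versions(["numpy", "numba"]).items():
        print(f"{name}: {version}")
```

Pasting this output next to the `collect_env.py` report would let readers check whether the installed NumPy and Numba releases are compatible with each other.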
Results
If applicable, paste the related results here, e.g., what you expect and what you get.
A placeholder for results comparison
Issue fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!