An error when training with free-anchor

jiaminglei-lei commented 2 years ago

When I trained a model with the config hv_pointpillars_fpn_sbn-all_free-anchor_4x8_2x_nus-3d.py, I encountered an error.

Traceback (most recent call last):
  File "/home/.../.pycharm_helpers/pydev/pydevd.py", line 1477, in _exec
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/home/.../.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/.../mmdetection3d/tools/train.py", line 226, in <module>
    main()
  File "/home/.../mmdetection3d/tools/train.py", line 222, in main
    meta=meta)
  File "/home/.../mmdetection3d/mmdet3d/apis/train.py", line 35, in train_model
    meta=meta)
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], kwargs)
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, kwargs)
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    kwargs)
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(inputs[0], kwargs[0])
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(data)
  File "/home/..../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(input, kwargs)
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(args, kwargs)
  File "/home/.../mmdetection3d/mmdet3d/models/detectors/base.py", line 59, in forward
    return self.forward_train(kwargs)
  File "/home/.../mmdetection3d/mmdet3d/models/detectors/mvx_two_stage.py", line 279, in forward_train
    gt_bboxes_ignore)
  File "/home/.../mmdetection3d/mmdet3d/models/detectors/mvx_two_stage.py", line 316, in forward_pts_train
    loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
  File "/home/.../anaconda3/envs/open-mmlab/lib/python3.6/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, kwargs)
  File "/home/.../mmdetection3d/mmdet3d/models/dense_heads/free_anchor3d_head.py", line 212, in loss
    bbox_preds_tmp, matched_object_targets_tmp
RuntimeError: Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

With other models, such as PointPillars with anchor_3d_head, this problem does not occur. Any idea how to solve it?

ZCMax commented 2 years ago

Have you made any modifications to free_anchor3d_head.py in your code?

jiaminglei-lei commented 2 years ago

No.

jiaminglei-lei commented 2 years ago

I added a clone() in this if block, and it works: https://github.com/open-mmlab/mmdetection3d/blob/9c7270d00dbdd0599b6b6bf816c3ff2dd17d4878/mmdet3d/models/dense_heads/free_anchor3d_head.py#L206-L209. I then replaced every use of bbox_preds_ after that point with bbox_preds_clone. I am now testing the performance to check whether the change hurts accuracy.

Update: with batch_size=3 and the rest of the config unchanged, I got mAP=0.4433 and NDS=0.5515.
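For readers hitting the same RuntimeError, here is a minimal sketch of the failure and of the clone() workaround described above. It is not the code from free_anchor3d_head.py: the tensor shape, the helper names (make_preds, rewrite_in_place, rewrite_with_clone) and the sin rewrite of the last column are all assumptions chosen for illustration. The underlying facts are that iterating over a tensor goes through unbind(), which returns several views at once, and that recent PyTorch versions refuse in-place writes into such views when gradients are tracked.

```python
import torch


def make_preds():
    # Hypothetical stand-in for batched box predictions: (batch, anchors, box_dim).
    raw = torch.randn(2, 4, 7, requires_grad=True)
    preds = raw * 1.0  # non-leaf tensor, like the output of a network layer
    return raw, preds


def rewrite_in_place(batch_preds):
    # Iterating over a tensor calls unbind(0), so each `preds` is one of
    # several views produced by the same function.
    for preds in batch_preds:
        # In-place write into an unbind view: raises RuntimeError under autograd.
        preds[:, -1] = torch.sin(preds[:, -1])


def rewrite_with_clone(batch_preds):
    outputs = []
    for preds in batch_preds:
        preds_clone = preds.clone()  # independent tensor, no longer a view
        # Read from the untouched view, write into the clone.
        preds_clone[:, -1] = torch.sin(preds[:, -1])
        outputs.append(preds_clone)
    return torch.stack(outputs)


if __name__ == "__main__":
    _, preds = make_preds()
    try:
        rewrite_in_place(preds)
    except RuntimeError as err:
        print("in-place version failed:", err)

    raw, preds = make_preds()
    rewrite_with_clone(preds).sum().backward()  # gradients still reach `raw`
    print("clone version OK, grad shape:", tuple(raw.grad.shape))
```

Since clone() is differentiable, gradients still flow back to the original predictions, so the workaround changes memory use rather than the loss being computed. That is consistent with the fix training normally, although it does not by itself say anything about the final accuracy.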

clw5180 commented 1 month ago

jiaminglei-lei

Awesome work, man!

open-mmlab / mmdetection3d

An error when training with free-anchor #1244