open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

SoftTeacher in 3.x fails with a runtime error #9008

Closed wjm202 closed 2 years ago

wjm202 commented 2 years ago

Prerequisite

🐞 Describe the bug

I ran into a problem when running the SoftTeacher code on 3.x with the COCO dataset. Apart from setting init_cfg=None, I did not change anything in the config.

Command: bash tools/dist_train.sh configs/soft_teacher/soft-teacher_faster-rcnn_r50-caffe_fpn_180k_semi-0.1-coco.py 1


Environment

python 3.8.5
cuda 10.1
cudnn 7.6
mmcv 2.0.0rc1
mmcv-full 1.6.2
mmdeploy 0.8.0
mmdeploy-python 0.8.0
mmdet 3.0.0rc1 /SSOD/softteacher-mmcv/mmdetection
mmengine 0.1.0
torch 1.7.1+cu101
torchaudio 0.7.2
torchvision 0.8.2+cu101
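
When comparing environments across reports like this one, a short script can collect the same package versions in one go. This is a minimal sketch that uses only the standard library; the package list is an assumption taken from the names above, and `installed_versions` is a hypothetical helper, not part of any OpenMMLab tool.

```python
from importlib import metadata

# Assumed list of distributions to check, taken from the environment above.
PACKAGES = ["mmcv", "mmcv-full", "mmdet", "mmengine", "torch", "torchvision"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:12s} {version or 'not installed'}")
```

Packages that are not installed are reported as such instead of raising, so the output stays comparable between machines.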

Additional information

_base_ = [
    '../_base_/models/faster-rcnn_r50_fpn.py',
    '../_base_/default_runtime.py',
    '../_base_/datasets/semi_coco_detection.py'
]

detector = _base_.model
detector.data_preprocessor = dict(
    type='DetDataPreprocessor',
    mean=[103.530, 116.280, 123.675],
    std=[1.0, 1.0, 1.0],
    bgr_to_rgb=False,
    pad_size_divisor=32)
detector.backbone = dict(
    type='ResNet',
    depth=50,
    num_stages=4,
    out_indices=(0, 1, 2, 3),
    frozen_stages=1,
    norm_cfg=dict(type='BN', requires_grad=False),
    norm_eval=True,
    style='caffe',
    init_cfg=None)

model = dict(
    _delete_=True,
    type='SoftTeacher',
    detector=detector,
    data_preprocessor=dict(
        type='MultiBranchDataPreprocessor',
        data_preprocessor=detector.data_preprocessor),
    semi_train_cfg=dict(
        freeze_teacher=True,
        sup_weight=1.0,
        unsup_weight=4.0,
        pseudo_label_initial_score_thr=0.5,
        rpn_pseudo_thr=0.9,
        cls_pseudo_thr=0.9,
        reg_pseudo_thr=0.02,
        jitter_times=10,
        jitter_scale=0.06,
        min_pseudo_bbox_wh=(1e-2, 1e-2)),
    semi_test_cfg=dict(predict_on='teacher'))

# 10% coco train2017 is set as labeled dataset
labeled_dataset = _base_.labeled_dataset
unlabeled_dataset = _base_.unlabeled_dataset
labeled_dataset.ann_file = 'semi_anns/instances_train2017.1@10.json'
unlabeled_dataset.ann_file = 'semi_anns/' \
                             'instances_train2017.1@10-unlabeled.json'
unlabeled_dataset.data_prefix = dict(img='train2017/')
train_dataloader = dict(
    dataset=dict(datasets=[labeled_dataset, unlabeled_dataset]))

# training schedule for 180k
train_cfg = dict(
    type='IterBasedTrainLoop', max_iters=180000, val_interval=5000)
val_cfg = dict(type='TeacherStudentValLoop')
test_cfg = dict(type='TestLoop')

# learning rate policy
param_scheduler = [
    dict(
        type='LinearLR', start_factor=0.001, by_epoch=False, begin=0,
        end=500),
    dict(
        type='MultiStepLR',
        begin=0,
        end=180000,
        by_epoch=False,
        milestones=[120000, 160000],
        gamma=0.1)
]

# optimizer
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))

default_hooks = dict(
    checkpoint=dict(by_epoch=False, interval=10000, max_keep_ckpts=2))
log_processor = dict(by_epoch=False)

custom_hooks = [dict(type='MeanTeacherHook')]

mm-assistant[bot] commented 2 years ago

We recommend using English or English & Chinese for issues so that we could have broader discussion.

Czm369 commented 2 years ago

So init_cfg=dict(type='Pretrained', checkpoint='open-mmlab://detectron2/resnet50_caffe') is ok?
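
For comparison, the backbone entry with the pretrained initialization asked about here would look roughly like the following. This is a sketch only: every field except `init_cfg` is copied from the config posted above, and the standalone `backbone` variable is an illustrative stand-in for the `detector.backbone` assignment in the real config file.

```python
# Backbone settings copied from the posted config, with init_cfg set to the
# pretrained checkpoint mentioned in this comment instead of None.
backbone = dict(
    type='ResNet',
    depth=50,
    num_stages=4,
    out_indices=(0, 1, 2, 3),
    frozen_stages=1,
    norm_cfg=dict(type='BN', requires_grad=False),
    norm_eval=True,
    style='caffe',
    init_cfg=dict(
        type='Pretrained',
        checkpoint='open-mmlab://detectron2/resnet50_caffe'))
```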

wjm202 commented 2 years ago

It doesn't work.

Czm369 commented 2 years ago

This is a bug: the new boxlist feature does not support empty boxes.

jbwang1997 commented 2 years ago

The error report seems incomplete. Could you provide the full error report?

zhaogev5 commented 2 years ago

I have the same problem and don't know how to fix it. Please help.

zhaogev5 commented 2 years ago

  File "./tools/train.py", line 120, in <module>
    main()
  File "./tools/train.py", line 116, in main
    runner.train()
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1631, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 259, in run
    data_batch = next(self.dataloader_iterator)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 156, in __next__
    data = next(self._iterator)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/dataset_wrapper.py", line 131, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 408, in __getitem__
    data = self.prepare_data(idx)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 789, in prepare_data
    return self.pipeline(data_info)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 58, in __call__
    data = t(data)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 11, in __call__
    return self.transform(results)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/datasets/transforms/wrappers.py", line 119, in transform
    branch_results = pipeline(copy.deepcopy(results))
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 11, in __call__
    return self.transform(results)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/wrappers.py", line 87, in transform
    results = t(results)  # type: ignore
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 11, in __call__
    return self.transform(results)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/datasets/transforms/wrappers.py", line 161, in transform
    results = t(results)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 11, in __call__
    return self.transform(results)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/datasets/transforms/augment_wrappers.py", line 257, in transform
    results = self.transforms[idx](results)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 11, in __call__
    return self.transform(results)
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/wrappers.py", line 87, in transform
    results = t(results)  # type: ignore
  File "/home/ymhj/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 11, in __call__
    return self.transform(results)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/structures/bbox/box_type.py", line 277, in wrapper
    _results = func(self, results, *args, **kwargs)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/datasets/transforms/geometric.py", line 179, in transform
    self._transform_bboxes(results, mag)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/datasets/transforms/geometric.py", line 138, in _transform_bboxes
    results['gt_bboxes'].project_(self.homography_matrix)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/structures/bbox/horizontal_boxes.py", line 203, in project_
    self.tensor = self.corner2hbox(corners)
  File "/media/ymhj/新加卷/lyz/2021_down/bishe/mmd_3/mmdetection/mmdet/structures/bbox/horizontal_boxes.py", line 233, in corner2hbox
    min_xy = corners.min(dim=-2)[0]
RuntimeError: cannot perform reduction function min on tensor with no elements because the operation does not have an identity
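
The last frame shows `corners.min(dim=-2)` being called on a tensor with no elements, and `min` has no identity value it could return for empty input. Python's built-in `min` behaves the same way on an empty list, so the usual guard can be sketched without torch. This is only an analogue under stated assumptions: `enclosing_box` is a hypothetical helper, not mmdet's `corner2hbox`, and it is not necessarily how the bug was fixed upstream.

```python
def enclosing_box(points):
    """Smallest axis-aligned box (x1, y1, x2, y2) around a list of (x, y) points.

    Returns None for empty input instead of letting min()/max() raise.
    """
    if not points:
        # min() and max() have no identity element, so an empty input
        # has to be handled explicitly before reducing.
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def unguarded_min_fails():
    """Show that reducing an empty sequence with min() raises."""
    try:
        min([])
    except ValueError:
        return True
    return False
```

With the guard in place, a sample that carries no boxes passes through the transform as an empty result rather than crashing the DataLoader worker.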

ZwwWayne commented 2 years ago

Fixed in https://github.com/open-mmlab/mmdetection/issues/9121. Closing this issue.