open-mmlab / mmrotate

OpenMMLab Rotated Object Detection Toolbox and Benchmark
https://mmrotate.readthedocs.io/en/latest/
Apache License 2.0

[Bug] The last line of the _predict_by_feat_single() function in H2rbox_head #767

Closed qqq1521902442 closed 1 year ago

qqq1521902442 commented 1 year ago

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

1.x branch https://github.com/open-mmlab/mmrotate/tree/1.x

Environment

sys.platform: linux
Python: 3.9.16 (main, Jan 11 2023, 16:05:54) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 33594978
GPU 0,1,2,3: GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.0, V11.0.221
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.8.0
PyTorch compiling details: PyTorch built with:

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: None
    Distributed launcher: none
    Distributed training: False
    GPU number: 1

Reproduces the problem - code sample

_base_ = [
    '../_base_/datasets/dv2_demo.py', '../_base_/schedules/schedule_try.py',
    '../_base_/default_runtime.py'
]
angle_version = 'le90'

# model settings
model = dict(
    type='TSH2RBoxDetector',
    crop_size=(840, 712),
    data_preprocessor=dict(
        type='mmdet.DetDataPreprocessorTS',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True,
        pad_size_divisor=32,
        boxtype2tensor=False),
    backbone_r=dict(
        type='mmdet.ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    backbone_i=dict(
        type='mmdet.ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck_r=dict(
        type='mmdet.FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_output',
        num_outs=5,
        relu_before_extra_convs=True),
    neck_i=dict(
        type='mmdet.FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_output',
        num_outs=5,
        relu_before_extra_convs=True),
    bbox_head=dict(
        type='H2RBoxHead',
        num_classes=5,
        in_channels=256,
        angle_version='le90',
        stacked_convs=4,
        feat_channels=256,
        strides=[8, 16, 32, 64, 128],
        center_sampling=True,
        center_sample_radius=1.5,
        norm_on_bbox=True,
        centerness_on_reg=True,
        use_hbbox_loss=False,
        scale_angle=True,
        bbox_coder=dict(
            type='DistanceAnglePointCoder', angle_version=angle_version),
        loss_cls=dict(
            type='mmdet.FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='mmdet.IoULoss', loss_weight=1.0),
        loss_centerness=dict(
            type='mmdet.CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        crop_size=(512, 512),
        loss_bbox_ss=dict(
            type='H2RBoxConsistencyLoss',
            loss_weight=0.4,
            center_loss_cfg=dict(type='mmdet.L1Loss', loss_weight=0.0),
            shape_loss_cfg=dict(type='mmdet.IoULoss', loss_weight=1.0),
            angle_loss_cfg=dict(type='mmdet.L1Loss', loss_weight=1.0))),
    # training and testing settings
    train_cfg=None,
    test_cfg=dict(
        nms_pre=2000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms_rotated', iou_threshold=0.1),
        max_per_img=2000))

# load hbox annotations
train_pipeline = [
    dict(
        type='mmdet.LoadImageFromFileTS',
        file_client_args={{_base_.file_client_args}}),
    dict(type='mmdet.LoadAnnotations', with_bbox=True, box_type='qbox'),
    # Horizontal GTBox, (x1,y1,x2,y2)
    dict(type='ConvertBoxType', box_type_mapping=dict(gt_bboxes='hbox')),
    # Horizontal GTBox, (x,y,w,h,theta)
    dict(type='ConvertBoxType', box_type_mapping=dict(gt_bboxes='rbox')),
    dict(type='mmdet.ResizeTS', scale=(840, 712), keep_ratio=True),
    dict(
        type='mmdet.RandomFlipTS',
        prob=0.75,
        direction=['horizontal', 'vertical', 'diagonal']),
    dict(type='mmdet.PackDetInputsTS')
]

train_dataloader = dict(dataset=dict(pipeline=train_pipeline))

# optimizer
optim_wrapper = dict(
    optimizer=dict(
        _delete_=True,
        type='AdamW',
        lr=0.0001,
        betas=(0.9, 0.999),
        weight_decay=0.05))

train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1, val_interval=1)

Reproduces the problem - command or script

The error occurs after running h2rbox with tools/train.py.

Reproduces the problem - error message

File "/data/mmrotate-1.x/mmrotate/models/dense_heads/h2rbox_head.py", line 730, in _predict_by_feat_single
    results.bboxes = RotatedBoxes(bboxes)
File "/data/anaconda3/envs/new_mmrotate/lib/python3.9/site-packages/mmengine/structures/instance_data.py", line 138, in __setattr__
    assert len(value) == len(self), 'The length of ' \
AssertionError: The length of values 2657 is not consistent with the length of this :obj:`InstanceData` 204
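The assertion is a length check on field assignment: once the results container holds 204 instances, assigning a field with 2657 entries is rejected. A minimal sketch of that invariant follows. `LengthCheckedResults` and `try_stale_assignment` are hypothetical names written for illustration; this is a simplified stand-in, not mmengine's actual `InstanceData` implementation.

```python
class LengthCheckedResults:
    """Hypothetical stand-in: all fields must share one length."""

    def __init__(self):
        self._fields = {}

    def __len__(self):
        # The container length is the length of any stored field (0 if empty).
        for value in self._fields.values():
            return len(value)
        return 0

    def set_field(self, name, value):
        # An empty container accepts any length; afterwards lengths must match.
        if len(self) > 0:
            assert len(value) == len(self), (
                f'The length of values {len(value)} is not consistent '
                f'with the length of this container {len(self)}')
        self._fields[name] = value


def try_stale_assignment(num_before_nms=2657, num_after_nms=204):
    """Assign a pre-NMS-sized field to a post-NMS-sized container.

    Returns the assertion message, or None if the assignment is accepted.
    """
    results = LengthCheckedResults()
    results.set_field('scores', [0.9] * num_after_nms)
    results.set_field('labels', [0] * num_after_nms)
    try:
        results.set_field('bboxes', [(0, 0, 1, 1, 0.0)] * num_before_nms)
    except AssertionError as err:
        return str(err)
    return None
```

With the default arguments the sketch reproduces the same kind of mismatch as the traceback (2657 values against a container of length 204); with equal sizes the assignment goes through.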

Additional information

https://github.com/open-mmlab/mmrotate/blob/5d0491c826ad52b7490a7c1146fb65a7d78e2347/mmrotate/models/dense_heads/h2rbox_head.py


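The line in question is line 706 of that file, results.bboxes = RotatedBoxes(bboxes). My dataset has no square classes, so I removed the corresponding setting from the config file, and I suspect that the pre-NMS boxes are then passed in at this line. To work around it, I added one more level of indentation so that the statement sits inside the if block. The error no longer seems to occur, and the accuracy appears unchanged. I would like to know whether this change has any other hidden risks.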
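The suspected control flow can be sketched in plain Python. This is an illustration under assumptions, not the actual mmrotate code: it assumes the tail of `_predict_by_feat_single()` runs NMS first, rebinds `bboxes` only inside the square-class branch, and then writes `bboxes` back to the results. `nms_stub` and `postprocess` are hypothetical helpers, and boxes are modelled as `(cx, cy, w, h, angle)` tuples.

```python
def nms_stub(boxes, labels, keep):
    """Hypothetical stand-in for NMS: keep only the given indices."""
    return [boxes[i] for i in keep], [labels[i] for i in keep]


def postprocess(boxes, labels, keep, square_classes, assign_inside_if):
    """Return the final boxes, or raise if a stale pre-NMS list is written back."""
    kept_boxes, kept_labels = nms_stub(boxes, labels, keep)
    results = {'bboxes': kept_boxes, 'labels': kept_labels}

    if square_classes:
        # Rebind ``boxes`` to the post-NMS boxes, zeroing the angle of
        # every box whose label is a square class.
        boxes = [
            b[:4] + (0.0,) if lab in square_classes else b
            for b, lab in zip(results['bboxes'], results['labels'])
        ]
        if assign_inside_if:
            # Variant described in the issue: write back inside the branch.
            results['bboxes'] = boxes

    if not assign_inside_if:
        # Original placement: without square classes, ``boxes`` still
        # refers to the pre-NMS list here, so the lengths disagree.
        assert len(boxes) == len(results['labels']), (
            f'The length of values {len(boxes)} is not consistent '
            f'with the length of the results {len(results["labels"])}')
        results['bboxes'] = boxes

    return results['bboxes']
```

Under these assumptions, both placements give identical output whenever `square_classes` is non-empty, which would be consistent with the unchanged accuracy on a dataset configured with square classes. Only the unindented placement fails when the list is empty.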

mm-assistant[bot] commented 1 year ago

We recommend using English or English & Chinese for issues so that we could have broader discussion.

qqq1521902442 commented 1 year ago

Supplement: below are the training results of running h2rbox-le90_r50_fpn_adamw-1x_dota.py after my modification. I did not run an additional comparison, but the results appear unchanged.

2023/03/10 03:16:44 - mmengine - INFO -
+--------------------+-------+--------+--------+-------+
| class              | gts   | dets   | recall | ap    |
+--------------------+-------+--------+--------+-------+
| plane              | 18788 | 46109  | 0.974  | 0.905 |
| baseball-diamond   | 1087  | 12759  | 0.929  | 0.806 |
| bridge             | 4183  | 61179  | 0.746  | 0.536 |
| ground-track-field | 733   | 14390  | 0.809  | 0.653 |
| small-vehicle      | 58868 | 248036 | 0.894  | 0.768 |
| large-vehicle      | 43075 | 234209 | 0.889  | 0.781 |
| ship               | 76153 | 162173 | 0.899  | 0.798 |
| tennis-court       | 5923  | 19478  | 0.985  | 0.908 |
| basketball-court   | 1180  | 7714   | 0.981  | 0.885 |
| storage-tank       | 13670 | 69764  | 0.846  | 0.780 |
| soccer-ball-field  | 827   | 14492  | 0.913  | 0.754 |
| roundabout         | 973   | 18005  | 0.913  | 0.761 |
| harbor             | 15468 | 76300  | 0.766  | 0.637 |
| swimming-pool      | 3836  | 15439  | 0.943  | 0.842 |
| helicopter         | 1189  | 17827  | 0.944  | 0.860 |
+--------------------+-------+--------+--------+-------+
| mAP                |       |        |        | 0.778 |
+--------------------+-------+--------+--------+-------+
2023/03/10 03:16:44 - mmengine - INFO - Epoch(val) [12][21046/21046] dota/mAP: 0.7782 dota/AP50: 0.7780

zytx121 commented 1 year ago

Hi @qqq1521902442! You are right. If there are no square classes in your dataset, it's OK to delete this part.

qqq1521902442 commented 1 year ago


Thank you, I get it.