open-mmlab / mmrotate

OpenMMLab Rotated Object Detection Toolbox and Benchmark
https://mmrotate.readthedocs.io/en/latest/
Apache License 2.0

HRSC KLD data reproduction #817

Open Kn-Oh opened 1 year ago

Kn-Oh commented 1 year ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

(mmroate) ➜ mmrotate git:(qd) ✗ python mmrotate/utils/collect_env.py
sys.platform: linux
Python: 3.8.15 (default, Nov 24 2022, 15:19:38) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.4, V11.4.152
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.10.0+cu111
PyTorch compiling details: PyTorch built with:

TorchVision: 0.11.0+cu111
OpenCV: 4.6.0
MMEngine: 0.4.0
MMRotate: 1.0.0rc0+3db9cc0

Reproduces the problem - code sample

dataset_type = 'HRSCDataset'
data_root = '/root/autodl-tmp/HRSC/HRSC2016/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(800, 512)),
    dict(
        type='RRandomFlip',
        flip_ratio=[0.25, 0.25, 0.25],
        direction=['horizontal', 'vertical', 'diagonal'],
        version='le90'),
    dict(
        type='PolyRandomRotate',
        rotate_ratio=0.5,
        angles_range=180,
        auto_bound=False,
        rect_classes=[9, 11],
        version='le90'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(800, 512),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='HRSCDataset',
        classwise=False,
        ann_file='/root/autodl-tmp/HRSC/HRSC2016/ImageSets/trainval.txt',
        ann_subdir='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/Annotations/',
        img_subdir='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/AllImages/',
        img_prefix='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/AllImages/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='RResize', img_scale=(800, 512)),
            dict(
                type='RRandomFlip',
                flip_ratio=[0.25, 0.25, 0.25],
                direction=['horizontal', 'vertical', 'diagonal'],
                version='le90'),
            dict(
                type='PolyRandomRotate',
                rotate_ratio=0.5,
                angles_range=180,
                auto_bound=False,
                rect_classes=[9, 11],
                version='le90'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        version='le90'),
    val=dict(
        type='HRSCDataset',
        classwise=False,
        ann_file='/root/autodl-tmp/HRSC/HRSC2016/ImageSets/test.txt',
        ann_subdir='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/Annotations/',
        img_subdir='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/AllImages/',
        img_prefix='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/AllImages/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(800, 512),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'),
    test=dict(
        type='HRSCDataset',
        classwise=False,
        ann_file='/root/autodl-tmp/HRSC/HRSC2016/ImageSets/test.txt',
        ann_subdir='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/Annotations/',
        img_subdir='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/AllImages/',
        img_prefix='/root/autodl-tmp/HRSC/HRSC2016/FullDataSet/AllImages/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(800, 512),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'))
evaluation = dict(interval=10, metric='mAP')
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[48, 66])
runner = dict(type='EpochBasedRunner', max_epochs=100)
checkpoint_config = dict(interval=36)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
angle_version = 'le90'
model = dict(
    type='RotatedRetinaNet',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        zero_init_residual=False,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_input',
        num_outs=5),
    bbox_head=dict(
        type='RotatedRetinaHead',
        num_classes=1,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        assign_by_circumhbbox=None,
        anchor_generator=dict(
            type='RotatedAnchorGenerator',
            octave_base_scale=4,
            scales_per_octave=3,
            ratios=[1.0, 0.5, 2.0],
            strides=[8, 16, 32, 64, 128]),
        bbox_coder=dict(
            type='DeltaXYWHAOBBoxCoder',
            angle_range='le90',
            norm_factor=None,
            edge_swap=True,
            proj_xy=True,
            target_means=(0.0, 0.0, 0.0, 0.0, 0.0),
            target_stds=(1.0, 1.0, 1.0, 1.0, 1.0)),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(
            type='GDLoss_v1',
            loss_type='kld',
            fun='log1p',
            tau=1,
            loss_weight=1.0),
        reg_decoded_bbox=True),
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.4,
            min_pos_iou=0,
            ignore_iof_thr=-1,
            iou_calculator=dict(type='RBboxOverlaps2D')),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=2000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(iou_thr=0.1),
        max_per_img=2000))
work_dir = '/root/autodl-tmp/hrsc_kld'
auto_resume = False
gpu_ids = range(0, 4)
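To make the schedule in this config easier to read, here is a minimal sketch of how a step policy with step=[48, 66] changes the learning rate over the 100 epochs. It assumes a decay factor of 0.1 per milestone (the config does not set one explicitly) and ignores the 500-iteration linear warmup; `step_lr` is a hypothetical helper, not an MMRotate function.

```python
def step_lr(base_lr, epoch, steps, gamma=0.1):
    """Learning rate at a 0-based epoch under a step policy.

    gamma=0.1 is an assumed decay factor; warmup is ignored.
    """
    # Count how many milestones have already been reached.
    passed = sum(1 for s in steps if epoch >= s)
    return base_lr * gamma ** passed


if __name__ == "__main__":
    # Values taken from the config above: lr=0.01, step=[48, 66], max_epochs=100.
    for epoch in (0, 47, 48, 65, 66, 99):
        print(epoch, step_lr(0.01, epoch, [48, 66]))
```

Under these assumptions, everything after epoch 66 runs at the lowest learning rate, so raising max_epochs to 100 while keeping step=[48, 66] mostly adds epochs at that lowest rate.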

Reproduces the problem - command or script

cd mmrotate
cd tools
./dist_train.sh

Reproduces the problem - error message

Epoch(val) [100][114] mAP: 0.4531, AP50: 0.8550, AP55: 0.8490, AP60: 0.8210, AP65: 0.7230, AP70: 0.6000, AP75: 0.3960, AP80: 0.2180, AP85: 0.0530, AP90: 0.0150, AP95: 0.0010

+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.967  | 0.855 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.855 |
+-------+------+------+--------+-------+
2023-04-12 11:43:39,065 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.954  | 0.849 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.849 |
+-------+------+------+--------+-------+
2023-04-12 11:43:42,773 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.910  | 0.821 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.821 |
+-------+------+------+--------+-------+
2023-04-12 11:43:46,586 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.836  | 0.723 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.723 |
+-------+------+------+--------+-------+
2023-04-12 11:43:50,243 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.724  | 0.600 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.600 |
+-------+------+------+--------+-------+
2023-04-12 11:43:53,966 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.567  | 0.396 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.396 |
+-------+------+------+--------+-------+
2023-04-12 11:43:57,629 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.354  | 0.218 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.218 |
+-------+------+------+--------+-------+
2023-04-12 11:44:01,451 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.150  | 0.053 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.053 |
+-------+------+------+--------+-------+
2023-04-12 11:44:05,155 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.024  | 0.015 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.015 |
+-------+------+------+--------+-------+
2023-04-12 11:44:08,842 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1889 | 0.002  | 0.001 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.001 |
+-------+------+------+--------+-------+
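As a quick sanity check on how to read this log, the headline mAP of 0.4531 is consistent with a plain average of the ten per-threshold APs (AP50 to AP95), while 0.855 is the AP50 alone. The snippet below only redoes that arithmetic on the numbers printed above; it does not call any MMRotate evaluation code.

```python
# AP values copied from the evaluation log above, keyed by IoU threshold (%).
ap_by_iou = {
    50: 0.855, 55: 0.849, 60: 0.821, 65: 0.723, 70: 0.600,
    75: 0.396, 80: 0.218, 85: 0.053, 90: 0.015, 95: 0.001,
}

# Average over the ten thresholds, COCO-style.
mean_ap = sum(ap_by_iou.values()) / len(ap_by_iou)

if __name__ == "__main__":
    print(f"mean AP over IoU 0.50:0.95 = {mean_ap:.4f}")
    print(f"AP50 alone                 = {ap_by_iou[50]:.4f}")
```

When comparing against a paper, it helps to confirm whether the reported figure is AP50 or the 0.50:0.95 average, since here the two differ by about 0.40.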

Additional information

This is the KLD configuration for HRSC with the rotated_retinanet detector, as released by Dr. Yang Xue on GitHub. The only difference is one additional path; deleting that path causes an error. The results from this configuration are far from the numbers reported in the paper, and the KLD configuration that ships with the toolbox gives similar results. I don't know what is wrong. I hope you can help me, thank you.

zytx121 commented 1 year ago

Hi @Kn-Oh, I noticed that you set gpu_ids = range(0, 4). However, all of our KLD experiments only used one GPU.

Kn-Oh commented 1 year ago

Yes, I ran it with 4 GPUs, and I also adjusted the learning rate to 0.01, but does this really have such a big impact on the accuracy? During my run, the accuracy was only around 80 after about 50 epochs, which is far from the 80+ within 12 epochs described in the earlier HRSC reproduction issue. Because I use 4 GPUs, I increased the number of epochs so that there would be enough training iterations for the mAP and the best AP to reach the paper's numbers, but they still fall short. I hope you can help me out, thanks.
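For reference, the relationship between GPU count, batch size, and learning rate discussed here can be sketched with the common linear scaling heuristic. This is an illustration under stated assumptions, not the maintainers' recipe: `effective_batch_size` and `scale_lr` are hypothetical helpers, and the thread does not confirm which learning rate the single-GPU experiments used.

```python
def effective_batch_size(samples_per_gpu, num_gpus):
    """Total images per optimizer step in data-parallel training."""
    return samples_per_gpu * num_gpus


def scale_lr(lr, from_batch, to_batch):
    """Linear scaling heuristic: lr changes in proportion to batch size."""
    return lr * to_batch / from_batch


if __name__ == "__main__":
    # From the posted config: samples_per_gpu=2, 4 GPUs, lr=0.01.
    batch_4gpu = effective_batch_size(2, 4)  # 8 images per step
    batch_1gpu = effective_batch_size(2, 1)  # 2 images per step
    print(batch_4gpu, batch_1gpu)
    # Learning rate that lr=0.01 at batch 8 corresponds to at batch 2.
    print(scale_lr(0.01, batch_4gpu, batch_1gpu))
```

Under this heuristic, lr=0.01 on 4 GPUs corresponds to lr=0.0025 on one GPU with the same samples_per_gpu. Note also that with 4 GPUs each epoch has a quarter of the optimizer steps of a single-GPU epoch, so epoch counts are not directly comparable between the two setups.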

exesit commented 6 months ago

I have the same problem.