weveng commented 2 years ago

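I first trained a baseline on my own dataset, reaching about 30% mAP, and then put its weights in the `pretrained` field. In the DSL training stage, however, the accuracy does not start from around 30% but from zero, and it rises very slowly: after 16 epochs the mAP is only about 5%. What might have gone wrong?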

chenbinghui1 commented 2 years ago

@weveng Could you paste your config?

weveng commented 2 years ago

model = dict(
type='FCOS',
backbone=dict(
    type='RLA_ResNet',
    layers=[3,4,6,3],
    #depth=50,
    #num_stages=4,
    #out_indices=(0, 1, 2, 3),
    frozen_stages=1,
    #norm_cfg=dict(type='BN', requires_grad=False),
    norm_eval=True,
    style='pytorch',
    pretrained='/home/wangrui/zhangzhanming/DSL_all/DSL/work_dirs/r50_caffe_mslonger_tricks_0.Xdata/epoch_12.pth'),
    # pretrained='/home/wangrui/zhangzhanming/DSL_all/DSL/resnet50_rla_2283.pth (1).tar'),
neck=dict(
    type='FPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    start_level=1,
    add_extra_convs='on_output',  # use P5
    num_outs=5,
    relu_before_extra_convs=True),
bbox_head=dict(
    type='FCOSHead',
    num_classes=200,
    in_channels=256,
    stacked_convs=4,
    feat_channels=256,
    strides=[8, 16, 32, 64, 128],
    norm_on_bbox=True,
    centerness_on_reg=True,
    dcn_on_last_conv=False,
    center_sampling=True,
    conv_bias=True,
    # partially data use 3.0; full data use 1.0
    loss_weight = 3.0,
    soft_weight = 1.0,
    soft_warm_up = 5000,
    loss_cls=dict(
        type='FocalLoss',
        use_sigmoid=True,
        gamma=2.0,
        alpha=0.25,
        loss_weight=1.0),
    loss_bbox=dict(type='GIoULoss', loss_weight=1.0),
    loss_centerness=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
# training and testing settings
train_cfg=dict(
    assigner=dict(
        type='MaxIoUAssigner',
        pos_iou_thr=0.5,
        neg_iou_thr=0.4,
        min_pos_iou=0,
        ignore_iof_thr=-1),
    allowed_border=-1,
    pos_weight=-1,
    debug=False),
test_cfg=dict(
    nms_pre=1000,
    min_bbox_size=0,
    score_thr=0.05,
    nms=dict(type='nms', iou_threshold=0.6),
    max_per_img=100)

)

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 640), (1333, 800)],
        multiscale_mode='value',
        keep_ratio=True),
    dict(type='PatchShuffle', ratio=0.5, ranges=[0.0,1.0], mode=['flip','flop']),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_bboxes_ignore'],
         meta_keys=['filename', 'ori_filename', 'ori_shape','img_shape', 'pad_shape',
                    'scale_factor', 'scale_idx', 'flip','flip_direction', 'img_norm_cfg',
                    'PS', 'PS_place', 'PS_mode']),
]
unlabel_train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 640), (1333, 800)],
        multiscale_mode='value',
        keep_ratio=True),
    dict(type='PatchShuffle', ratio=0.5, ranges=[0.0,1.0], mode=['flip','flop']),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='RandomAugmentBBox_Fast', aug_type='affine'),
    dict(type='UBAug'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_bboxes_ignore'],
         meta_keys=['filename', 'ori_filename', 'ori_shape','img_shape', 'pad_shape',
                    'scale_factor', 'scale_idx', 'flip','flip_direction', 'img_norm_cfg',
                    'PS', 'PS_place', 'PS_mode']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

dataset_type = 'SemiCOCODataset'

# DSL-style data root

data_root = '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/'

data = dict(
samples_per_gpu=2,
# if you change workers_per_gpu, please change preload below at the same time
workers_per_gpu=8,
batch_config=dict(ratio =[[1, 1],]),
train=dict(
    type=dataset_type,
    ann_file = '/home/wangrui/zhangzhanming/DSL_all/DSL/semi_supervised/instances_train.json',
    ann_path = data_root + 'prepared_annos/Industry/annotations/full/',
    labelmapper = data_root + 'mmdet_category_info.json',
    img_prefix = data_root + 'images/full/',
    pipeline = train_pipeline,
    ),
unlabel_train=dict(
    type=dataset_type,
    ann_file = '/home/wangrui/zhangzhanming/DSL_all/DSL/semi_supervised/instances_train-unlabeled.json',
    ann_path = data_root + 'unlabel_prepared_annos/Industry/annotations/full/',
    labelmapper = data_root + 'mmdet_category_info.json',
    img_prefix = data_root + 'images/full/',
    pipeline = unlabel_train_pipeline,
    # fixed thres like [0.1, 0.4]; or ada thres
    thres="adathres.json",
    # thres=[0.1, 0.4]
    ),
unlabel_pred=dict(
    type=dataset_type,
    num_gpus = 4,
    image_root_path = data_root + "images/full/",
    image_list_file = '/home/wangrui/zhangzhanming/DSL_all/DSL/semi_supervised/instances_train-unlabeled.json',
    anno_root_path = data_root + "unlabel_prepared_annos/Industry/annotations/full/",
    category_info_path = data_root + 'mmdet_category_info.json',
    infer_score_thre=0.1,
    save_file_format="json",
    pipeline = test_pipeline,
    eval_config ={"iou":[0.6]},
    img_path = data_root + "images/full/",
    img_resize_size = (1333,800),
    low_level_scale = 16,
    use_ema=True,
    eval_flip=False,
    fuse_history=False, # as ISMT fuse history bboxes by nms
    first_fuse=False,
    first_score_thre=0.1,
    eval_checkpoint_config=dict(interval=1, mode="iteration"),
    # 2*num_worker+2
    preload=18,
    #the start epoch; partial data use 8; alldata don't use by setting 100
    start_point=8),
val=dict(
    type = 'CocoDataset',
    ann_file = '/home/xingyan/Database/ILSVRC/Data/DET/val.json',
    img_prefix = '/home/xingyan/Database/ILSVRC/Data/DET/val',
    pipeline = test_pipeline,
    ),
test=dict(
    type = 'CocoDataset',
    ann_file = '/home/xingyan/Database/ILSVRC/Data/DET/val.json',
    img_prefix = '/home/xingyan/Database/ILSVRC/Data/DET/val',
    pipeline = test_pipeline,
    )

)
evaluation = dict(interval=1, metric='bbox')

# learning policy
optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001,paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
optimizer_config = dict(
    #_delete_=True,
    grad_clip=dict(max_norm=35, norm_type=2))
    #grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    # partial data use 20-26-28; full data use 20-32-34
    step=[20, 26])

runner = dict(type='SemiEpochBasedRunner', max_epochs=28)

checkpoint_config = dict(interval=1)
ema_config = dict(interval=1, mode="iteration",ratio=0.99,start_point=1)
scale_invariant = True

# yapf:disable
log_config = dict(
    interval=10,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable

custom_hooks = [dict(type='NumClassCheckHook')]

dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from =None
resume_from = None
workflow = [('train', 1)]

chenbinghui1 commented 2 years ago

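@weveng I see roughly two issues. (1) The RNN parameters also need ImageNet pretraining, so I suggest loading the default parameters directly. The supervised model really only provides the initial pseudo-labels, so whether or not you load it makes little difference; I have tried this. (2) The lr should still start from 0.01. A lot of new unlabeled data is being added, so the data distribution and the feature space change considerably, which is different from a conventional resume on the same data. Please try these two points and see whether they help.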

weveng commented 2 years ago

OK, thanks a lot.

weveng commented 2 years ago

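Hello. Yesterday I loaded the weight file with load_from as you suggested, but the mAP still starts from zero. Shouldn't it keep improving from the previous level? Also, to keep the model consistent, I replaced the RLA backbone with the same r50 used in the baseline. What is the problem here?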

weveng commented 2 years ago

model = dict(
    type='FCOS',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='caffe'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_output',
        num_outs=5,
        relu_before_extra_convs=True),
    bbox_head=dict(
        type='FCOSHead',
        num_classes=200,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        strides=[8, 16, 32, 64, 128],
        norm_on_bbox=True,
        centerness_on_reg=True,
        dcn_on_last_conv=False,
        center_sampling=True,
        conv_bias=True,
        loss_weight=3.0,
        soft_weight=1.0,
        soft_warm_up=5000,
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='GIoULoss', loss_weight=1.0),
        loss_centerness=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.4,
            min_pos_iou=0,
            ignore_iof_thr=-1),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.6),
        max_per_img=100))
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 640), (1333, 800)],
        multiscale_mode='value',
        keep_ratio=True),
    dict(
        type='PatchShuffle',
        ratio=0.5,
        ranges=[0.0, 1.0],
        mode=['flip', 'flop']),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(
        type='Collect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_bboxes_ignore'],
        meta_keys=[
            'filename', 'ori_filename', 'ori_shape', 'img_shape', 'pad_shape',
            'scale_factor', 'scale_idx', 'flip', 'flip_direction',
            'img_norm_cfg', 'PS', 'PS_place', 'PS_mode'
        ])
]
unlabel_train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 640), (1333, 800)],
        multiscale_mode='value',
        keep_ratio=True),
    dict(
        type='PatchShuffle',
        ratio=0.5,
        ranges=[0.0, 1.0],
        mode=['flip', 'flop']),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='RandomAugmentBBox_Fast', aug_type='affine'),
    dict(type='UBAug'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(
        type='Collect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_bboxes_ignore'],
        meta_keys=[
            'filename', 'ori_filename', 'ori_shape', 'img_shape', 'pad_shape',
            'scale_factor', 'scale_idx', 'flip', 'flip_direction',
            'img_norm_cfg', 'PS', 'PS_place', 'PS_mode'
        ])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
dataset_type = 'SemiCOCODataset'
data_root = '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/'
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=8,
    batch_config=dict(ratio=[[1, 1]]),
    train=dict(
        type='SemiCOCODataset',
        ann_file=
        '/home/wangrui/zhangzhanming/DSL_all/DSL/semi_supervised/instances_train.json',
        ann_path=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/prepared_annos/Industry/annotations/full/',
        labelmapper=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/mmdet_category_info.json',
        img_prefix=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/images/full/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='Resize',
                img_scale=[(1333, 640), (1333, 800)],
                multiscale_mode='value',
                keep_ratio=True),
            dict(
                type='PatchShuffle',
                ratio=0.5,
                ranges=[0.0, 1.0],
                mode=['flip', 'flop']),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(
                type='Collect',
                keys=['img', 'gt_bboxes', 'gt_labels', 'gt_bboxes_ignore'],
                meta_keys=[
                    'filename', 'ori_filename', 'ori_shape', 'img_shape',
                    'pad_shape', 'scale_factor', 'scale_idx', 'flip',
                    'flip_direction', 'img_norm_cfg', 'PS', 'PS_place',
                    'PS_mode'
                ])
        ]),
    unlabel_train=dict(
        type='SemiCOCODataset',
        ann_file=
        '/home/wangrui/zhangzhanming/DSL_all/DSL/semi_supervised/instances_train-unlabeled.json',
        ann_path=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/unlabel_prepared_annos/Industry/annotations/full/',
        labelmapper=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/mmdet_category_info.json',
        img_prefix=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/images/full/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='Resize',
                img_scale=[(1333, 640), (1333, 800)],
                multiscale_mode='value',
                keep_ratio=True),
            dict(
                type='PatchShuffle',
                ratio=0.5,
                ranges=[0.0, 1.0],
                mode=['flip', 'flop']),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(type='RandomAugmentBBox_Fast', aug_type='affine'),
            dict(type='UBAug'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(
                type='Collect',
                keys=['img', 'gt_bboxes', 'gt_labels', 'gt_bboxes_ignore'],
                meta_keys=[
                    'filename', 'ori_filename', 'ori_shape', 'img_shape',
                    'pad_shape', 'scale_factor', 'scale_idx', 'flip',
                    'flip_direction', 'img_norm_cfg', 'PS', 'PS_place',
                    'PS_mode'
                ])
        ],
        thres='adathres.json'),
    unlabel_pred=dict(
        type='SemiCOCODataset',
        num_gpus=4,
        image_root_path=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/images/full/',
        image_list_file=
        '/home/wangrui/zhangzhanming/DSL_all/DSL/semi_supervised/instances_train-unlabeled.json',
        anno_root_path=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/unlabel_prepared_annos/Industry/annotations/full/',
        category_info_path=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/mmdet_category_info.json',
        infer_score_thre=0.1,
        save_file_format='json',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        eval_config=dict(iou=[0.6]),
        img_path=
        '/home/wangrui/zhangzhanming/DSL_all/data/semicoco/images/full/',
        img_resize_size=(1333, 800),
        low_level_scale=16,
        use_ema=True,
        eval_flip=False,
        fuse_history=False,
        first_fuse=False,
        first_score_thre=0.1,
        eval_checkpoint_config=dict(interval=1, mode='iteration'),
        preload=18,
        start_point=8),
    val=dict(
        type='CocoDataset',
        ann_file='/home/xingyan/Database/ILSVRC/Data/DET/val.json',
        img_prefix='/home/xingyan/Database/ILSVRC/Data/DET/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CocoDataset',
        ann_file='/home/xingyan/Database/ILSVRC/Data/DET/val.json',
        img_prefix='/home/xingyan/Database/ILSVRC/Data/DET/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
evaluation = dict(interval=1, metric='bbox')
optimizer = dict(
    type='SGD',
    lr=0.01,
    momentum=0.9,
    weight_decay=0.0001,
    paramwise_cfg=dict(bias_lr_mult=2.0, bias_decay_mult=0.0))
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[20, 26])
runner = dict(type='SemiEpochBasedRunner', max_epochs=28)
checkpoint_config = dict(interval=1)
ema_config = dict(interval=1, mode='iteration', ratio=0.99, start_point=1)
scale_invariant = True
log_config = dict(interval=10, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = '/home/wangrui/zhangzhanming/DSL_all/DSL/work_dirs/r50_caffe_mslonger_tricks_0.Xdata/epoch_12.pth'
resume_from = None
workflow = [('train', 1)]
work_dir = 'workdir_coco/0.1data/load'
gpu_ids = range(0, 4)

chenbinghui1 commented 2 years ago

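@weveng (1) The mAP will not simply continue from where the supervised model left off, because (a) new unlabeled data has been added, so the model starts learning afresh, and (b) the lr is fairly large at the start, so training will search for a new local minimum. The result after the first epoch will therefore certainly not beat the best result of your trained model; what matters is whether the results over epochs 1-8 change in a reasonable way. (2) I don't know the ratio and amounts of your labeled and unlabeled data. The loss_weight applied to the unlabeled data may need adjusting, and the number of training epochs and the step may need to be adjusted to align with your baseline. (3) I suggest checking whether the initial pseudo-labels generated by the supervised model are reasonable. (4) I see that your val and test are both of type CocoDataset. I suggest verifying that the DSL-style category-info matches the category labels of the CocoDataset you use, including their order. I verified this for standard COCO, which is why I used the CocoDataset type directly for val and test.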
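The category-order check in point (4) can be sketched roughly as follows. This is only an illustration: it assumes the DSL-style category-info file has an `id2cat` mapping from index strings to class names, as in the file posted later in this thread, and the helper name `category_order_mismatches` and the three-class sample data are made up for the example.

```python
import json


def category_order_mismatches(category_info, classes):
    """Return (index, name_in_category_info, name_in_classes) for every
    position where the two orderings disagree."""
    id2cat = category_info["id2cat"]
    mismatches = []
    for idx, name in enumerate(classes):
        mapped = id2cat.get(str(idx))  # None if the index is missing
        if mapped != name:
            mismatches.append((idx, mapped, name))
    return mismatches


# Hypothetical three-class excerpt; a real check would json.load() the
# mmdet_category_info.json file and use the dataset's class tuple.
info = json.loads(
    '{"id2cat": {"0": "n02958343", "1": "n03085013", "2": "n02084071"}}'
)
classes_ok = ("n02958343", "n03085013", "n02084071")
classes_swapped = ("n03085013", "n02958343", "n02084071")

print(category_order_mismatches(info, classes_ok))       # no differences
print(category_order_mismatches(info, classes_swapped))  # first two differ
```

An empty list means the class tuple used for evaluation follows the same order as the category-info file.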

weveng commented 2 years ago

I have about 80,000 labeled images and roughly 350,000 unlabeled images.

weveng commented 2 years ago

Epoch(val) [1][5031] bbox_mAP: 0.0340, bbox_mAP_50: 0.0580, bbox_mAP_75: 0.0330, bbox_mAP_s: 0.0040, bbox_mAP_m: 0.0180, bbox_mAP_l: 0.0460, bbox_mAP_copypaste: 0.034 0.058 0.033 0.004 0.018 0.046
Epoch(val) [2][5031] bbox_mAP: 0.0440, bbox_mAP_50: 0.0730, bbox_mAP_75: 0.0450, bbox_mAP_s: 0.0080, bbox_mAP_m: 0.0240, bbox_mAP_l: 0.0590, bbox_mAP_copypaste: 0.044 0.073 0.045 0.008 0.024 0.059
Epoch(val) [3][5031] bbox_mAP: 0.0510, bbox_mAP_50: 0.0850, bbox_mAP_75: 0.0520, bbox_mAP_s: 0.0090, bbox_mAP_m: 0.0270, bbox_mAP_l: 0.0700, bbox_mAP_copypaste: 0.051 0.085 0.052 0.009 0.027 0.070
Epoch(val) [4][5031] bbox_mAP: 0.0530, bbox_mAP_50: 0.0870, bbox_mAP_75: 0.0530, bbox_mAP_s: 0.0100, bbox_mAP_m: 0.0260, bbox_mAP_l: 0.0720, bbox_mAP_copypaste: 0.053 0.087 0.053 0.010 0.026 0.072
Epoch(val) [5][5031] bbox_mAP: 0.0590, bbox_mAP_50: 0.0960, bbox_mAP_75: 0.0600, bbox_mAP_s: 0.0100, bbox_mAP_m: 0.0290, bbox_mAP_l: 0.0800, bbox_mAP_copypaste: 0.059 0.096 0.060 0.010 0.029 0.080

weveng commented 2 years ago

This is the mAP for the first five epochs. Does the rate of increase look normal to you?

weveng commented 2 years ago

Also, do the baseline and the unlabeled stage need to be trained for the same number of epochs?

chenbinghui1 commented 2 years ago

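@weveng The increase has to be judged against your supervised baseline. Your supervised model is at about 30 mAP; roughly how many epochs did you train it for, and what were the steps?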

weveng commented 2 years ago

I trained for 12 epochs, with step [8, 11].

chenbinghui1 commented 2 years ago

@weveng Then that doesn't look normal. Under the COCO full-data protocol I can reach 27 in the first epoch. At the very least, your 5.8 compared with 30 mAP doesn't look right.

weveng commented 2 years ago

So what are the main causes? Still the points you mentioned above?

weveng commented 2 years ago

Is that 27% the mAP in the first epoch of the unlabeled stage?

chenbinghui1 commented 2 years ago

@weveng Yes, it is the test result at epoch=1.

weveng commented 2 years ago

Do you have any suggestions on what to change?

weveng commented 2 years ago

Is it mainly a parameter issue or a dataset issue?

chenbinghui1 commented 2 years ago

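@weveng With a gap this large, the problem is probably not in the algorithm; my blind guess is that some step of the processing isn't quite right. (1) Make sure the parameters were really loaded correctly. Using RLA versus a plain ResNet-50 only makes a difference of 1-2 points, so a gap this large should not come from that choice. (2) Try changing the loss_weight at https://github.com/chenbinghui1/DSL/blob/45ee8fd1bc267f8d9fb1763d4979d7b0a9efc989/configs/fcos_semi/RLA_r50_caffe_mslonger_tricks_0.Xdata_unlabel_dynamic_lw_nofuse_iterlabel_si-soft_singlestage.py#L35 to 1, since I don't know how your loss_cls is behaving. (3) Check whether the initial pseudo-labels for the unlabeled data are reasonable, and whether the category information in the DSL-style data is the same as in the CocoDataset type. I see you have 200 classes, so you must have modified the CocoDataset code by hand; in particular, is the class order there the same as in the category_info generated in DSL style?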
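For point (1), one rough way to see whether a checkpoint actually matches the model is to compare parameter names on both sides. The sketch below works on plain lists of key names so that it runs without PyTorch; the helper name and the sample keys are hypothetical. With PyTorch one would typically pass `model.state_dict().keys()` and the keys of the loaded checkpoint, assuming the weights sit under a `state_dict` entry as MMDetection checkpoints usually store them.

```python
def compare_state_dict_keys(model_keys, checkpoint_keys, strip_prefix="module."):
    """Report which parameter names exist only in the model or only in the
    checkpoint, ignoring an optional wrapper prefix such as 'module.'."""

    def normalize(key):
        return key[len(strip_prefix):] if key.startswith(strip_prefix) else key

    model = {normalize(k) for k in model_keys}
    ckpt = {normalize(k) for k in checkpoint_keys}
    return {
        "missing_in_checkpoint": sorted(model - ckpt),
        "unexpected_in_checkpoint": sorted(ckpt - model),
        "matched": len(model & ckpt),
    }


# Hypothetical key names, for illustration only.
model_keys = ["backbone.conv1.weight", "bbox_head.conv_cls.weight"]
ckpt_keys = [
    "module.backbone.conv1.weight",
    "bbox_head.conv_cls.weight",
    "ema.extra",
]
report = compare_state_dict_keys(model_keys, ckpt_keys)
print(report)
```

A long `missing_in_checkpoint` list would suggest that most of the model is being left at its initial values, which would fit an mAP that starts near zero.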

weveng commented 2 years ago

OK, thanks. I'll take another look first.

weveng commented 2 years ago

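If I want to check the validation accuracy before training, how should I do that? I tried changing workflow to [('val', 1)], but it seems to hang at some point and never proceeds.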

chenbinghui1 commented 2 years ago

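See the Testing section of the README. You don't need to change workflow; this is just the config, and the relevant scripts automatically read the fields they need. Just change the test field to the data you want to test.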

weveng commented 2 years ago

OK. I just checked the category order, and it is all consistent.

weveng commented 2 years ago

1 { 2 "cat2id": { 3 "n02958343": 0, 4 "n03085013": 1, 5 "n02084071": 2, 6 "n02992211": 3, 7 "n03001627": 4, 8 "n01503061": 5, 9 "n04409515": 6, 10 "n03759954": 7, 11 "n04468005": 8, 12 "n03495258": 9, 13 "n04228054": 10, 14 "n01910747": 11, 15 "n07753275": 12, 16 "n02924116": 13, 17 "n03109150": 14, 18 "n02324045": 15, 19 "n03942813": 16, 20 "n02484322": 17, 21 "n01674464": 18, 22 "n00007846": 19, 23 "n02777292": 20, 24 "n01726692": 21, 25 "n02691156": 22, 26 "n04540053": 23, 27 "n03445777": 24, 28 "n03928116": 25, 29 "n04026417": 26, 30 "n04256520": 27, 31 "n04530566": 28, 32 "n04392985": 29, 33 "n02970849": 30, 34 "n02391049": 31, 35 "n04023962": 32, 36 "n04509417": 33, 37 "n07768694": 34, 38 "n03128519": 35, 39 "n07718747": 36, 40 "n02131653": 37, 41 "n04379243": 38, 42 "n02274259": 39, 43 "n02766320": 40, 44 "n07714571": 41, 45 "n02799071": 42, 46 "n01495701": 43, 47 "n02402425": 44, 48 "n03513137": 45, 49 "n02437136": 46, 50 "n03908714": 47, 51 "n04154565": 48, 52 "n02445715": 49, 53 "n02764044": 50, 54 "n04557648": 51, 55 "n03249569": 52, 56 "n02165456": 53, 57 "n07697537": 54, 58 "n02206856": 55, 59 "n02411705": 56, 60 "n07753592": 57, 61 "n03804744": 58, 62 "n03062245": 59, 63 "n07739125": 60, 64 "n02672831": 61, 65 "n02118333": 62, 66 "n02509815": 63, 67 "n04330267": 64, 68 "n03481172": 65, 69 "n02342885": 66, 70 "n01982650": 67, 71 "n02395003": 68, 72 "n01639765": 69, 73 "n02840245": 70, 74 "n02880940": 71, 75 "n01990800": 72, 76 "n02121808": 73, 77 "n02419796": 74, 78 "n04070727": 75, 79 "n07734744": 76, 80 "n03467517": 77, 81 "n02807133": 78, 82 "n04074963": 79, 83 "n04118538": 80, 84 "n04517823": 81, 85 "n03207941": 82, 86 "n03710721": 83, 87 "n02398521": 84, 88 "n01882714": 85, 89 "n03271574": 86, 90 "n02454379": 87, 91 "n03908618": 88, 92 "n02815834": 89, 93 "n02769748": 90, 94 "n04019541": 91, 95 "n01944390": 92, 96 "n02355227": 93, 97 "n04376876": 94, 98 "n03797390": 95, 99 "n03196217": 96, 100 "n03483316": 97, 101 "n03494278": 98, 102 "n03958227": 
99, 103 "n01784675": 100, 104 "n03790512": 101, 105 "n03476991": 102, 106 "n03124170": 103, 107 "n03255030": 104, 108 "n07718472": 105, 109 "n04554684": 106, 110 "n03636649": 107, 111 "n04270147": 108, 112 "n04317175": 109, 113 "n04118776": 110, 114 "n01662784": 111, 115 "n07615774": 112, 116 "n03770439": 113, 117 "n03642806": 114, 118 "n03400231": 115, 119 "n02787622": 116, 120 "n02129604": 117, 121 "n07749582": 118, 122 "n02503517": 119, 123 "n03633091": 120, 124 "n02346627": 121, 125 "n02879718": 122, 126 "n03211117": 123, 127 "n03950228": 124, 128 "n02268443": 125, 129 "n03445924": 126, 130 "n03991062": 127, 131 "n04371430": 128, 132 "n02870880": 129, 133 "n03838899": 130, 134 "n03134739": 131, 135 "n02802426": 132, 136 "n04141076": 133, 137 "n07753113": 134, 138 "n02062744": 135, 139 "n03063338": 136, 140 "n04332243": 137, 141 "n03314780": 138, 142 "n02129165": 139, 143 "n03720891": 140, 144 "n04336792": 141, 145 "n07880968": 142, 146 "n03372029": 143, 147 "n04252077": 144, 148 "n04442312": 145, 149 "n03535780": 146, 150 "n03017168": 147, 151 "n04039381": 148, 152 "n03676483": 149, 153 "n03394916": 150, 154 "n02510455": 151, 155 "n03584254": 152, 156 "n04254680": 153, 157 "n04591713": 154, 158 "n03337140": 155, 159 "n06874185": 156, 160 "n02219486": 157, 161 "n03141823": 158, 162 "n02076196": 159, 163 "n04254120": 160, 164 "n04487394": 161, 165 "n04252225": 162, 166 "n02834778": 163, 167 "n02951585": 164, 168 "n02374451": 165, 169 "n03110669": 166, 170 "n02444819": 167, 171 "n07873807": 168, 172 "n03814639": 169, 173 "n04536866": 170, 174 "n01443537": 171, 175 "n07693725": 172, 176 "n02828884": 173, 177 "n04004767": 174, 178 "n03793489": 175, 179 "n07695742": 176, 180 "n01776313": 177, 181 "n03188531": 178, 182 "n02883205": 179, 183 "n04356056": 180, 184 "n04542943": 181, 185 "n07747607": 182, 186 "n01770393": 183, 187 "n03764736": 184, 188 "n07745940": 185, 189 "n03916031": 186, 190 "n07583066": 187, 191 "n07720875": 188, 192 "n02786058": 189, 193 
"n02317335": 190, 194 "n04131690": 191, 195 "n03995372": 192, 196 "n02892767": 193, 197 "n07697100": 194, 198 "n04591157": 195, 199 "n04116512": 196, 200 "n03761084": 197, 201 "n03000684": 198, 202 "n03961711": 199, 203 "背景": 200 204 }, 205 "id2cat": { 206 "0": "n02958343", 207 "1": "n03085013", 208 "2": "n02084071", 209 "3": "n02992211", 210 "4": "n03001627", 211 "5": "n01503061", 212 "6": "n04409515", 213 "7": "n03759954", 214 "8": "n04468005", 215 "9": "n03495258", 216 "10": "n04228054", 217 "11": "n01910747", 218 "12": "n07753275", 219 "13": "n02924116", 220 "14": "n03109150", 221 "15": "n02324045", 222 "16": "n03942813", 223 "17": "n02484322", 224 "18": "n01674464", 225 "19": "n00007846", 226 "20": "n02777292", 227 "21": "n01726692", 228 "22": "n02691156", 229 "23": "n04540053", 230 "24": "n03445777", 231 "25": "n03928116", 232 "26": "n04026417", 233 "27": "n04256520", 234 "28": "n04530566", 235 "29": "n04392985", 236 "30": "n02970849", 237 "31": "n02391049", 238 "32": "n04023962", 239 "33": "n04509417", 240 "34": "n07768694", 241 "35": "n03128519", 242 "36": "n07718747", 243 "37": "n02131653", 244 "38": "n04379243", 245 "39": "n02274259", 246 "40": "n02766320", 247 "41": "n07714571", 248 "42": "n02799071", 249 "43": "n01495701", 250 "44": "n02402425", 251 "45": "n03513137", 252 "46": "n02437136", 253 "47": "n03908714", 254 "48": "n04154565", 255 "49": "n02445715", 256 "50": "n02764044", 257 "51": "n04557648", 258 "52": "n03249569", 259 "53": "n02165456", 260 "54": "n07697537", 261 "55": "n02206856", 262 "56": "n02411705", 263 "57": "n07753592", 264 "58": "n03804744", 265 "59": "n03062245", 266 "60": "n07739125", 267 "61": "n02672831", 268 "62": "n02118333", 269 "63": "n02509815", 270 "64": "n04330267", 271 "65": "n03481172", 272 "66": "n02342885", 273 "67": "n01982650", 274 "68": "n02395003", 275 "69": "n01639765", 276 "70": "n02840245", 277 "71": "n02880940", 278 "72": "n01990800", 279 "73": "n02121808", 280 "74": "n02419796", 281 "75": "n04070727", 282 
"76": "n07734744", 283 "77": "n03467517", 284 "78": "n02807133", 285 "79": "n04074963", 286 "80": "n04118538", 287 "81": "n04517823", 288 "82": "n03207941", 289 "83": "n03710721", 290 "84": "n02398521", 291 "85": "n01882714", 292 "86": "n03271574", 293 "87": "n02454379", 294 "88": "n03908618", 295 "89": "n02815834", 296 "90": "n02769748", 297 "91": "n04019541", 298 "92": "n01944390", 299 "93": "n02355227", 300 "94": "n04376876", 301 "95": "n03797390", 302 "96": "n03196217", 303 "97": "n03483316", 304 "98": "n03494278", 305 "99": "n03958227", 306 "100": "n01784675", 307 "101": "n03790512", 308 "102": "n03476991", 309 "103": "n03124170", 310 "104": "n03255030", 311 "105": "n07718472", 312 "106": "n04554684", 313 "107": "n03636649", 314 "108": "n04270147", 315 "109": "n04317175", 316 "110": "n04118776", 317 "111": "n01662784", 318 "112": "n07615774", 319 "113": "n03770439", 320 "114": "n03642806", 321 "115": "n03400231", 322 "116": "n02787622", 323 "117": "n02129604", 324 "118": "n07749582", 325 "119": "n02503517", 326 "120": "n03633091", 327 "121": "n02346627", 328 "122": "n02879718", 329 "123": "n03211117", 330 "124": "n03950228", 331 "125": "n02268443", 332 "126": "n03445924", 333 "127": "n03991062", 334 "128": "n04371430", 335 "129": "n02870880", 336 "130": "n03838899", 337 "131": "n03134739", 338 "132": "n02802426", 339 "133": "n04141076", 340 "134": "n07753113", 341 "135": "n02062744", 342 "136": "n03063338", 343 "137": "n04332243", 344 "138": "n03314780", 345 "139": "n02129165", 346 "140": "n03720891", 347 "141": "n04336792", 348 "142": "n07880968", 349 "143": "n03372029", 350 "144": "n04252077", 351 "145": "n04442312", 352 "146": "n03535780", 353 "147": "n03017168", 354 "148": "n04039381", 355 "149": "n03676483", 356 "150": "n03394916", 357 "151": "n02510455", 358 "152": "n03584254", 359 "153": "n04254680", 360 "154": "n04591713", 361 "155": "n03337140", 362 "156": "n06874185", 363 "157": "n02219486", 364 "158": "n03141823", 365 "159": "n02076196", 366 "160": 
"n04254120", 367 "161": "n04487394", 368 "162": "n04252225", 369 "163": "n02834778", 370 "164": "n02951585", 371 "165": "n02374451", 372 "166": "n03110669", 373 "167": "n02444819", 374 "168": "n07873807", 375 "169": "n03814639", 376 "170": "n04536866", 377 "171": "n01443537", 378 "172": "n07693725", 379 "173": "n02828884", 380 "174": "n04004767", 381 "175": "n03793489", 382 "176": "n07695742", 383 "177": "n01776313", 384 "178": "n03188531", 385 "179": "n02883205", 386 "180": "n04356056", 387 "181": "n04542943", 388 "182": "n07747607", 389 "183": "n01770393", 390 "184": "n03764736", 391 "185": "n07745940", 392 "186": "n03916031", 393 "187": "n07583066", 394 "188": "n07720875", 395 "189": "n02786058", 396 "190": "n02317335", 397 "191": "n04131690", 398 "192": "n03995372", 399 "193": "n02892767", 400 "194": "n07697100", 401 "195": "n04591157", 402 "196": "n04116512", 403 "197": "n03761084", 404 "198": "n03000684", 405 "199": "n03961711", 406 "200": "背景" 407 } 408 }

chenbinghui1 commented 2 years ago

@weveng Yes, that looks fine and fairly normal.

weveng commented 2 years ago

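When I run the test I get this error: TypeError: FCOSHead: __init__() got an unexpected keyword argument 'loss_weight'. The head section in the RLA config has this extra parameter. Could this be why the mAP starts from zero when training with load_from?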

chenbinghui1 commented 2 years ago

@weveng Definitely not. And this error shouldn't occur: look at the code in fcos_head.py, the keyword is there. https://github.com/chenbinghui1/DSL/blob/45ee8fd1bc267f8d9fb1763d4979d7b0a9efc989/mmdet/models/dense_heads/fcos_head.py#L58-L71

weveng commented 2 years ago

Indeed, I found these three parameters in the fcos_head class.

weveng commented 2 years ago

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/test.py", line 237, in <module>
    main()
  File "tools/test.py", line 186, in main
    model = build_detector(cfg.model, test_cfg=cfg.get('test_cfg'))
  File "/home/zhaoqiuyu/SoftTeacher/thirdparty/mmdetection/mmdet/models/builder.py", line 59, in build_detector
    cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/home/wangrui/opt/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 212, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/home/wangrui/opt/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/home/wangrui/opt/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: FCOS: FCOSHead: __init__() got an unexpected keyword argument 'loss_weight'

weveng commented 2 years ago

But it really does raise this error.

chenbinghui1 commented 2 years ago

@weveng Then try removing it, or debug it: print the relevant loss_weight value inside fcos_head and see what it is. It feels very strange; I can't guess what the cause is.

weveng commented 2 years ago

My computer has no discrete GPU; the code runs on a server, so I can't debug it.

weveng commented 2 years ago

I'll try removing it.

chenbinghui1 commented 2 years ago

@weveng It's usually a server anyway. You can just add a simple print and run it -_-

weveng commented 2 years ago

It works now.

weveng commented 2 years ago

After I commented out those three lines the error went away, and the test started.

weveng commented 2 years ago

After the test finished, it actually raised another error.

weveng commented 2 years ago

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20124/20121, 42.7 task/s, elapsed: 471s, ETA: 0s
Traceback (most recent call last):
  File "tools/test.py", line 237, in <module>
    main()
  File "tools/test.py", line 229, in main
    metric = dataset.evaluate(outputs, **eval_kwargs)
  File "/home/zhaoqiuyu/SoftTeacher/thirdparty/mmdetection/mmdet/datasets/coco.py", line 641, in evaluate
    result_files, tmp_dir = self.format_results(results, jsonfile_prefix)
  File "/home/zhaoqiuyu/SoftTeacher/thirdparty/mmdetection/mmdet/datasets/coco.py", line 383, in format_results
    result_files = self.results2json(results, jsonfile_prefix)
  File "/home/zhaoqiuyu/SoftTeacher/thirdparty/mmdetection/mmdet/datasets/coco.py", line 315, in results2json
    json_results = self._det2json(results)
  File "/home/zhaoqiuyu/SoftTeacher/thirdparty/mmdetection/mmdet/datasets/coco.py", line 252, in _det2json
    data['category_id'] = self.cat_ids[label]
IndexError: list index out of range
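The failing line, `self.cat_ids[label]`, indexes a list of category ids with a predicted label, so an IndexError means some label is at least as large as the length of that list. One possible cause, offered here only as an assumption, is that the dataset class resolves `cat_ids` by looking up its class names in the annotation file, so names that are not found leave the list shorter than `num_classes`. The helpers and sample data below are hypothetical and merely mimic that lookup in plain Python.

```python
def resolve_cat_ids(categories, class_names):
    """Mimic a name-based lookup: keep the ids of annotation categories
    whose name appears in class_names."""
    wanted = set(class_names)
    return [c["id"] for c in categories if c["name"] in wanted]


def first_out_of_range_label(cat_ids, num_classes):
    """Return the smallest label that would overflow cat_ids, or None."""
    return len(cat_ids) if num_classes > len(cat_ids) else None


# Hypothetical annotation categories using ImageNet-style names.
categories = [
    {"id": 1, "name": "n02958343"},
    {"id": 2, "name": "n03085013"},
    {"id": 3, "name": "n02084071"},
]

# Class names that do not occur in the annotation file resolve to nothing.
coco_style_names = ("person", "bicycle", "car")
ids_wrong = resolve_cat_ids(categories, coco_style_names)
print(ids_wrong, first_out_of_range_label(ids_wrong, num_classes=3))

# Matching class names resolve to one id per class.
matching_names = ("n02958343", "n03085013", "n02084071")
ids_right = resolve_cat_ids(categories, matching_names)
print(ids_right, first_out_of_range_label(ids_right, num_classes=3))
```

Printing `len(self.cat_ids)` next to the model's `num_classes` in the real dataset class would confirm or rule out this explanation.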

chenbinghui1 commented 2 years ago

@weveng It feels like the wrong environment is being used? This is SoftTeacher's coco.py.
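A quick way to check which installation of a package the interpreter would import is to ask the import system for its location. This is a small sketch using only the standard library; `module_origin` is a made-up helper name, and `mmdet` is simply the package whose path appears in the traceback above.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from,
    or None if it cannot be found in the current environment."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The standard-library json package is always available.
print(module_origin("json"))

# For mmdet this shows whether the DSL copy or another checkout
# (for example one under SoftTeacher/thirdparty) takes priority.
print(module_origin("mmdet"))
```

If the printed path points outside the DSL repository, the modified `FCOSHead` with the extra keywords is not the one being imported.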

weveng commented 2 years ago

This is the ImageNet dataset converted to COCO format.

chenbinghui1 commented 2 years ago

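@weveng Then I'm not sure. After all, the keyword was already wrong at the very start. I suggest setting your own data aside for now and trying COCO or VOC by following the README. If that still fails, then most likely the environment or something else is not set up correctly.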

weveng commented 2 years ago

OK, thank you.

weveng commented 2 years ago

Sorry to have troubled you all day.

chenbinghui1 commented 2 years ago

@weveng You're welcome. All the best, and get in touch if you have more questions~

weveng commented 2 years ago

OK.

weveng commented 2 years ago

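Hello, the earlier problem is solved. However, when I import the weights with resume_from, the mAP after the first epoch is a little over 25, but by the end of training it is only a little over 23. What is the cause? Is it resume_from, or the learning rate? I changed step to [11, 24, 28]; after resume_from the lr is 7e-4, and at the end it is 7e-5.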

weveng commented 2 years ago

# loss_weight = 1.0,
# soft_weight = 1.0,
# soft_warm_up = 5000,

Or is it the effect of these three commented-out lines?

chenbinghui1 commented 2 years ago

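@weveng Probably the learning rate. It seems too small; when a lot of new data is added, the lr must start from scratch. As for the three lines you commented out: loss_weight is the weight for the unlabeled data, soft_weight is the weight for Lscale, and soft_warm_up prevents Lscale from being applied right at the start and causing NaN.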
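To see why a resumed run can start with an already decayed learning rate, here is a minimal sketch of a step schedule. It assumes the usual rule for a `step` policy, where the base rate is multiplied by `gamma` once for every step boundary the current epoch has reached, with `gamma=0.1` and zero-based epochs; the function name `step_lr` is made up, and warmup is ignored.

```python
def step_lr(base_lr, epoch, steps, gamma=0.1):
    """Learning rate at a zero-based epoch under a step-decay schedule."""
    passed = sum(1 for s in steps if epoch >= s)  # boundaries already reached
    return base_lr * gamma ** passed


# Fresh run with the thread's base lr and step=[20, 26]: full rate at first.
print(step_lr(0.01, 0, [20, 26]))

# Continuing at epoch 12 with step=[11, 24, 28]: one decay already applied.
print(step_lr(0.01, 12, [11, 24, 28]))

# Late in the same schedule: two decays applied.
print(step_lr(0.01, 25, [11, 24, 28]))
```

Under these assumptions, a run that continues from a 12-epoch checkpoint with its epoch counter restored spends the whole semi-supervised stage at a reduced rate, whereas starting the counter from zero (for example via load_from) begins at the full base rate.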

weveng commented 2 years ago

So when training, is it better to set loss_weight to 1.0 or 3.0?

chenbinghui1 / DSL

how to inference #9
