hasanirtiza / Pedestron

[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021
https://openaccess.thecvf.com/content/CVPR2021/papers/Hasan_Generalizable_Pedestrian_Detection_The_Elephant_in_the_Room_CVPR_2021_paper.pdf
Apache License 2.0

Fine-tuning the retinanet-ResNeXt101 model loses the original weights, and the model can't recognize anything #70

Closed. mutengchen closed this issue 3 years ago

mutengchen commented 3 years ago

Dear author, how do I fine-tune the RetinaNet model? I built my own dataset following the train.json in the demo and trained on it with python tools/train.py retinanet_ResNeXt101.py, but the trained model cannot recognize any pedestrian in the picture.

retinanet_ResNext101.py

    model = dict(
        type='RetinaNet',
        pretrained='models_pretrained/epoch_7.pth.stu',
        backbone=dict(
            type='ResNeXt',
            depth=101,
            groups=64,
            base_width=4,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            style='pytorch'),
        neck=dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            start_level=1,
            add_extra_convs=True,
            num_outs=5),
        bbox_head=dict(
            type='RetinaHead',
            num_classes=81,
            in_channels=256,
            stacked_convs=4,
            feat_channels=256,
            octave_base_scale=4,
            scales_per_octave=3,
            anchor_ratios=[0.5, 1.0, 2.0],
            anchor_strides=[8, 16, 32, 64, 128],
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0],
            loss_cls=dict(
                type='FocalLoss',
                use_sigmoid=True,
                gamma=2.0,
                alpha=0.25,
                loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0)))

    # training and testing settings
    train_cfg = dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.4,
            min_pos_iou=0,
            ignore_iof_thr=-1),
        allowed_border=-1,
        pos_weight=-1,
        debug=False)
    test_cfg = dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_thr=0.5),
        max_per_img=100)

    # dataset settings
    dataset_type = 'CocoDataset'
    data_root = 'datasets/CityPersons/'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    data = dict(
        imgs_per_gpu=1,
        workers_per_gpu=5,
        train=dict(
            type=dataset_type,
            ann_file=data_root + 'train.json',
            img_prefix=data_root,
            img_scale=[(1216, 608), (2048, 1024)],
            multiscale_mode='range',
            img_norm_cfg=img_norm_cfg,
            size_divisor=32,
            flip_ratio=0.5,
            with_mask=False,
            with_crowd=True,
            with_label=True,
            extra_aug=dict(
                photo_metric_distortion=dict(
                    brightness_delta=180,
                    contrast_range=(0.5, 1.5),
                    saturation_range=(0.5, 1.5),
                    hue_delta=18),
                random_crop=dict(
                    min_ious=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
                    min_crop_size=0.1),
            ),
        ),
        test=dict(
            type=dataset_type,
            ann_file=data_root + 'annotations/val_gt_for_mmdetction.json',
            # img_prefix=data_root,
            img_prefix=data_root + '/leftImg8bit_trainvaltest/leftImg8bit/val_all_in_folder/',
            img_scale=(2048, 1024),
            img_norm_cfg=img_norm_cfg,
            size_divisor=32,
            flip_ratio=0,
            with_mask=False,
            with_label=False,
            test_mode=True))

    # optimizer
    mean_teacher = True
    optimizer = dict(type='SGD', lr=0.0, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(
        grad_clip=dict(max_norm=35, norm_type=2),
        mean_teacher=dict(alpha=0.999))

    # learning policy
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=1.0 / 3,
        step=[8, 11])
    checkpoint_config = dict(interval=1)

    # yapf:disable
    log_config = dict(
        interval=1,
        hooks=[
            dict(type='TextLoggerHook'),
            # dict(type='TensorboardLoggerHook')
        ])
    # yapf:enable

    # runtime settings
    total_epochs = 5
    device_ids = range(8)
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    work_dir = './work_dirs/retinanet_x101_64x4d_fpn_1x'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]

train.json

    {
      "info": {
        "description": "Example Dataset",
        "url": "https://github.com/waspinator/pycococreator",
        "version": "0.1.0",
        "year": 2019,
        "contributor": "ljp",
        "date_created": "2019-07-25 11:20:43.195866"
      },
      "licenses": [
        {
          "id": 1,
          "name": "Attribution-NonCommercial-ShareAlike License",
          "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/"
        }
      ],
      "categories": [
        {"id": 1, "name": "pedestrain", "supercategory": "pedestrain"}
      ],
      "images": [
        {
          "id": 1,
          "file_name": "images/person_2.jpg",
          "width": 1920,
          "height": 1080,
          "date_captured": "2013-11-14 11:18:45",
          "license": 1,
          "coco_url": "",
          "flickr_url": ""
        }
      ],
      "annotations": [
        {
          "id": 1,
          "image_id": 1,
          "category_id": 1,
          "iscrowd": false,
          "bbox": [100, 100, 100, 100],
          "width": 1920,
          "height": 1080
        }
      ]
    }

hasanirtiza commented 3 years ago

For starters, refer to #27 to see how you should fine-tune a model. In short, you should not change the field pretrained: remove pretrained='models_pretrained/epoch_7.pth.stu' and keep the pretrained value as provided by the repo. Instead, there is a field in the runtime settings called load_from, which is set to None by default. Pass the path of the pre-trained model there. In your case, it should be load_from = 'models_pretrained/epoch_7.pth.stu'
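The change described above can be sketched in plain Python, treating the config as a nested dict. This is only an illustration, not code from the repo: the helper name move_checkpoint_to_load_from is hypothetical, and REPO_PRETRAINED is a placeholder for whatever value the repo's original config ships with, which is not shown in this thread.

```python
import copy

# Placeholder (assumption): substitute the `pretrained` value from the
# repo's original, unmodified config file.
REPO_PRETRAINED = "<pretrained value from the repo's original config>"


def move_checkpoint_to_load_from(cfg, repo_pretrained):
    """Hypothetical helper: return a copy of cfg in which the fine-tuning
    checkpoint is passed through `load_from` and `model.pretrained` is
    reset to the repo's original value."""
    fixed = copy.deepcopy(cfg)
    checkpoint = fixed["model"]["pretrained"]
    fixed["model"]["pretrained"] = repo_pretrained
    fixed["load_from"] = checkpoint
    return fixed


# Only the two relevant fields of the config posted in the question.
posted = {
    "model": {
        "type": "RetinaNet",
        "pretrained": "models_pretrained/epoch_7.pth.stu",
    },
    "load_from": None,
}

fixed = move_checkpoint_to_load_from(posted, REPO_PRETRAINED)
print("pretrained:", fixed["model"]["pretrained"])
print("load_from:", fixed["load_from"])
```

In the actual config file the same edit is two lines: restore pretrained inside model = dict(...) and set load_from in the runtime settings.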