About reproducing the results of meta-rcnn

JulioZhao97 commented 2 years ago

When trying to reproduce the results of meta-rcnn and TFA under the 1-shot setting of split1, I find that my reproduced meta-rcnn results are much higher than the reported ones, which is confusing. In the meta-rcnn paper (19.9 is the result I expected to get):

In the TFA paper:

The papers report 19.9 for split1 under the 1-shot setting, but my results are much higher:

base training: mAP is 76.2
fine-tuning: all classes 47.40, novel classes 38.80, base classes 50.53

These are much higher than the results in the paper, which is confusing. Besides, the results in the README.md of meta-rcnn are even higher:

Under the split1 1-shot setting, the TFA result I get is 40.4, which is basically the same as the number reported in the paper.

Could you please kindly answer my questions?
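For reference, the three fine-tuning numbers are averages over different class subsets, so it matters which of them is compared with a number from a paper. A minimal sketch of how the three averages relate, assuming per-class AP values are available as a dict. The helper name `split_map`, the class names, and the AP values below are made up for illustration and are not my actual results:

```python
from statistics import mean


def split_map(per_class_ap, novel_classes):
    """Average per-class AP over all, base, and novel classes.

    per_class_ap: dict mapping class name -> AP (hypothetical values here).
    novel_classes: collection of class names treated as novel.
    """
    novel = [ap for name, ap in per_class_ap.items() if name in novel_classes]
    base = [ap for name, ap in per_class_ap.items() if name not in novel_classes]
    return {
        "all": mean(novel + base),
        "base": mean(base),
        "novel": mean(novel),
    }


# Illustrative numbers only: three "base" classes and two "novel" classes.
example_ap = {
    "aeroplane": 0.5,
    "bicycle": 0.6,
    "boat": 0.7,
    "bird": 0.2,
    "bus": 0.4,
}
result = split_map(example_ap, novel_classes={"bird", "bus"})
print(result)
```

With many more base classes than novel classes, the "all" average is dominated by the base classes, so it can sit far above the novel-class average.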

JulioZhao97 commented 2 years ago

Here is my config file for base training of meta-rcnn:

img_norm_cfg = dict(
    mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
train_multi_pipelines = dict(
    query=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(
            type='Normalize',
            mean=[103.53, 116.28, 123.675],
            std=[1.0, 1.0, 1.0],
            to_rgb=False),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ],
    support=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(
            type='Normalize',
            mean=[103.53, 116.28, 123.675],
            std=[1.0, 1.0, 1.0],
            to_rgb=False),
        dict(type='GenerateMask', target_size=(224, 224)),
        dict(type='RandomFlip', flip_ratio=0.0),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ])
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 600),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[103.53, 116.28, 123.675],
                std=[1.0, 1.0, 1.0],
                to_rgb=False),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data_root = 'data/VOCdevkit/'
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='NWayKShotDataset',
        num_support_ways=15,
        num_support_shots=1,
        one_support_shot_per_image=True,
        num_used_support_shots=200,
        save_dataset=False,
        dataset=dict(
            type='FewShotVOCDataset',
            ann_cfg=[
                dict(
                    type='ann_file',
                    ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt'),
                dict(
                    type='ann_file',
                    ann_file='data/VOCdevkit/VOC2012/ImageSets/Main/trainval.txt')
            ],
            img_prefix='data/VOCdevkit/',
            multi_pipelines=dict(
                query=[
                    dict(type='LoadImageFromFile'),
                    dict(type='LoadAnnotations', with_bbox=True),
                    dict(
                        type='Resize', img_scale=(1000, 600), keep_ratio=True),
                    dict(type='RandomFlip', flip_ratio=0.5),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='DefaultFormatBundle'),
                    dict(
                        type='Collect',
                        keys=['img', 'gt_bboxes', 'gt_labels'])
                ],
                support=[
                    dict(type='LoadImageFromFile'),
                    dict(type='LoadAnnotations', with_bbox=True),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='GenerateMask', target_size=(224, 224)),
                    dict(type='RandomFlip', flip_ratio=0.0),
                    dict(type='DefaultFormatBundle'),
                    dict(
                        type='Collect',
                        keys=['img', 'gt_bboxes', 'gt_labels'])
                ]),
            classes='BASE_CLASSES_SPLIT1',
            use_difficult=True,
            instance_wise=False,
            dataset_name='query_dataset'),
        support_dataset=dict(
            type='FewShotVOCDataset',
            ann_cfg=[
                dict(
                    type='ann_file',
                    ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt'),
                dict(
                    type='ann_file',
                    ann_file='data/VOCdevkit/VOC2012/ImageSets/Main/trainval.txt')
            ],
            img_prefix='data/VOCdevkit/',
            multi_pipelines=dict(
                query=[
                    dict(type='LoadImageFromFile'),
                    dict(type='LoadAnnotations', with_bbox=True),
                    dict(
                        type='Resize', img_scale=(1000, 600), keep_ratio=True),
                    dict(type='RandomFlip', flip_ratio=0.5),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='DefaultFormatBundle'),
                    dict(
                        type='Collect',
                        keys=['img', 'gt_bboxes', 'gt_labels'])
                ],
                support=[
                    dict(type='LoadImageFromFile'),
                    dict(type='LoadAnnotations', with_bbox=True),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='GenerateMask', target_size=(224, 224)),
                    dict(type='RandomFlip', flip_ratio=0.0),
                    dict(type='DefaultFormatBundle'),
                    dict(
                        type='Collect',
                        keys=['img', 'gt_bboxes', 'gt_labels'])
                ]),
            classes='BASE_CLASSES_SPLIT1',
            use_difficult=False,
            instance_wise=False,
            dataset_name='support_dataset')),
    val=dict(
        type='FewShotVOCDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/test.txt')
        ],
        img_prefix='data/VOCdevkit/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1000, 600),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        classes='BASE_CLASSES_SPLIT1'),
    test=dict(
        type='FewShotVOCDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/test.txt')
        ],
        img_prefix='data/VOCdevkit/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1000, 600),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        test_mode=True,
        classes='BASE_CLASSES_SPLIT1'),
    model_init=dict(
        copy_from_train_dataset=True,
        samples_per_gpu=16,
        workers_per_gpu=1,
        type='FewShotVOCDataset',
        ann_cfg=None,
        img_prefix='data/VOCdevkit/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='Normalize',
                mean=[103.53, 116.28, 123.675],
                std=[1.0, 1.0, 1.0],
                to_rgb=False),
            dict(type='GenerateMask', target_size=(224, 224)),
            dict(type='RandomFlip', flip_ratio=0.0),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        use_difficult=False,
        instance_wise=True,
        classes='BASE_CLASSES_SPLIT1',
        dataset_name='model_init_dataset'))
evaluation = dict(interval=6000, metric='mAP')
optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=100,
    warmup_ratio=0.001,
    step=[16000])
runner = dict(type='IterBasedRunner', max_iters=18000)
norm_cfg = dict(type='BN', requires_grad=False)
pretrained = 'open-mmlab://detectron2/resnet101_caffe'
model = dict(
    type='MetaRCNN',
    pretrained='open-mmlab://detectron2/resnet101_caffe',
    backbone=dict(
        type='ResNetWithMetaConv',
        depth=101,
        num_stages=3,
        strides=(1, 2, 2),
        dilations=(1, 1, 1),
        out_indices=(2, ),
        frozen_stages=2,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='caffe'),
    rpn_head=dict(
        type='RPNHead',
        in_channels=1024,
        feat_channels=512,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[2, 4, 8, 16, 32],
            ratios=[0.5, 1.0, 2.0],
            scale_major=False,
            strides=[16]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    roi_head=dict(
        type='MetaRCNNRoIHead',
        shared_head=dict(
            type='MetaRCNNResLayer',
            pretrained='open-mmlab://detectron2/resnet101_caffe',
            depth=50,
            stage=3,
            stride=2,
            dilation=1,
            style='caffe',
            norm_cfg=dict(type='BN', requires_grad=False),
            norm_eval=True),
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
            out_channels=1024,
            featmap_strides=[16]),
        bbox_head=dict(
            type='MetaBBoxHead',
            with_avg_pool=False,
            roi_feat_size=1,
            in_channels=2048,
            num_classes=15,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', loss_weight=1.0),
            num_meta_classes=15,
            meta_cls_in_channels=2048,
            with_meta_cls_loss=True,
            loss_meta=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        aggregation_layer=dict(
            type='AggregationLayer',
            aggregator_cfgs=[
                dict(
                    type='DotProductAggregator',
                    in_channels=2048,
                    with_fc=False)
            ])),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=12000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=128,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)),
    test_cfg=dict(
        rpn=dict(
            nms_pre=6000,
            max_per_img=300,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.05,
            nms=dict(type='nms', iou_threshold=0.3),
            max_per_img=100)))
checkpoint_config = dict(interval=6000)
log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook'),
           dict(type='TensorboardLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
use_infinite_sampler = True
seed = 42
work_dir = './work_dirs/meta-rcnn_r101_c4_8xb4_voc-split1_base-training'
gpu_ids = range(0, 4)
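One detail that may matter when comparing against reference numbers: the `work_dir` name contains `8xb4`, which I read as 8 GPUs with 4 samples per GPU (an assumption about the naming convention), while this run uses `gpu_ids = range(0, 4)` and `samples_per_gpu=2`. A minimal sketch of the arithmetic follows. The helper names are hypothetical, and linear learning-rate scaling is a common heuristic, not something this repo is confirmed to apply:

```python
def effective_batch_size(num_gpus: int, samples_per_gpu: int) -> int:
    """Number of query images consumed per optimizer step across all GPUs."""
    return num_gpus * samples_per_gpu


def linearly_scaled_lr(base_lr: float, base_batch: int, actual_batch: int) -> float:
    """Scale a learning rate in proportion to the batch size (heuristic)."""
    return base_lr * actual_batch / base_batch


# Assumed meaning of "8xb4" in the work_dir name: 8 GPUs x 4 samples per GPU.
reference_batch = effective_batch_size(num_gpus=8, samples_per_gpu=4)

# Values taken from the base-training config: gpu_ids = range(0, 4), samples_per_gpu=2.
actual_batch = effective_batch_size(num_gpus=len(range(0, 4)), samples_per_gpu=2)

# Only meaningful if lr=0.005 was tuned for the reference batch size (an assumption).
suggested_lr = linearly_scaled_lr(0.005, reference_batch, actual_batch)

print(f"reference batch: {reference_batch}, actual batch: {actual_batch}")
print(f"linearly scaled lr: {suggested_lr}")
```

If the two batch sizes differ, the training schedule is not identical to the reference one, which is worth keeping in mind when comparing numbers.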

JulioZhao97 commented 2 years ago

Here is my log file from the fine-tuning stage:

Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
fatal: Not a git repository (or any parent up to mount point /opt/data/nfs)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: Not a git repository (or any parent up to mount point /opt/data/nfs)
fatal: Not a git repository (or any parent up to mount point /opt/data/nfs)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: Not a git repository (or any parent up to mount point /opt/data/nfs)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2021-12-13 14:48:16,053 - mmfewshot - INFO - Environment info:

sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3: GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.7.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 10.1
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
CuDNN 7.6.3
Magma 2.5.2
Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.0
OpenCV: 4.5.4
MMCV: 1.4.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
MMDetection: 2.19.0+

2021-12-13 14:48:16,591 - mmfewshot - INFO - Distributed training: True
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/backbones/resnet.py:401: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is deprecated, '
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/backbones/resnet.py:401: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is deprecated, '
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/backbones/resnet.py:401: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is deprecated, '
2021-12-13 14:48:17,084 - mmfewshot - INFO - Config:
img_norm_cfg = dict(
    mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
train_multi_pipelines = dict(
    query=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(
            type='Normalize',
            mean=[103.53, 116.28, 123.675],
            std=[1.0, 1.0, 1.0],
            to_rgb=False),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ],
    support=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(
            type='Normalize',
            mean=[103.53, 116.28, 123.675],
            std=[1.0, 1.0, 1.0],
            to_rgb=False),
        dict(type='GenerateMask', target_size=(224, 224)),
        dict(type='RandomFlip', flip_ratio=0.0),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ])
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 600),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[103.53, 116.28, 123.675],
                std=[1.0, 1.0, 1.0],
                to_rgb=False),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data_root = 'data/VOCdevkit/'
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=2,
    train=dict(
        type='NWayKShotDataset',
        num_support_ways=20,
        num_support_shots=1,
        one_support_shot_per_image=False,
        num_used_support_shots=30,
        save_dataset=True,
        dataset=dict(
            type='FewShotVOCDefaultDataset',
            ann_cfg=[dict(method='MetaRCNN', setting='SPLIT1_1SHOT')],
            img_prefix='data/VOCdevkit/',
            multi_pipelines=dict(
                query=[
                    dict(type='LoadImageFromFile'),
                    dict(type='LoadAnnotations', with_bbox=True),
                    dict(
                        type='Resize', img_scale=(1000, 600), keep_ratio=True),
                    dict(type='RandomFlip', flip_ratio=0.5),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(
                        type='Collect',
                        keys=['img', 'gt_bboxes', 'gt_labels'])
                ],
                support=[
                    dict(type='LoadImageFromFile'),
                    dict(type='LoadAnnotations', with_bbox=True),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='GenerateMask', target_size=(224, 224)),
                    dict(type='RandomFlip', flip_ratio=0.0),
                    dict(type='DefaultFormatBundle'),
                    dict(
                        type='Collect',
                        keys=['img', 'gt_bboxes', 'gt_labels'])
                ]),
            classes='ALL_CLASSES_SPLIT1',
            use_difficult=False,
            instance_wise=False,
            dataset_name='query_support_dataset',
            num_novel_shots=1,
            num_base_shots=1)),
    val=dict(
        type='FewShotVOCDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/test.txt')
        ],
        img_prefix='data/VOCdevkit/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1000, 600),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        classes='ALL_CLASSES_SPLIT1'),
    test=dict(
        type='FewShotVOCDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/test.txt')
        ],
        img_prefix='data/VOCdevkit/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1000, 600),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        test_mode=True,
        classes='ALL_CLASSES_SPLIT1'),
    model_init=dict(
        copy_from_train_dataset=True,
        samples_per_gpu=16,
        workers_per_gpu=1,
        type='FewShotVOCDataset',
        ann_cfg=None,
        img_prefix='data/VOCdevkit/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='Normalize',
                mean=[103.53, 116.28, 123.675],
                std=[1.0, 1.0, 1.0],
                to_rgb=False),
            dict(type='GenerateMask', target_size=(224, 224)),
            dict(type='RandomFlip', flip_ratio=0.0),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        use_difficult=False,
        instance_wise=True,
        num_novel_shots=None,
        classes='ALL_CLASSES_SPLIT1',
        dataset_name='model_init_dataset'))
evaluation = dict(
    interval=50,
    metric='mAP',
    class_splits=['BASE_CLASSES_SPLIT1', 'NOVEL_CLASSES_SPLIT1'])
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup=None,
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[60000, 80000])
runner = dict(type='IterBasedRunner', max_iters=100)
norm_cfg = dict(type='BN', requires_grad=False)
pretrained = 'open-mmlab://detectron2/resnet101_caffe'
model = dict(
    type='MetaRCNN',
    pretrained='open-mmlab://detectron2/resnet101_caffe',
    backbone=dict(
        type='ResNetWithMetaConv',
        depth=101,
        num_stages=3,
        strides=(1, 2, 2),
        dilations=(1, 1, 1),
        out_indices=(2, ),
        frozen_stages=2,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='caffe'),
    rpn_head=dict(
        type='RPNHead',
        in_channels=1024,
        feat_channels=512,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[2, 4, 8, 16, 32],
            ratios=[0.5, 1.0, 2.0],
            scale_major=False,
            strides=[16]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    roi_head=dict(
        type='MetaRCNNRoIHead',
        shared_head=dict(
            type='MetaRCNNResLayer',
            pretrained='open-mmlab://detectron2/resnet101_caffe',
            depth=50,
            stage=3,
            stride=2,
            dilation=1,
            style='caffe',
            norm_cfg=dict(type='BN', requires_grad=False),
            norm_eval=True),
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
            out_channels=1024,
            featmap_strides=[16]),
        bbox_head=dict(
            type='MetaBBoxHead',
            with_avg_pool=False,
            roi_feat_size=1,
            in_channels=2048,
            num_classes=20,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', loss_weight=1.0),
            num_meta_classes=20,
            meta_cls_in_channels=2048,
            with_meta_cls_loss=True,
            loss_meta=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        aggregation_layer=dict(
            type='AggregationLayer',
            aggregator_cfgs=[
                dict(
                    type='DotProductAggregator',
                    in_channels=2048,
                    with_fc=False)
            ])),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=12000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=128,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)),
    test_cfg=dict(
        rpn=dict(
            nms_pre=6000,
            max_per_img=300,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.05,
            nms=dict(type='nms', iou_threshold=0.3),
            max_per_img=100)),
    frozen_parameters=[
        'backbone', 'shared_head', 'rpn_head', 'aggregation_layer'
    ])
checkpoint_config = dict(interval=50)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'work_dirs/meta-rcnn_r101_c4_8xb4_voc-split1_base-training/latest.pth'
resume_from = None
workflow = [('train', 1)]
use_infinite_sampler = True
seed = 42
work_dir = './work_dirs/meta-rcnn_r101_c4_8xb4_voc-split1_1shot-fine-tuning'
gpu_ids = range(0, 4)

2021-12-13 14:48:17,084 - mmfewshot - INFO - Set random seed to 42, deterministic: False
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/backbones/resnet.py:401: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is deprecated, '
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/roi_heads/shared_heads/res_layer.py:54: UserWarning: DeprecationWarning: pretrained is a deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is a deprecated, '
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/roi_heads/shared_heads/res_layer.py:54: UserWarning: DeprecationWarning: pretrained is a deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is a deprecated, '
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/roi_heads/shared_heads/res_layer.py:54: UserWarning: DeprecationWarning: pretrained is a deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is a deprecated, '
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/models/roi_heads/shared_heads/res_layer.py:54: UserWarning: DeprecationWarning: pretrained is a deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is a deprecated, '
2021-12-13 14:48:17,555 - mmfewshot - INFO - initialize ResNetWithMetaConv with init_cfg {'type': 'Pretrained', 'checkpoint': 'open-mmlab://detectron2/resnet101_caffe'}
2021-12-13 14:48:17,556 - mmcv - INFO - load model from: open-mmlab://detectron2/resnet101_caffe
2021-12-13 14:48:17,556 - mmcv - INFO - load checkpoint from openmmlab path: open-mmlab://detectron2/resnet101_caffe
2021-12-13 14:48:17,718 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: layer4.0.downsample.0.weight, layer4.0.downsample.1.bias, layer4.0.downsample.1.weight, layer4.0.downsample.1.running_mean, layer4.0.downsample.1.running_var, layer4.0.conv1.weight, layer4.0.bn1.bias, layer4.0.bn1.weight, layer4.0.bn1.running_mean, layer4.0.bn1.running_var, layer4.0.conv2.weight, layer4.0.bn2.bias, layer4.0.bn2.weight, layer4.0.bn2.running_mean, layer4.0.bn2.running_var, layer4.0.conv3.weight, layer4.0.bn3.bias, layer4.0.bn3.weight, layer4.0.bn3.running_mean, layer4.0.bn3.running_var, layer4.1.conv1.weight, layer4.1.bn1.bias, layer4.1.bn1.weight, layer4.1.bn1.running_mean, layer4.1.bn1.running_var, layer4.1.conv2.weight, layer4.1.bn2.bias, layer4.1.bn2.weight, layer4.1.bn2.running_mean, layer4.1.bn2.running_var, layer4.1.conv3.weight, layer4.1.bn3.bias, layer4.1.bn3.weight, layer4.1.bn3.running_mean, layer4.1.bn3.running_var, layer4.2.conv1.weight, layer4.2.bn1.bias, layer4.2.bn1.weight, layer4.2.bn1.running_mean, layer4.2.bn1.running_var, layer4.2.conv2.weight, layer4.2.bn2.bias, layer4.2.bn2.weight, layer4.2.bn2.running_mean, layer4.2.bn2.running_var, layer4.2.conv3.weight, layer4.2.bn3.bias, layer4.2.bn3.weight, layer4.2.bn3.running_mean, layer4.2.bn3.running_var

missing keys in source state_dict: meta_conv.weight

2021-12-13 14:48:17,776 - mmfewshot - INFO - initialize RPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01}
2021-12-13 14:48:17,838 - mmfewshot - INFO - initialize MetaRCNNResLayer with init_cfg {'type': 'Pretrained', 'checkpoint': 'open-mmlab://detectron2/resnet101_caffe'}
2021-12-13 14:48:17,838 - mmcv - INFO - load model from: open-mmlab://detectron2/resnet101_caffe
2021-12-13 14:48:17,838 - mmcv - INFO - load checkpoint from openmmlab path: open-mmlab://detectron2/resnet101_caffe
2021-12-13 14:48:17,957 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: conv1.weight, bn1.bias, bn1.weight, bn1.running_mean, bn1.running_var, layer1.0.downsample.0.weight, layer1.0.downsample.1.bias, layer1.0.downsample.1.weight, layer1.0.downsample.1.running_mean, layer1.0.downsample.1.running_var, layer1.0.conv1.weight, layer1.0.bn1.bias, layer1.0.bn1.weight, layer1.0.bn1.running_mean, layer1.0.bn1.running_var, layer1.0.conv2.weight, layer1.0.bn2.bias, layer1.0.bn2.weight, layer1.0.bn2.running_mean, layer1.0.bn2.running_var, layer1.0.conv3.weight, layer1.0.bn3.bias, layer1.0.bn3.weight, layer1.0.bn3.running_mean, layer1.0.bn3.running_var, layer1.1.conv1.weight, layer1.1.bn1.bias, layer1.1.bn1.weight, layer1.1.bn1.running_mean, layer1.1.bn1.running_var, layer1.1.conv2.weight, layer1.1.bn2.bias, layer1.1.bn2.weight, layer1.1.bn2.running_mean, layer1.1.bn2.running_var, layer1.1.conv3.weight, layer1.1.bn3.bias, layer1.1.bn3.weight, layer1.1.bn3.running_mean, layer1.1.bn3.running_var, layer1.2.conv1.weight, layer1.2.bn1.bias, layer1.2.bn1.weight, layer1.2.bn1.running_mean, layer1.2.bn1.running_var, layer1.2.conv2.weight, layer1.2.bn2.bias, layer1.2.bn2.weight, layer1.2.bn2.running_mean, layer1.2.bn2.running_var, layer1.2.conv3.weight, layer1.2.bn3.bias, layer1.2.bn3.weight, layer1.2.bn3.running_mean, layer1.2.bn3.running_var, layer2.0.downsample.0.weight, layer2.0.downsample.1.bias, layer2.0.downsample.1.weight, layer2.0.downsample.1.running_mean, layer2.0.downsample.1.running_var, layer2.0.conv1.weight, layer2.0.bn1.bias, layer2.0.bn1.weight, layer2.0.bn1.running_mean, layer2.0.bn1.running_var, layer2.0.conv2.weight, layer2.0.bn2.bias, layer2.0.bn2.weight, layer2.0.bn2.running_mean, layer2.0.bn2.running_var, layer2.0.conv3.weight, layer2.0.bn3.bias, layer2.0.bn3.weight, layer2.0.bn3.running_mean, layer2.0.bn3.running_var, layer2.1.conv1.weight, layer2.1.bn1.bias, layer2.1.bn1.weight, layer2.1.bn1.running_mean, layer2.1.bn1.running_var, layer2.1.conv2.weight, layer2.1.bn2.bias, layer2.1.bn2.weight, 
layer2.1.bn2.running_mean, layer2.1.bn2.running_var, layer2.1.conv3.weight, layer2.1.bn3.bias, layer2.1.bn3.weight, layer2.1.bn3.running_mean, layer2.1.bn3.running_var, layer2.2.conv1.weight, layer2.2.bn1.bias, layer2.2.bn1.weight, layer2.2.bn1.running_mean, layer2.2.bn1.running_var, layer2.2.conv2.weight, layer2.2.bn2.bias, layer2.2.bn2.weight, layer2.2.bn2.running_mean, layer2.2.bn2.running_var, layer2.2.conv3.weight, layer2.2.bn3.bias, layer2.2.bn3.weight, layer2.2.bn3.running_mean, layer2.2.bn3.running_var, layer2.3.conv1.weight, layer2.3.bn1.bias, layer2.3.bn1.weight, layer2.3.bn1.running_mean, layer2.3.bn1.running_var, layer2.3.conv2.weight, layer2.3.bn2.bias, layer2.3.bn2.weight, layer2.3.bn2.running_mean, layer2.3.bn2.running_var, layer2.3.conv3.weight, layer2.3.bn3.bias, layer2.3.bn3.weight, layer2.3.bn3.running_mean, layer2.3.bn3.running_var, layer3.0.downsample.0.weight, layer3.0.downsample.1.bias, layer3.0.downsample.1.weight, layer3.0.downsample.1.running_mean, layer3.0.downsample.1.running_var, layer3.0.conv1.weight, layer3.0.bn1.bias, layer3.0.bn1.weight, layer3.0.bn1.running_mean, layer3.0.bn1.running_var, layer3.0.conv2.weight, layer3.0.bn2.bias, layer3.0.bn2.weight, layer3.0.bn2.running_mean, layer3.0.bn2.running_var, layer3.0.conv3.weight, layer3.0.bn3.bias, layer3.0.bn3.weight, layer3.0.bn3.running_mean, layer3.0.bn3.running_var, layer3.1.conv1.weight, layer3.1.bn1.bias, layer3.1.bn1.weight, layer3.1.bn1.running_mean, layer3.1.bn1.running_var, layer3.1.conv2.weight, layer3.1.bn2.bias, layer3.1.bn2.weight, layer3.1.bn2.running_mean, layer3.1.bn2.running_var, layer3.1.conv3.weight, layer3.1.bn3.bias, layer3.1.bn3.weight, layer3.1.bn3.running_mean, layer3.1.bn3.running_var, layer3.2.conv1.weight, layer3.2.bn1.bias, layer3.2.bn1.weight, layer3.2.bn1.running_mean, layer3.2.bn1.running_var, layer3.2.conv2.weight, layer3.2.bn2.bias, layer3.2.bn2.weight, layer3.2.bn2.running_mean, layer3.2.bn2.running_var, layer3.2.conv3.weight, layer3.2.bn3.bias, 
layer3.2.bn3.weight, layer3.2.bn3.running_mean, layer3.2.bn3.running_var, layer3.3.conv1.weight, layer3.3.bn1.bias, layer3.3.bn1.weight, layer3.3.bn1.running_mean, layer3.3.bn1.running_var, layer3.3.conv2.weight, layer3.3.bn2.bias, layer3.3.bn2.weight, layer3.3.bn2.running_mean, layer3.3.bn2.running_var, layer3.3.conv3.weight, layer3.3.bn3.bias, layer3.3.bn3.weight, layer3.3.bn3.running_mean, layer3.3.bn3.running_var, layer3.4.conv1.weight, layer3.4.bn1.bias, layer3.4.bn1.weight, layer3.4.bn1.running_mean, layer3.4.bn1.running_var, layer3.4.conv2.weight, layer3.4.bn2.bias, layer3.4.bn2.weight, layer3.4.bn2.running_mean, layer3.4.bn2.running_var, layer3.4.conv3.weight, layer3.4.bn3.bias, layer3.4.bn3.weight, layer3.4.bn3.running_mean, layer3.4.bn3.running_var, layer3.5.conv1.weight, layer3.5.bn1.bias, layer3.5.bn1.weight, layer3.5.bn1.running_mean, layer3.5.bn1.running_var, layer3.5.conv2.weight, layer3.5.bn2.bias, layer3.5.bn2.weight, layer3.5.bn2.running_mean, layer3.5.bn2.running_var, layer3.5.conv3.weight, layer3.5.bn3.bias, layer3.5.bn3.weight, layer3.5.bn3.running_mean, layer3.5.bn3.running_var, layer3.6.conv1.weight, layer3.6.bn1.bias, layer3.6.bn1.weight, layer3.6.bn1.running_mean, layer3.6.bn1.running_var, layer3.6.conv2.weight, layer3.6.bn2.bias, layer3.6.bn2.weight, layer3.6.bn2.running_mean, layer3.6.bn2.running_var, layer3.6.conv3.weight, layer3.6.bn3.bias, layer3.6.bn3.weight, layer3.6.bn3.running_mean, layer3.6.bn3.running_var, layer3.7.conv1.weight, layer3.7.bn1.bias, layer3.7.bn1.weight, layer3.7.bn1.running_mean, layer3.7.bn1.running_var, layer3.7.conv2.weight, layer3.7.bn2.bias, layer3.7.bn2.weight, layer3.7.bn2.running_mean, layer3.7.bn2.running_var, layer3.7.conv3.weight, layer3.7.bn3.bias, layer3.7.bn3.weight, layer3.7.bn3.running_mean, layer3.7.bn3.running_var, layer3.8.conv1.weight, layer3.8.bn1.bias, layer3.8.bn1.weight, layer3.8.bn1.running_mean, layer3.8.bn1.running_var, layer3.8.conv2.weight, layer3.8.bn2.bias, layer3.8.bn2.weight, 
layer3.8.bn2.running_mean, layer3.8.bn2.running_var, layer3.8.conv3.weight, layer3.8.bn3.bias, layer3.8.bn3.weight, layer3.8.bn3.running_mean, layer3.8.bn3.running_var, layer3.9.conv1.weight, layer3.9.bn1.bias, layer3.9.bn1.weight, layer3.9.bn1.running_mean, layer3.9.bn1.running_var, layer3.9.conv2.weight, layer3.9.bn2.bias, layer3.9.bn2.weight, layer3.9.bn2.running_mean, layer3.9.bn2.running_var, layer3.9.conv3.weight, layer3.9.bn3.bias, layer3.9.bn3.weight, layer3.9.bn3.running_mean, layer3.9.bn3.running_var, layer3.10.conv1.weight, layer3.10.bn1.bias, layer3.10.bn1.weight, layer3.10.bn1.running_mean, layer3.10.bn1.running_var, layer3.10.conv2.weight, layer3.10.bn2.bias, layer3.10.bn2.weight, layer3.10.bn2.running_mean, layer3.10.bn2.running_var, layer3.10.conv3.weight, layer3.10.bn3.bias, layer3.10.bn3.weight, layer3.10.bn3.running_mean, layer3.10.bn3.running_var, layer3.11.conv1.weight, layer3.11.bn1.bias, layer3.11.bn1.weight, layer3.11.bn1.running_mean, layer3.11.bn1.running_var, layer3.11.conv2.weight, layer3.11.bn2.bias, layer3.11.bn2.weight, layer3.11.bn2.running_mean, layer3.11.bn2.running_var, layer3.11.conv3.weight, layer3.11.bn3.bias, layer3.11.bn3.weight, layer3.11.bn3.running_mean, layer3.11.bn3.running_var, layer3.12.conv1.weight, layer3.12.bn1.bias, layer3.12.bn1.weight, layer3.12.bn1.running_mean, layer3.12.bn1.running_var, layer3.12.conv2.weight, layer3.12.bn2.bias, layer3.12.bn2.weight, layer3.12.bn2.running_mean, layer3.12.bn2.running_var, layer3.12.conv3.weight, layer3.12.bn3.bias, layer3.12.bn3.weight, layer3.12.bn3.running_mean, layer3.12.bn3.running_var, layer3.13.conv1.weight, layer3.13.bn1.bias, layer3.13.bn1.weight, layer3.13.bn1.running_mean, layer3.13.bn1.running_var, layer3.13.conv2.weight, layer3.13.bn2.bias, layer3.13.bn2.weight, layer3.13.bn2.running_mean, layer3.13.bn2.running_var, layer3.13.conv3.weight, layer3.13.bn3.bias, layer3.13.bn3.weight, layer3.13.bn3.running_mean, layer3.13.bn3.running_var, layer3.14.conv1.weight, 
layer3.14.bn1.bias, layer3.14.bn1.weight, layer3.14.bn1.running_mean, layer3.14.bn1.running_var, layer3.14.conv2.weight, layer3.14.bn2.bias, layer3.14.bn2.weight, layer3.14.bn2.running_mean, layer3.14.bn2.running_var, layer3.14.conv3.weight, layer3.14.bn3.bias, layer3.14.bn3.weight, layer3.14.bn3.running_mean, layer3.14.bn3.running_var, layer3.15.conv1.weight, layer3.15.bn1.bias, layer3.15.bn1.weight, layer3.15.bn1.running_mean, layer3.15.bn1.running_var, layer3.15.conv2.weight, layer3.15.bn2.bias, layer3.15.bn2.weight, layer3.15.bn2.running_mean, layer3.15.bn2.running_var, layer3.15.conv3.weight, layer3.15.bn3.bias, layer3.15.bn3.weight, layer3.15.bn3.running_mean, layer3.15.bn3.running_var, layer3.16.conv1.weight, layer3.16.bn1.bias, layer3.16.bn1.weight, layer3.16.bn1.running_mean, layer3.16.bn1.running_var, layer3.16.conv2.weight, layer3.16.bn2.bias, layer3.16.bn2.weight, layer3.16.bn2.running_mean, layer3.16.bn2.running_var, layer3.16.conv3.weight, layer3.16.bn3.bias, layer3.16.bn3.weight, layer3.16.bn3.running_mean, layer3.16.bn3.running_var, layer3.17.conv1.weight, layer3.17.bn1.bias, layer3.17.bn1.weight, layer3.17.bn1.running_mean, layer3.17.bn1.running_var, layer3.17.conv2.weight, layer3.17.bn2.bias, layer3.17.bn2.weight, layer3.17.bn2.running_mean, layer3.17.bn2.running_var, layer3.17.conv3.weight, layer3.17.bn3.bias, layer3.17.bn3.weight, layer3.17.bn3.running_mean, layer3.17.bn3.running_var, layer3.18.conv1.weight, layer3.18.bn1.bias, layer3.18.bn1.weight, layer3.18.bn1.running_mean, layer3.18.bn1.running_var, layer3.18.conv2.weight, layer3.18.bn2.bias, layer3.18.bn2.weight, layer3.18.bn2.running_mean, layer3.18.bn2.running_var, layer3.18.conv3.weight, layer3.18.bn3.bias, layer3.18.bn3.weight, layer3.18.bn3.running_mean, layer3.18.bn3.running_var, layer3.19.conv1.weight, layer3.19.bn1.bias, layer3.19.bn1.weight, layer3.19.bn1.running_mean, layer3.19.bn1.running_var, layer3.19.conv2.weight, layer3.19.bn2.bias, layer3.19.bn2.weight, 
layer3.19.bn2.running_mean, layer3.19.bn2.running_var, layer3.19.conv3.weight, layer3.19.bn3.bias, layer3.19.bn3.weight, layer3.19.bn3.running_mean, layer3.19.bn3.running_var, layer3.20.conv1.weight, layer3.20.bn1.bias, layer3.20.bn1.weight, layer3.20.bn1.running_mean, layer3.20.bn1.running_var, layer3.20.conv2.weight, layer3.20.bn2.bias, layer3.20.bn2.weight, layer3.20.bn2.running_mean, layer3.20.bn2.running_var, layer3.20.conv3.weight, layer3.20.bn3.bias, layer3.20.bn3.weight, layer3.20.bn3.running_mean, layer3.20.bn3.running_var, layer3.21.conv1.weight, layer3.21.bn1.bias, layer3.21.bn1.weight, layer3.21.bn1.running_mean, layer3.21.bn1.running_var, layer3.21.conv2.weight, layer3.21.bn2.bias, layer3.21.bn2.weight, layer3.21.bn2.running_mean, layer3.21.bn2.running_var, layer3.21.conv3.weight, layer3.21.bn3.bias, layer3.21.bn3.weight, layer3.21.bn3.running_mean, layer3.21.bn3.running_var, layer3.22.conv1.weight, layer3.22.bn1.bias, layer3.22.bn1.weight, layer3.22.bn1.running_mean, layer3.22.bn1.running_var, layer3.22.conv2.weight, layer3.22.bn2.bias, layer3.22.bn2.weight, layer3.22.bn2.running_mean, layer3.22.bn2.running_var, layer3.22.conv3.weight, layer3.22.bn3.bias, layer3.22.bn3.weight, layer3.22.bn3.running_mean, layer3.22.bn3.running_var

2021-12-13 14:48:17,978 - mmfewshot - INFO - initialize MetaBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}]
2021-12-13 14:48:17,998 - mmfewshot - INFO - Frozen parameters: ['backbone', 'shared_head', 'rpn_head', 'aggregation_layer']
2021-12-13 14:48:18,001 - mmfewshot - INFO - Training parameters: roi_head.bbox_head.fc_cls.weight
2021-12-13 14:48:18,001 - mmfewshot - INFO - Training parameters: roi_head.bbox_head.fc_cls.bias
2021-12-13 14:48:18,001 - mmfewshot - INFO - Training parameters: roi_head.bbox_head.fc_reg.weight
2021-12-13 14:48:18,001 - mmfewshot - INFO - Training parameters: roi_head.bbox_head.fc_reg.bias
2021-12-13 14:48:18,001 - mmfewshot - INFO - Training parameters: roi_head.bbox_head.fc_meta.weight
2021-12-13 14:48:18,001 - mmfewshot - INFO - Training parameters: roi_head.bbox_head.fc_meta.bias
2021-12-13 14:48:18,008 - mmfewshot - INFO - FewShotVOCDefaultDataset query_support_dataset with number of images 17, and instance counts:
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
| category      | count | category         | count | category        | count | category       | count | category       | count |
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
| 0 [aeroplane] | 1     | 1 [bicycle]      | 1     | 2 [boat]        | 1     | 3 [bottle]     | 1     | 4 [car]        | 1     |
| 5 [cat]       | 1     | 6 [chair]        | 1     | 7 [diningtable] | 1     | 8 [dog]        | 1     | 9 [horse]      | 1     |
| 10 [person]   | 1     | 11 [pottedplant] | 1     | 12 [sheep]      | 1     | 13 [train]     | 1     | 14 [tvmonitor] | 1     |
| 15 [bird]     | 1     | 16 [bus]         | 1     | 17 [cow]        | 1     | 18 [motorbike] | 1     | 19 [sofa]      | 1     |
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
fatal: Not a git repository (or any parent up to mount point /opt/data/nfs)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2021-12-13 14:48:19,507 - mmfewshot - INFO - FewShotVOCDataset Test dataset with number of images 4952, and instance counts:
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
| category      | count | category         | count | category        | count | category       | count | category       | count |
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
| 0 [aeroplane] | 285   | 1 [bicycle]      | 337   | 2 [boat]        | 263   | 3 [bottle]     | 469   | 4 [car]        | 1201  |
| 5 [cat]       | 358   | 6 [chair]        | 756   | 7 [diningtable] | 206   | 8 [dog]        | 489   | 9 [horse]      | 348   |
| 10 [person]   | 4528  | 11 [pottedplant] | 480   | 12 [sheep]      | 242   | 13 [train]     | 282   | 14 [tvmonitor] | 308   |
| 15 [bird]     | 459   | 16 [bus]         | 213   | 17 [cow]        | 244   | 18 [motorbike] | 325   | 19 [sofa]      | 239   |
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
2021-12-13 14:48:19,510 - mmfewshot - INFO - FewShotVOCCopyDataset model_init_dataset with number of images 20, and instance counts:
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
| category      | count | category         | count | category        | count | category       | count | category       | count |
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
| 0 [aeroplane] | 1     | 1 [bicycle]      | 1     | 2 [boat]        | 1     | 3 [bottle]     | 1     | 4 [car]        | 1     |
| 5 [cat]       | 1     | 6 [chair]        | 1     | 7 [diningtable] | 1     | 8 [dog]        | 1     | 9 [horse]      | 1     |
| 10 [person]   | 1     | 11 [pottedplant] | 1     | 12 [sheep]      | 1     | 13 [train]     | 1     | 14 [tvmonitor] | 1     |
| 15 [bird]     | 1     | 16 [bus]         | 1     | 17 [cow]        | 1     | 18 [motorbike] | 1     | 19 [sofa]      | 1     |
+---------------+-------+------------------+-------+-----------------+-------+----------------+-------+----------------+-------+
2021-12-13 14:48:19,511 - mmfewshot - INFO - load checkpoint from local path: work_dirs/meta-rcnn_r101_c4_8xb4_voc-split1_base-training/latest.pth
2021-12-13 14:48:20,093 - mmfewshot - WARNING - The model and loaded state dict do not match exactly

size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([16, 2048]) from checkpoint, the shape in current model is torch.Size([21, 2048]).
size mismatch for roi_head.bbox_head.fc_cls.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([21]).
size mismatch for roi_head.bbox_head.fc_reg.weight: copying a param with shape torch.Size([60, 2048]) from checkpoint, the shape in current model is torch.Size([80, 2048]).
size mismatch for roi_head.bbox_head.fc_reg.bias: copying a param with shape torch.Size([60]) from checkpoint, the shape in current model is torch.Size([80]).
size mismatch for roi_head.bbox_head.fc_meta.weight: copying a param with shape torch.Size([15, 2048]) from checkpoint, the shape in current model is torch.Size([20, 2048]).
size mismatch for roi_head.bbox_head.fc_meta.bias: copying a param with shape torch.Size([15]) from checkpoint, the shape in current model is torch.Size([20]).
2021-12-13 14:48:20,102 - mmfewshot - INFO - Start running, host: zhaozhiyuan@admin.cluster.local, work_dir: /opt/data/nfs/zhaozhiyuan/mmfewshot-main/work_dirs/meta-rcnn_r101_c4_8xb4_voc-split1_1shot-fine-tuning
2021-12-13 14:48:20,102 - mmfewshot - INFO - Hooks will be executed in the following order:
before_run: (VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) CheckpointHook
(LOW ) QuerySupportDistEvalHook
(VERY_LOW ) TextLoggerHook

before_train_epoch: (VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(LOW ) QuerySupportDistEvalHook
(VERY_LOW ) TextLoggerHook

before_train_iter: (VERY_HIGH ) StepLrUpdaterHook
(LOW ) IterTimerHook
(LOW ) QuerySupportDistEvalHook

after_train_iter: (ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(LOW ) IterTimerHook
(LOW ) QuerySupportDistEvalHook
(VERY_LOW ) TextLoggerHook

after_train_epoch: (NORMAL ) CheckpointHook
(LOW ) QuerySupportDistEvalHook
(VERY_LOW ) TextLoggerHook

before_val_epoch: (NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_val_iter: (LOW ) IterTimerHook

after_val_iter: (LOW ) IterTimerHook

after_val_epoch: (VERY_LOW ) TextLoggerHook

after_run: (VERY_LOW ) TextLoggerHook

2021-12-13 14:48:20,102 - mmfewshot - INFO - workflow: [('train', 1)], max: 100 iters
2021-12-13 14:48:20,103 - mmfewshot - INFO - Checkpoints will be saved to /opt/data/nfs/zhaozhiyuan/mmfewshot-main/work_dirs/meta-rcnn_r101_c4_8xb4_voc-split1_1shot-fine-tuning by HardDiskBackend.
2021-12-13 14:48:31,525 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
2021-12-13 14:48:49,580 - mmfewshot - INFO - Saving checkpoint at 50 iterations
2021-12-13 14:48:50,770 - mmfewshot - INFO - Iter [50/100] lr: 1.000e-03, eta: 0:00:30, time: 0.608, data_time: 0.239, memory: 1321, loss_rpn_cls: 0.0521, loss_rpn_bbox: 0.0438, loss_cls: 0.8361, loss_bbox: 0.4429, acc: 79.0078, loss_meta_cls: 0.1514, meta_acc: 5.8000, loss: 1.5264
2021-12-13 14:48:50,780 - mmdet - INFO - starting model initialization...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 1.7 task/s, elapsed: 12s, ETA: 0s
2021-12-13 14:49:03,107 - mmdet - INFO - model initialization done.
[>>>>>>>>>>>>>>>>>>>>>>>>>>] 4952/4952, 45.0 task/s, elapsed: 110s, ETA: 0s

---------------iou_thr: 0.5---------------
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/core/evaluation/mean_ap.py:203: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
(np.zeros(gt_bboxes.shape[0], dtype=np.bool),
2021-12-13 14:51:37,388 - mmfewshot - INFO -
+-------------+------+-------+--------+-------+
| class       | gts  | dets  | recall | ap    |
+-------------+------+-------+--------+-------+
| aeroplane   | 285  | 2494  | 0.775  | 0.531 |
| bicycle     | 337  | 3821  | 0.825  | 0.731 |
| boat        | 263  | 8055  | 0.802  | 0.443 |
| bottle      | 469  | 9766  | 0.676  | 0.400 |
| car         | 1201 | 1666  | 0.648  | 0.467 |
| cat         | 358  | 2084  | 0.810  | 0.270 |
| chair       | 756  | 2095  | 0.442  | 0.142 |
| diningtable | 206  | 5     | 0.000  | 0.000 |
| dog         | 489  | 2523  | 0.879  | 0.539 |
| horse       | 348  | 5847  | 0.891  | 0.674 |
| person      | 4528 | 1969  | 0.206  | 0.131 |
| pottedplant | 480  | 845   | 0.331  | 0.107 |
| sheep       | 242  | 11581 | 0.872  | 0.355 |
| train       | 282  | 12782 | 0.911  | 0.419 |
| tvmonitor   | 308  | 14175 | 0.877  | 0.129 |
| bird        | 459  | 7784  | 0.660  | 0.273 |
| bus         | 213  | 9966  | 0.812  | 0.516 |
| cow         | 244  | 8817  | 0.898  | 0.136 |
| motorbike   | 325  | 3641  | 0.778  | 0.515 |
| sofa        | 239  | 7668  | 0.582  | 0.191 |
+-------------+------+-------+--------+-------+
| mAP         |      |       |        | 0.348 |
+-------------+------+-------+--------+-------+
2021-12-13 14:51:37,391 - mmfewshot - INFO - BASE_CLASSES_SPLIT1 mAP: 0.35582563281059265
2021-12-13 14:51:37,391 - mmfewshot - INFO - NOVEL_CLASSES_SPLIT1 mAP: 0.3261883556842804
2021-12-13 14:51:37,419 - mmfewshot - INFO - Iter(val) [1238] AP50: 0.3480, BASE_CLASSES_SPLIT1: AP50: 0.3560, NOVEL_CLASSES_SPLIT1: AP50: 0.3260, mAP: 0.3484
2021-12-13 14:51:57,238 - mmfewshot - INFO - Saving checkpoint at 100 iterations
2021-12-13 14:51:58,839 - mmfewshot - INFO - Iter [100/100] lr: 1.000e-03, eta: 0:00:00, time: 3.761, data_time: 3.363, memory: 1321, loss_rpn_cls: 0.0518, loss_rpn_bbox: 0.0434, loss_cls: 0.4260, loss_bbox: 0.3942, acc: 92.1953, loss_meta_cls: 0.1459, meta_acc: 21.8000, loss: 1.0613
2021-12-13 14:51:58,849 - mmdet - INFO - starting model initialization...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 1.7 task/s, elapsed: 12s, ETA: 0s
2021-12-13 14:52:11,032 - mmdet - INFO - model initialization done.
[>>>>>>>>>>>>>>>>>>>>>>>>>>] 4952/4952, 43.4 task/s, elapsed: 114s, ETA: 0s

---------------iou_thr: 0.5---------------
/opt/data/nfs/zhaozhiyuan/anaconda3/envs/mmfewshot/lib/python3.7/site-packages/mmdet/core/evaluation/mean_ap.py:203: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
(np.zeros(gt_bboxes.shape[0], dtype=np.bool),
2021-12-13 14:54:50,559 - mmfewshot - INFO -
+-------------+------+-------+--------+-------+
| class       | gts  | dets  | recall | ap    |
+-------------+------+-------+--------+-------+
| aeroplane   | 285  | 2295  | 0.814  | 0.619 |
| bicycle     | 337  | 3526  | 0.849  | 0.777 |
| boat        | 263  | 9395  | 0.833  | 0.543 |
| bottle      | 469  | 9012  | 0.736  | 0.514 |
| car         | 1201 | 1505  | 0.765  | 0.651 |
| cat         | 358  | 565   | 0.810  | 0.683 |
| chair       | 756  | 3612  | 0.626  | 0.267 |
| diningtable | 206  | 1     | 0.000  | 0.000 |
| dog         | 489  | 2665  | 0.906  | 0.694 |
| horse       | 348  | 7039  | 0.917  | 0.733 |
| person      | 4528 | 2932  | 0.540  | 0.477 |
| pottedplant | 480  | 1070  | 0.554  | 0.336 |
| sheep       | 242  | 10288 | 0.897  | 0.535 |
| train       | 282  | 12340 | 0.926  | 0.520 |
| tvmonitor   | 308  | 11854 | 0.883  | 0.203 |
| bird        | 459  | 9924  | 0.730  | 0.379 |
| bus         | 213  | 9967  | 0.808  | 0.545 |
| cow         | 244  | 9278  | 0.910  | 0.187 |
| motorbike   | 325  | 7021  | 0.831  | 0.596 |
| sofa        | 239  | 11472 | 0.736  | 0.231 |
+-------------+------+-------+--------+-------+
| mAP         |      |       |        | 0.474 |
+-------------+------+-------+--------+-------+
2021-12-13 14:54:50,564 - mmfewshot - INFO - BASE_CLASSES_SPLIT1 mAP: 0.5033199191093445
2021-12-13 14:54:50,564 - mmfewshot - INFO - NOVEL_CLASSES_SPLIT1 mAP: 0.3875313103199005
2021-12-13 14:54:50,599 - mmfewshot - INFO - Iter(val) [1238] AP50: 0.4740, BASE_CLASSES_SPLIT1: AP50: 0.5030, NOVEL_CLASSES_SPLIT1: AP50: 0.3880, mAP: 0.4744
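
For anyone cross-checking the numbers, the base/novel/overall figures at iteration 100 can be re-derived from the per-class APs in the last table. This is only a minimal sketch: the AP values are copied from the log (rounded to three decimals, so the averages agree with the logged ones only to about three decimals), and it assumes the split-1 novel classes are the last five rows (bird, bus, cow, motorbike, sofa). The names `ap50`, `NOVEL_SPLIT1` and `mean_ap` are made up for illustration and are not mmfewshot APIs.

```python
# Per-class AP at IoU 0.5, copied from the iteration-100 table in the log.
ap50 = {
    "aeroplane": 0.619, "bicycle": 0.777, "boat": 0.543, "bottle": 0.514,
    "car": 0.651, "cat": 0.683, "chair": 0.267, "diningtable": 0.000,
    "dog": 0.694, "horse": 0.733, "person": 0.477, "pottedplant": 0.336,
    "sheep": 0.535, "train": 0.520, "tvmonitor": 0.203,
    "bird": 0.379, "bus": 0.545, "cow": 0.187, "motorbike": 0.596,
    "sofa": 0.231,
}

# Assumption: novel classes of VOC split 1 are the last five table rows.
NOVEL_SPLIT1 = ("bird", "bus", "cow", "motorbike", "sofa")


def mean_ap(classes):
    """Unweighted mean of the per-class APs for the given class names."""
    classes = list(classes)
    return sum(ap50[c] for c in classes) / len(classes)


base_classes = [c for c in ap50 if c not in NOVEL_SPLIT1]
base_map = mean_ap(base_classes)
novel_map = mean_ap(NOVEL_SPLIT1)
all_map = mean_ap(ap50)

print(f"base={base_map:.4f} novel={novel_map:.4f} all={all_map:.4f}")
```

Note that diningtable has only 1 detection and an AP of 0.000, which pulls the base-class average down noticeably.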

linyq17 commented 2 years ago

Hi, I notice that you used 4 GPUs for training, which differs from our released model (8 GPUs). Currently, the paper results cannot be reproduced exactly because of the differently selected novel data and randomness. Besides, the performance of base training and few-shot fine-tuning can be unstable, even with the same random seed. To reproduce the reported few-shot results, we highly recommend using the released model for few-shot fine-tuning.
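
To make the GPU-count difference concrete, here is a rough sketch of how the effective batch size changes between an 8-GPU and a 4-GPU run. It assumes the `8xb4` part of the config name means 8 GPUs with 4 samples each, and it applies the common linear learning-rate scaling heuristic, which nobody in this thread has confirmed is needed here. The helper names are hypothetical, and the learning rate of 1e-3 is simply the value printed in the fine-tuning log.

```python
def effective_batch_size(num_gpus: int, samples_per_gpu: int) -> int:
    """Total number of samples consumed per optimizer step."""
    return num_gpus * samples_per_gpu


def linearly_scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear scaling heuristic: keep lr / batch_size constant."""
    return base_lr * new_batch / base_batch


# Assumption: the released model used 8 GPUs x 4 samples per GPU.
released_batch = effective_batch_size(num_gpus=8, samples_per_gpu=4)
# Assumption: the local run keeps 4 samples per GPU but only has 4 GPUs.
local_batch = effective_batch_size(num_gpus=4, samples_per_gpu=4)

# 1e-3 is the lr shown in the fine-tuning log; treat it as a placeholder.
scaled_lr = linearly_scaled_lr(1e-3, released_batch, local_batch)
print(released_batch, local_batch, scaled_lr)
```

If the per-GPU batch size also differs from the released setup, the gap in effective batch size is larger still.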

JulioZhao97 commented 2 years ago

Thank you for your reply. But my real question is: why are the results reported in the README.md of Meta-RCNN much higher than the results in the paper? That is very confusing. If the split1/1-shot experiments of Meta-RCNN and TFA in their papers were conducted with different selected novel data and randomness, do the results you reproduced imply that Meta-RCNN is better than TFA?

JulioZhao97 commented 2 years ago

I ran TFA and Meta-RCNN on exactly the same data from my own dataset, and the results show that Meta-RCNN is better.

linyq17 commented 2 years ago

The results reported in the paper were obtained with only one GPU and with different fine-tuning strategies. You can refer to the README.md of Meta-RCNN for more implementation details. The limited results do indicate that Meta-RCNN is better than TFA in some low-shot situations. In my view, though, the generalization ability of FSOD models can hardly be evaluated thoroughly from such limited observations.

JulioZhao97 commented 2 years ago

I know, thank you for your patience~

ztyxd commented 2 years ago

@JulioZhao97

The Meta-RCNN results I get are also better than the TFA results. I think you will get even better Meta-RCNN results if you train for a larger number of iterations.
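
For reference, a longer fine-tuning schedule could be expressed as config overrides along these lines. This is only a sketch: it assumes the fine-tuning config uses the usual MMCV-style `runner`, `evaluation` and `checkpoint_config` fields, the helper `longer_schedule` is made up for illustration, and 500 iterations is an arbitrary example rather than a recommended value. With `max_iters=100` and two evaluations it reproduces the schedule visible in the log above (evaluation and checkpoint every 50 iterations).

```python
def longer_schedule(max_iters: int, num_evals: int = 2) -> dict:
    """Build MMCV-style override dicts for an iteration-based schedule.

    Evaluation and checkpointing are spaced evenly so that the run is
    evaluated `num_evals` times, the last time at `max_iters`.
    """
    interval = max_iters // num_evals
    return dict(
        runner=dict(max_iters=max_iters),
        evaluation=dict(interval=interval),
        checkpoint_config=dict(interval=interval),
    )


# Schedule matching the log: 100 iterations, evaluated at 50 and 100.
logged = longer_schedule(max_iters=100, num_evals=2)

# Hypothetical longer run: 500 iterations, evaluated every 100.
overrides = longer_schedule(max_iters=500, num_evals=5)
print(overrides)
```

Evaluating several times during the longer run makes it easier to see whether novel-class AP keeps improving or starts to trade off against base-class AP.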

open-mmlab / mmfewshot

about result reimplementation of meta-rcnn #25

TorchVision: 0.8.0
OpenCV: 4.5.4
MMCV: 1.4.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
MMDetection: 2.19.0+