GuangxingHan / FCT

Code for CVPR 2022 Oral paper: 'Few-Shot Object Detection with Fully Cross-Transformer'
72 stars 13 forks

Can you help me see why this problem occurs #23

Closed: ThelilinNB closed this issue 1 month ago

ThelilinNB commented 6 months ago
When I changed the data set to FSOD, the following results appeared:

AP     AP50   AP75   bAP    bAP50  bAP75  nAP    nAP50  nAP75
0.001  0.002  0.001  0.001  0.002  0.001  0.000  0.000  0.000
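
The same numbers appear unrounded in the `copypaste` lines of the evaluation log. For comparing runs, a minimal sketch that pulls them into a dict, assuming the log keeps the `copypaste: <header>` / `copypaste: <values>` layout seen in this issue. `parse_copypaste` is a hypothetical helper written for illustration; it is not part of detectron2 or FCT.

```python
import re


def parse_copypaste(log_text):
    """Return {metric_name: value} from detectron2-style 'copypaste:' log lines.

    Assumes the first comma-separated copypaste entry is the header row and
    the second is the matching row of values (hypothetical helper).
    """
    # Capture what follows each "copypaste: " marker, stopping at the next
    # "[MM/DD " timestamp or at the end of the line.
    chunks = re.findall(
        r"copypaste: (.+?)(?=\s*\[\d{2}/\d{2} |\s*$)", log_text, flags=re.M
    )
    # Entries such as "Task: bbox" contain no commas and are skipped.
    rows = [chunk.split(",") for chunk in chunks if "," in chunk]
    if len(rows) < 2:
        raise ValueError("expected a header row and a value row")
    header, values = rows[0], rows[1]
    return dict(zip(header, (float(v) for v in values)))


log = """\
[12/25 23:43:20] d2.evaluation.testing INFO: copypaste: Task: bbox
[12/25 23:43:20] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,bAP,bAP50,bAP75,nAP,nAP50,nAP75
[12/25 23:43:20] d2.evaluation.testing INFO: copypaste: 0.0009,0.0017,0.0009,0.0009,0.0017,0.0009,0.0000,0.0000,0.0000
"""

metrics = parse_copypaste(log)
print(metrics["AP50"], metrics["nAP"])
```

The lookahead on the timestamp means the helper should also cope with a log whose three entries were pasted onto a single line.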

[12/25 23:43:20] d2.evaluation.testing INFO: copypaste: Task: bbox
[12/25 23:43:20] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,bAP,bAP50,bAP75,nAP,nAP50,nAP75
[12/25 23:43:20] d2.evaluation.testing INFO: copypaste: 0.0009,0.0017,0.0009,0.0009,0.0017,0.0009,0.0000,0.0000,0.0000

I don't know why these categories aren't being detected. The detailed settings are as follows:

[12/25 16:59:43] detectron2 INFO: Contents of args.config_file=configs/fsod/two_branch_10shot_finetuning_pascalvoc_split1_pvt_v2_b2_li.yaml:
_BASE_: "Base-FSOD-C4.yaml"
MODEL:
  PIXEL_MEAN: [103.530, 116.280, 123.675]
  PIXEL_STD: [57.375, 57.120, 58.395]
  WEIGHTS: "/data/master21/lipl/FCT-main/FCT_model_final_voc_split1.pth"
  MASK_ON: False
  RESNETS:
    DEPTH: 101
  BACKBONE:
    FREEZE_AT: 4
    NAME: "build_FCT_backbone"
    TYPE: "pvt_v2_b2_li"
    TRAIN_BRANCH_EMBED: False
  ROI_HEADS:
    SCORE_THRESH_TEST: 0.0
  RPN:
    PRE_NMS_TOPK_TEST: 12000
    POST_NMS_TOPK_TEST: 100
DATASETS:
  TRAIN: ("voc_2027_trainval_all1_10shot",)
  TEST: ("voc_2027_test_all1",)
  TEST_KEEPCLASSES: 'all1'
SOLVER:
  IMS_PER_BATCH: 8
  BASE_LR: 0.00002
  STEPS: (4500, 5000)
  MAX_ITER: 6000
  WARMUP_ITERS: 200
  CHECKPOINT_PERIOD: 5000
INPUT:
  FS:
    FEW_SHOT: True
    SUPPORT_WAY: 5
    SUPPORT_SHOT: 10
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 600
  MAX_SIZE_TEST: 1000
OUTPUT_DIR: './output/fsod/finetune_dir/two_branch_10shot_finetuning_pascalvoc_split1_pvt_v2_b2_li'
TEST:
  EVAL_PERIOD: 4500

[12/25 16:59:43] detectron2 INFO: Running with full config:
CUDNN_BENCHMARK: False
DATALOADER:
  ASPECT_RATIO_GROUPING: True
  FILTER_EMPTY_ANNOTATIONS: True
  NUM_WORKERS: 8
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: ()
  PROPOSAL_FILES_TRAIN: ()
  SEEDS: 0
  TEST: ('voc_2027_test_all1',)
  TEST_KEEPCLASSES: all1
  TEST_SHOTS: (1, 2, 3, 5, 10, 30)
  TRAIN: ('voc_2027_trainval_all1_10shot',)
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    ENABLED: False
    SIZE: [0.9, 0.9]
    TYPE: relative_range
  FORMAT: BGR
  FS:
    FEW_SHOT: True
    SUPPORT_EXCLUDE_QUERY: False
    SUPPORT_SHOT: 10
    SUPPORT_WAY: 5
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1000
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 600
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES: [[-90, 0, 90]]
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]]
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES: [[32, 64, 128, 256, 512]]
  BACKBONE:
    FREEZE_AT: 4
    NAME: build_FCT_backbone
    ONLY_TRAIN_NORM: False
    TRAIN_BRANCH_EMBED: False
    TYPE: pvt_v2_b2_li
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES: []
    NORM:
    OUT_CHANNELS: 256
  KEYPOINT_ON: False
  LOAD_PROPOSALS: False
  MASK_ON: False
  META_ARCHITECTURE: FsodRCNN
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: True
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN: [103.53, 116.28, 123.675]
  PIXEL_STD: [57.375, 57.12, 58.395]
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: FsodRPN
  RESNETS:
    DEFORM_MODULATED: False
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE: [False, False, False, False]
    DEPTH: 101
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES: ['res4']
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: True
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES: ['p3', 'p4', 'p5', 'p6', 'p7']
    IOU_LABELS: [0, -1, 1]
    IOU_THRESHOLDS: [0.4, 0.5]
    NMS_THRESH_TEST: 0.5
    NORM:
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS: ((10.0, 10.0, 5.0, 5.0), (20.0, 20.0, 10.0, 10.0), (30.0, 30.0, 15.0, 15.0))
    IOUS: (0.5, 0.6, 0.7)
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0)
    CLS_AGNOSTIC_BBOX_REG: False
    CONV_DIM: 256
    FC_DIM: 1024
    NAME:
    NORM:
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: False
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 128
    FREEZE_ROI_FEATURE_EXTRACTOR: False
    IN_FEATURES: ['res4']
    IOU_LABELS: [0, 1]
    IOU_THRESHOLDS: [0.5]
    NAME: FsodRes5ROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 1
    ONLY_TRAIN_NORM: False
    POSITIVE_FRACTION: 0.5
    PROPOSAL_APPEND_GT: True
    SCORE_THRESH_TEST: 0.0
  ROI_KEYPOINT_HEAD:
    CONV_DIMS: (512, 512, 512, 512, 512, 512, 512, 512)
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: True
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: False
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM:
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
    BOUNDARY_THRESH: -1
    FREEZE_RPN: False
    HEAD_NAME: StandardRPNHead
    IN_FEATURES: ['res4']
    IOU_LABELS: [0, -1, 1]
    IOU_THRESHOLDS: [0.3, 0.7]
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 100
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 12000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES: ['p2', 'p3', 'p4', 'p5']
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  WEIGHTS: /data/master21/lipl/FCT-main/FCT_model_final_voc_split1.pth
OUTPUT_DIR: ./output/fsod/finetune_dir/two_branch_10shot_finetuning_pascalvoc_split1_pvt_v2_b2_li
SEED: -1
SOLVER:
  AMP:
    ENABLED: False
  BASE_LR: 2e-05
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 5000
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: False
    NORM_TYPE: 2.0
  GAMMA: 0.1
  HEAD_LR_FACTOR: 2.0
  IMS_PER_BATCH: 4
  LR_SCHEDULER_NAME: WarmupMultiStepLR
  MAX_ITER: 6000
  MOMENTUM: 0.9
  NESTEROV: False
  REFERENCE_WORLD_SIZE: 0
  SOLVER_TYPE: adamw
  STEPS: (4500, 5000)
  WARMUP_FACTOR: 0.1
  WARMUP_ITERS: 200
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: 0.0001
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: False
    FLIP: True
    MAX_SIZE: 4000
    MIN_SIZES: (400, 500, 600, 700, 800, 900, 1000, 1100, 1200)
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 4500
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: False
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
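
With both dumps available, it may be worth checking whether the values requested in the YAML file are the ones that actually took effect. For example, the YAML file sets `SOLVER.IMS_PER_BATCH: 8` while the full config reports `4`. Below is a minimal sketch of such a comparison. `flatten` and `diff_configs` are hypothetical helpers, and the two dicts hold only a handful of values copied by hand from the logs above; nothing here is part of detectron2 or FCT, and a mismatch found this way is only a pointer for further checking, not a diagnosis.

```python
def flatten(cfg, prefix=""):
    """Turn a nested dict into {'A.B.C': value} pairs."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(requested, effective):
    """Return {key: (requested, effective)} for every requested key that differs."""
    req, eff = flatten(requested), flatten(effective)
    return {k: (v, eff.get(k)) for k, v in req.items() if eff.get(k) != v}


# A few values copied from the YAML file contents logged above.
requested = {
    "SOLVER": {"IMS_PER_BATCH": 8, "BASE_LR": 0.00002, "MAX_ITER": 6000},
    "DATASETS": {"TEST_KEEPCLASSES": "all1"},
}
# The same keys as they appear in the "Running with full config" dump.
effective = {
    "SOLVER": {"IMS_PER_BATCH": 4, "BASE_LR": 2e-05, "MAX_ITER": 6000},
    "DATASETS": {"TEST_KEEPCLASSES": "all1"},
}

print(diff_configs(requested, effective))
# {'SOLVER.IMS_PER_BATCH': (8, 4)}
```

Only keys present in the requested config are compared, so the many defaults that show up only in the full dump do not appear as differences.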