GuangxingHan / QA-FewDet

Code for ICCV 2021 paper: 'Query Adaptive Few-Shot Object Detection with Heterogeneous Graph Convolutional Networks'

Why are my PASCAL VOC results for the FewX baseline so much higher than those reported in the author's paper? #9

Open yy049 opened 11 months ago

yy049 commented 11 months ago

Hello! I would like to ask: when I ran the FewX code (the baseline you used) on 4 graphics cards with 24 GB of memory each, I found that for 10-shot fine-tuning on PASCAL VOC split 1 the nAP50 was 63.7%, which differs significantly from the 58.6% given in your paper. What is the reason for this?

Here is the result of my run:

```
[11/26 12:53:47] d2.evaluation.evaluator INFO: Inference done 1224/1238. 0.9081 s / img. ETA=0:00:12
[11/26 12:53:52] d2.evaluation.evaluator INFO: Inference done 1232/1238. 0.9063 s / img. ETA=0:00:05
[11/26 12:53:56] d2.evaluation.evaluator INFO: Total inference time: 0:18:43.905460 (0.911521 s / img per device, on 4 devices)
[11/26 12:53:56] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:18:35 (0.904586 s / img per device, on 4 devices)
[11/26 12:54:51] d2.evaluation.testing INFO: copypaste: Task: bbox
[11/26 12:54:51] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,bAP,bAP50,bAP75,nAP,nAP50,nAP75
[11/26 12:54:51] d2.evaluation.testing INFO: copypaste: 39.1364,69.6365,40.0869,40.2169,71.6110,41.1615,35.8949,63.7130,36.8634
[11/26 12:54:51] d2.utils.events INFO: iter: 0 total_loss: 0.108 loss_cls: 0.026 loss_box_reg: 0.040 loss_rpn_cls: 0.034 loss_rpn_loc: 0.008 data_time: 21.6117 lr: 0.000001 max_mem: 6831M
[11/26 12:54:51] d2.engine.hooks INFO: Total training time: 0:20:06 (0:20:06 on hooks)
```
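(For readers of the log: the first `copypaste:` line after `Task: bbox` gives the metric names and the next gives the values, so the eighth field, 63.7130, is the nAP50 under discussion. A quick way to pull those lines back out of a saved log, assuming the `tee` path used in the fine-tuning command further down:)

```bash
# Recover the evaluation summary from the saved training log.
# The log path is assumed from the `tee` in the fine-tuning command below.
# The last two copypaste lines are the metric names and their values;
# nAP50 is the eighth comma-separated field.
grep "copypaste:" log/10shot_finetune_pascalvoc_split1_resnet101.txt | tail -n 2
```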

The configuration file for meta training is the same as the one you provided.

yy049 commented 11 months ago

I only ran meta-training with `meta_training_pascalvoc_split1_resnet101_stage_1.yaml`, and then used that weight file to fine-tune for 10 shots, which resulted in 63.7% nAP50.

`meta_training_pascalvoc_split1_resnet101_stage_1.yaml` is the same as the one you provided.
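(A sketch of what that stage-1 run looks like: the launcher script, flags, and `configs/fsod/` path are taken from the fine-tuning command below, while the exact config filename is assumed from the WEIGHTS path in the fine-tuning config, not verified against the repo's README:)

```bash
# Sketch of the stage-1 meta-training launch (assumed, not copied from the repo):
# same launcher and flags as the fine-tuning command below; the config name
# mirrors the output directory that the fine-tuning config's WEIGHTS points to.
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 fsod_train_net_fewx.py --num-gpus 4 --dist-url auto \
    --config-file configs/fsod/meta_training_pascalvoc_split1_resnet101_stage_1.yaml \
    2>&1 | tee log/meta_training_pascalvoc_split1_resnet101_stage_1.txt
```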

The running instructions are as follows:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 fsod_train_net_fewx.py --num-gpus 4 --dist-url auto \
    --config-file configs/fsod/10shot_finetune_pascalvoc_split1_resnet101.yaml --resume \
    SOLVER.IMS_PER_BATCH 8 2>&1 | tee log/10shot_finetune_pascalvoc_split1_resnet101.txt
```

The configuration file for 10-shot fine-tuning is as follows:

```yaml
_BASE_: "Base-FSOD-C4.yaml"
MODEL:
  WEIGHTS: "./output/fsod/meta_training_pascalvoc_split1_resnet101_stage_1/model_final.pth"
  MASK_ON: False
  RESNETS:
    DEPTH: 101
  BACKBONE:
    FREEZE_AT: 5
  ROI_HEADS:
    SCORE_THRESH_TEST: 0.0
  RPN:
    PRE_NMS_TOPK_TEST: 12000
    POST_NMS_TOPK_TEST: 100
DATASETS:
  TRAIN: ("voc_2007_trainval_all1_10shot",)
  TEST: ("voc_2007_test_all1",)
  TEST_KEEPCLASSES: 'all1'
SOLVER:
  IMS_PER_BATCH: 8
  BASE_LR: 0.001
  STEPS: (2000, 3000)
  MAX_ITER: 3000
  WARMUP_ITERS: 200
  CHECKPOINT_PERIOD: 3000
INPUT:
  FS:
    FEW_SHOT: True
    SUPPORT_WAY: 5
    SUPPORT_SHOT: 10
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 600
  MAX_SIZE_TEST: 1000
OUTPUT_DIR: './output/fsod/finetune_dir/10shot_finetune_pascalvoc_split1_resnet101_fewx'
TEST:
  EVAL_PERIOD: 3000
```

GuangxingHan commented 11 months ago

I think you did get the right result. Note that the implementation details/hyper-parameters in this codebase are better designed/tuned and differ from the original FewX repo, which is where the number reported in the table comes from.

Also, with our final GCN model, our repo produces higher results than the paper reported at the time of submission.