fanq15 / FewX

FewX is an open-source toolbox on top of Detectron2 for data-limited instance-level recognition tasks.
https://github.com/fanq15/FewX
MIT License
346 stars, 48 forks

Why does the config in the fine-tuning stage set K-shot to only 9? #47

Open ztyxd opened 3 years ago

ztyxd commented 3 years ago

Hello, why does the config in the fine-tuning stage set the K-shot to only 9?

The config is as below:

    _BASE_: "Base-FSOD-C4.yaml"
    MODEL:
      WEIGHTS: "./output/fsod/R_50_C4_1x/model_final.pth"
      MASK_ON: False
      RESNETS:
        DEPTH: 50
      BACKBONE:
        FREEZE_AT: 5
    DATASETS:
      TRAIN: ("coco_2017_train_voc_10_shot",)
      TEST: ("coco_2017_val",)
    SOLVER:
      IMS_PER_BATCH: 4
      BASE_LR: 0.001
      STEPS: (2000, 3000)
      MAX_ITER: 3000
      WARMUP_ITERS: 200
    INPUT:
      FS:
        FEW_SHOT: True
        SUPPORT_WAY: 2
        SUPPORT_SHOT: 9
      MIN_SIZE_TRAIN: (440, 472, 504, 536, 568, 600)
      MAX_SIZE_TRAIN: 1000
      MIN_SIZE_TEST: 600
      MAX_SIZE_TEST: 1000
    OUTPUT_DIR: './output/fsod/finetune_dir/R_50_C4_1x'

180041123-Atiq commented 7 months ago

I'm not sure I understand you correctly, but I also wondered why the shot number is 9, since they claim to fine-tune their model with 10 shots. The reason is that cfg.INPUT.FS.SUPPORT_SHOT sets the number of support features, which are probably used for the attention feature map (the paper describes the terminology clearly). The k-shot, on the other hand, depends on how many samples/instances are registered through the fewx.data.builtin module. Hence k-shot is not the same thing as cfg.INPUT.FS.SUPPORT_SHOT.

cfg.INPUT.FS.SUPPORT_SHOT is 9 because, when one image is used as the query, only 9 other images remain to include as support (assuming k-shot is 10). This should become clearer if you go through the DatasetMapperWithSupport.generate_support() method.
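To make the counting argument concrete, here is a minimal sketch of leave-one-out support sampling. It is not the code from DatasetMapperWithSupport.generate_support(); the helper `sample_support` and the integer image ids are hypothetical, and it assumes a single class with exactly 10 registered shots.

```python
import random


def sample_support(shot_ids, query_id, support_shot, rng=None):
    """Hypothetical helper: pick support images for one class, excluding the query."""
    rng = rng or random.Random(0)
    # The query image cannot also serve as its own support example.
    candidates = [i for i in shot_ids if i != query_id]
    if support_shot > len(candidates):
        raise ValueError(
            f"requested {support_shot} supports, but only "
            f"{len(candidates)} images remain once the query is excluded"
        )
    return rng.sample(candidates, support_shot)


k_shot = 10                      # images registered for the class (assumed)
shot_ids = list(range(k_shot))   # stand-in image ids 0..9
query_id = 3                     # the image currently used as the query

support = sample_support(shot_ids, query_id, support_shot=k_shot - 1)
print(len(support), query_id in support)
```

Under these assumptions, asking for 10 supports would fail because only 9 candidates remain, which matches the explanation for why the fine-tuning config uses SUPPORT_SHOT: 9 with a 10-shot dataset.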