ucbdrive / few-shot-object-detection

Implementations of few-shot object detection benchmarks
Apache License 2.0

Wrong Setting on LVIS #177

Open gaobb opened 2 years ago


Hi,

Thanks for your interesting work.

The training procedure of TFA on the MS-COCO and LVIS datasets generally consists of three steps:

  1. train base model with base class images.
  2. fine-tune novel model with few-shot novel class images.
  3. combine the base-class weights from the base model with the novel-class weights from the novel model, then fine-tune on few-shot labeled images covering both base and novel classes.
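Step 3's weight combination can be sketched roughly as stacking the two classifier heads. This is only an illustrative sketch, not the repo's actual checkpoint-surgery code; the function name, shapes, and class counts (LVIS v0.5: 454 rare + 776 common/frequent = 1230 categories) are assumptions for the example:

```python
import numpy as np

def combine_cls_weights(base_w, novel_w):
    """Stack base-class classifier rows on top of novel-class rows.

    base_w:  (num_base_classes, feat_dim) from the base model
    novel_w: (num_novel_classes, feat_dim) from the novel model
    """
    assert base_w.shape[1] == novel_w.shape[1], "feature dims must match"
    return np.concatenate([base_w, novel_w], axis=0)

# Illustrative shapes: LVIS v0.5 base (common + frequent) vs. novel (rare).
base_w = np.zeros((776, 1024))
novel_w = np.ones((454, 1024))
combined = combine_cls_weights(base_w, novel_w)
print(combined.shape)  # (1230, 1024)
```

The combined head is then fine-tuned jointly on the few-shot data in step 3.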

Therefore, on LVIS the novel-model training should use the few-shot novel (rare-class) annotations (the relevant subset of lvis_v0.5_train_shots.json). However, the authors may have mistakenly used all novel annotations (lvis_v0.5_train_rare.json) in the novel training stage (step 2).

Please refer to https://github.com/ucbdrive/few-shot-object-detection/blob/148a039af7abce9eff59d5cdece296ad1d2b8aa0/configs/LVIS-detection/faster_rcnn_R_101_FPN_fc_novel.yaml#L18

https://github.com/ucbdrive/few-shot-object-detection/blob/148a039af7abce9eff59d5cdece296ad1d2b8aa0/fsdet/data/builtin.py#L169-L171
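Concretely, the fix I am suggesting would change the training split in the novel fine-tuning config from the full rare-class annotations to the sampled few-shot annotations. The fragment below is illustrative only; the registered dataset names in `fsdet/data/builtin.py` may differ from the raw json filenames:

```yaml
# faster_rcnn_R_101_FPN_fc_novel.yaml (sketch, names assumed)
DATASETS:
  # current (all rare-class annotations):
  # TRAIN: ("lvis_v0.5_train_rare",)
  # proposed (sampled few-shot annotations):
  TRAIN: ("lvis_v0.5_train_shots",)
```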

With the above (seemingly incorrect) setting, I can approximately reproduce the results reported in the TFA paper on the LVIS dataset:

| Method | Backbone | AP | AP50 | AP75 | APs | APm | APl | APr | APc | APf |
|---|---|---|---|---|---|---|---|---|---|---|
| TFA w/ fc (paper) | R-101 | 25.4 | 41.8 | 27.0 | 19.8 | 31.1 | 39.2 | 15.5 | 26.0 | 28.6 |
| TFA w/ fc (reproduction) | R-101 | 25.2 | 41.6 | 26.5 | 19.6 | 31.1 | 39.8 | 15.6 | 25.5 | 28.6 |

If we modify the config file of the novel fine-tuning step and replace all novel annotations (lvis_v0.5_train_rare.json) with the few-shot novel annotations, the results are as follows:

| Method | Backbone | AP | AP50 | AP75 | APs | APm | APl | APr | APc | APf |
|---|---|---|---|---|---|---|---|---|---|---|
| TFA w/ fc (reproduction) | R-101 | 24.9 | 41.0 | 26.0 | 19.7 | 30.7 | 39.6 | 12.7 | 25.7 | 28.7 |

We can see that these results are worse than those obtained with all novel annotations, especially on the rare classes (12.7 vs. 15.6 APr).

I would really appreciate it if the authors could clarify the above points. Thanks.