ucbdrive / few-shot-object-detection

Implementations of few-shot object detection benchmarks
Apache License 2.0

Wrong Setting on LVIS #177

Open gaobb opened 2 years ago


Hi,

Thanks for your interesting work.

The training procedure of TFA on the MS-COCO and LVIS datasets generally consists of three steps:

  1. train base model with base class images.
  2. fine-tune novel model with few-shot novel class images.
  3. combine the base-class weights from the base model with the novel-class weights from the novel model, then fine-tune on few-shot labeled images covering both base and novel classes.
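Step 3's weight combination can be sketched roughly as stacking the two classifier heads. This is only an illustrative sketch, not the repo's actual checkpoint-surgery code; the function name, shapes, and class counts (LVIS v0.5: 454 rare + 776 common/frequent = 1230 categories) are assumptions for the example:

```python
import numpy as np

def combine_cls_weights(base_w, novel_w):
    """Stack base-class classifier rows on top of novel-class rows.

    base_w:  (num_base_classes, feat_dim) from the base model
    novel_w: (num_novel_classes, feat_dim) from the novel model
    """
    assert base_w.shape[1] == novel_w.shape[1], "feature dims must match"
    return np.concatenate([base_w, novel_w], axis=0)

# Illustrative shapes: LVIS v0.5 base (common + frequent) vs. novel (rare).
base_w = np.zeros((776, 1024))
novel_w = np.ones((454, 1024))
combined = combine_cls_weights(base_w, novel_w)
print(combined.shape)  # (1230, 1024)
```

The combined head is then fine-tuned jointly on the few-shot data in step 3.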

Therefore, on LVIS the novel-model training should use the few-shot novel (rare-class) annotations (the relevant subset of lvis_v0.5_train_shots.json). However, the authors may have mistakenly used all novel annotations (lvis_v0.5_train_rare.json) in the novel training stage (step 2).

Please refer to https://github.com/ucbdrive/few-shot-object-detection/blob/148a039af7abce9eff59d5cdece296ad1d2b8aa0/configs/LVIS-detection/faster_rcnn_R_101_FPN_fc_novel.yaml#L18

https://github.com/ucbdrive/few-shot-object-detection/blob/148a039af7abce9eff59d5cdece296ad1d2b8aa0/fsdet/data/builtin.py#L169-L171
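Concretely, the fix I am suggesting would change the training split in the novel fine-tuning config from the full rare-class annotations to the sampled few-shot annotations. The fragment below is illustrative only; the registered dataset names in `fsdet/data/builtin.py` may differ from the raw json filenames:

```yaml
# faster_rcnn_R_101_FPN_fc_novel.yaml (sketch, names assumed)
DATASETS:
  # current (all rare-class annotations):
  # TRAIN: ("lvis_v0.5_train_rare",)
  # proposed (sampled few-shot annotations):
  TRAIN: ("lvis_v0.5_train_shots",)
```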

With the above (seemingly incorrect) setting, I can approximately reproduce the results reported in the TFA paper on the LVIS dataset:

| Method | Backbone | AP | AP50 | AP75 | APs | APm | APl | APr | APc | APf |
|---|---|---|---|---|---|---|---|---|---|---|
| TFA w/ fc (paper) | R-101 | 25.4 | 41.8 | 27.0 | 19.8 | 31.1 | 39.2 | 15.5 | 26.0 | 28.6 |
| TFA w/ fc (reproduction) | R-101 | 25.2 | 41.6 | 26.5 | 19.6 | 31.1 | 39.8 | 15.6 | 25.5 | 28.6 |

If we modify the config file of the novel fine-tuning step and replace all novel annotations (lvis_v0.5_train_rare.json) with the few-shot novel annotations, the results are as follows:

| Method | Backbone | AP | AP50 | AP75 | APs | APm | APl | APr | APc | APf |
|---|---|---|---|---|---|---|---|---|---|---|
| TFA w/ fc (reproduction) | R-101 | 24.9 | 41.0 | 26.0 | 19.7 | 30.7 | 39.6 | 12.7 | 25.7 | 28.7 |

We can see that these results are worse than those obtained with all novel annotations, especially on the rare classes (12.7 vs. 15.6 APr).

I would really appreciate it if the authors could clarify the above points. Thanks.