ucbdrive / few-shot-object-detection

Implementations of few-shot object detection benchmarks
Apache License 2.0

Why using things_classes instead of novel_classes in few-shot training? #90

Closed davesie closed 3 years ago

davesie commented 3 years ago

Hello,

Why are you using thing_classes (which covers all classes) instead of just the few-shot classes? https://github.com/ucbdrive/few-shot-object-detection/blob/d460bc2aeffb88baeb23993f0b071a0e52aaa1e2/fsdet/data/meta_coco.py#L47-L50

I thought only the novel classes were used for few-shot training. Otherwise the model would already have seen the classes that are used in few-shot training, right?

scott-vsi commented 3 years ago

It is odd how this works, perhaps unintentionally...

metadata is a dictionary created in _get_coco_fewshot_instances_meta, which is called from register_all_coco. It contains three keys: thing_classes, base_classes, and novel_classes (each with an accompanying [thing|base|novel]_dataset_id_to_contiguous_id key), where

metadata["thing_classes"] = metadata["base_classes"] + metadata["novel_classes"]

In fsdet/data/meta_coco.register_meta_coco, DatasetCatalog.register is called to register, as you note, all thing_classes annotations. However, it turns out load_coco_json is actually called lazily when DatasetCatalog.get is called.
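
The effect of lazy loading can be illustrated with a small stand-in. This is only a sketch: ToyCatalog and the class names are hypothetical and are not part of fsdet or detectron2. It assumes, as described above, that the catalog stores a callable and only runs it on get, so any change made to metadata after registration is still visible to the loader.

```python
from typing import Callable, Dict, List


class ToyCatalog:
    """Hypothetical stand-in for a dataset catalog: it stores loaders, not data."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], List[dict]]] = {}

    def register(self, name: str, loader: Callable[[], List[dict]]) -> None:
        # Only the callable is stored here; nothing is loaded yet.
        self._loaders[name] = loader

    def get(self, name: str) -> List[dict]:
        # The loader runs now, so it sees the state that exists at call time.
        return self._loaders[name]()


# Toy metadata (made-up class list, not the real COCO split).
metadata = {"thing_classes": ["truck", "airplane"]}

catalog = ToyCatalog()
catalog.register(
    "toy_base",
    lambda: [{"classes": list(metadata["thing_classes"])}],
)

# Narrowing thing_classes after registration still changes what get() returns.
metadata["thing_classes"] = ["truck"]
loaded = catalog.get("toy_base")
```

Because the lambda closes over the metadata dictionary rather than a copy of it, the order "register first, narrow thing_classes second" still produces a loader that only sees the narrowed list.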

Consequently, if the (dataset) name (the DATASETS.TRAIN key of a yaml config) contains _base, register_meta_coco is able to first set metadata["thing_classes"] to be metadata["base_classes"] instead (similarly for thing_dataset_id_to_contiguous_id).

https://github.com/ucbdrive/few-shot-object-detection/blob/6b0769b5d682fbf7fdcdaed0c1d0dfd51c373468/fsdet/data/meta_coco.py#L130-L135
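
A rough sketch of this name-based selection follows. The function select_thing_classes and the toy metadata are hypothetical and written only for illustration; the linked lines are the authoritative version. The sketch assumes the only rule is a substring check for _base or _novel in the dataset name.

```python
def select_thing_classes(name: str, metadata: dict) -> dict:
    """Return a copy of metadata whose thing_* keys are narrowed by dataset name."""
    selected = dict(metadata)
    for split in ("base", "novel"):
        if f"_{split}" in name:
            selected["thing_classes"] = metadata[f"{split}_classes"]
            selected["thing_dataset_id_to_contiguous_id"] = metadata[
                f"{split}_dataset_id_to_contiguous_id"
            ]
            break
    # Names containing neither _base nor _novel (e.g. *_all_*) keep every class.
    return selected


# Toy metadata with made-up ids; the real dictionaries hold 60 base and 20 novel classes.
toy = {
    "base_classes": ["truck", "zebra"],
    "novel_classes": ["airplane"],
    "base_dataset_id_to_contiguous_id": {8: 0, 24: 1},
    "novel_dataset_id_to_contiguous_id": {5: 0},
    "thing_dataset_id_to_contiguous_id": {5: 0, 8: 1, 24: 2},
}
toy["thing_classes"] = toy["base_classes"] + toy["novel_classes"]

base_meta = select_thing_classes("coco_trainval_base", toy)
novel_meta = select_thing_classes("coco_trainval_novel_1shot", toy)
all_meta = select_thing_classes("coco_trainval_all_1shot", toy)
```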

For example, if DATASETS.TRAIN is coco_trainval_base (as in configs/COCO-detection/faster_rcnn_R_101_FPN_base.yaml; used in Stage 1: Base Training), then metadata["thing_classes"] is set to the base classes. When load_coco_json runs, it loads cocosplit/datasplit/trainvalno5k.json, which contains annotations for all of the thing classes. The annotations are then filtered down to thing_classes (here, the base classes) at:

https://github.com/ucbdrive/few-shot-object-detection/blob/6b0769b5d682fbf7fdcdaed0c1d0dfd51c373468/fsdet/data/meta_coco.py#L115-L116
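
The filtering step might look roughly like the sketch below. The helper filter_to_thing_classes and the sample annotations are hypothetical; the sketch assumes that the filter is a membership test against thing_dataset_id_to_contiguous_id, followed by remapping the category id to its contiguous index.

```python
from typing import Dict, List


def filter_to_thing_classes(
    annotations: List[dict], id_map: Dict[int, int]
) -> List[dict]:
    """Keep annotations whose category is in id_map, remapped to contiguous ids."""
    kept = []
    for ann in annotations:
        if ann["category_id"] in id_map:
            # Copy the annotation so the input list is left untouched.
            kept.append({**ann, "category_id": id_map[ann["category_id"]]})
    return kept


# Toy values: ids 8 and 24 play the base classes, id 5 plays a novel class.
base_id_map = {8: 0, 24: 1}
annotations = [
    {"category_id": 8, "bbox": [0, 0, 10, 10]},
    {"category_id": 5, "bbox": [5, 5, 20, 20]},
    {"category_id": 24, "bbox": [2, 2, 8, 8]},
]
base_only = filter_to_thing_classes(annotations, base_id_map)
```

With the base id map, the annotation for id 5 is dropped even though it is present in the file, which is why a full annotation file can be used for base training.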

Similarly, if the DATASETS.TRAIN key of a (n-shot) yaml config file contains _novel (e.g., coco_trainval_novel_1shot in configs/COCO-detection/faster_rcnn_R_101_FPN_ft_novel_1shot.yaml, used in Stage 2: Novel Weights), then metadata["thing_classes"] is set to the novel classes.

In this case, the code path you referenced is followed: load_coco_json loads one class-specific n-shot json file from the cocosplit directory for each entry in thing_classes (here, the novel classes; e.g., datasets/cocosplit/full_box_1shot_airplane_trainval.json, etc.) into the DatasetCatalog.

https://github.com/ucbdrive/few-shot-object-detection/blob/6b0769b5d682fbf7fdcdaed0c1d0dfd51c373468/fsdet/data/meta_coco.py#L47-L50
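
As a sketch of how one file per class could be enumerated: the helper fewshot_annotation_paths is hypothetical, and the file name pattern is inferred only from the single example path above, so the real code may build paths differently.

```python
from typing import List


def fewshot_annotation_paths(
    shot: int, class_names: List[str], split_dir: str = "datasets/cocosplit"
) -> List[str]:
    """Build one annotation path per class, following the assumed naming pattern."""
    return [
        f"{split_dir}/full_box_{shot}shot_{cls}_trainval.json"
        for cls in class_names
    ]


# Only the classes currently in thing_classes are visited, so narrowing
# thing_classes to the novel classes narrows the set of files as well.
novel_paths = fewshot_annotation_paths(1, ["airplane"])
```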

In the final scenario, if the DATASETS.TRAIN key of a (n-shot) yaml config file contains _all (e.g., coco_trainval_all_1shot in configs/COCO-detection/faster_rcnn_R_101_FPN_ft_all_1shot.yaml, used in Stage 2: Fine tuning), then metadata["thing_classes"] is not modified and all of the class-specific few-shot json files in the cocosplit directory are loaded into the DatasetCatalog by load_coco_json.

EDIT: This can be verified with:

# Importing builtin runs register_all_coco > register_meta_coco > DatasetCatalog.register
import fsdet.data.builtin  # noqa: F401
from fsdet.data.meta_coco import DatasetCatalog

def category_ids(name):
    # DatasetCatalog.get triggers the lazy load_coco_json call for this dataset
    return {ann['category_id']
            for record in DatasetCatalog.get(name)
            for ann in record['annotations']}

assert len(category_ids('coco_trainval_base')) == 60
assert len(category_ids('coco_trainval_novel_1shot')) == 20
assert len(category_ids('coco_trainval_all_1shot')) == 80

thomasehuang commented 3 years ago

You are exactly right, thank you for the detailed explanation!