mandal4 commented 3 years ago

Thanks for releasing the source code. I have two questions about the data settings.

How did you choose the K-shot samples in the dataset? Are the novel K-shot samples identical to the ones used by TFA (ICML 2020)? LINK: http://dl.yf.io/fs-det/datasets/
Are there any experimental results on Pascal VOC (in-domain few-shot)?

Thanks,

michaelku1 commented 2 years ago

I have gone through the code far enough that I think I can answer your questions. According to their paper, it seems that their model (DAnA) and all the other re-implemented models are supposed to be trained with the "two-way contrastive learning strategy" introduced by the FSOD paper (Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector by Qi Fan, et al.).

It seems to me that, because of how this learning strategy is used, only one class is randomly sampled, even when gt_boxes contains several different classes. The code first loads all the classes stored in gt_boxes, randomly samples one, and then uses that class to index the support set. The number of shots, k, then determines how many samples are taken from that specific support category. All in all, the support directory is where the support set comes from. Check how support examples are added to support_pool in the code and you should find your answer there.
I don't think there are any experimental results on Pascal VOC. I am also quite surprised that they didn't run the experiments on the three base/novel splits of Pascal VOC, which I think are quite common in the FSOD literature.
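
Going back to the first question, here is a rough sketch of the sampling step as I understand it. This is not the repository's actual code: the function name `sample_support_set`, the `[x1, y1, x2, y2, class_id]` row layout for gt_boxes, and the dict-shaped `support_pool` are all assumptions made for illustration.

```python
import random


def sample_support_set(gt_boxes, support_pool, k_shot, rng=random):
    """Pick one class from the query annotations and draw k supports for it.

    gt_boxes: list of [x1, y1, x2, y2, class_id] rows (assumed layout).
    support_pool: dict mapping class_id -> list of support examples
                  (hypothetical stand-in for the support directory).
    """
    # Collect the distinct classes annotated in the query image.
    classes = sorted({int(box[4]) for box in gt_boxes})
    # Keep a single randomly chosen class, even if several are present.
    target_cls = rng.choice(classes)
    # Use that class to index the support set.
    candidates = support_pool[target_cls]
    # k_shot sets how many support examples are taken from that class
    # (capped at the number available, which is an assumption of this sketch).
    k = min(k_shot, len(candidates))
    supports = rng.sample(candidates, k)
    return target_cls, supports
```

Calling it with, say, `k_shot=2` and a pool holding a few examples per class returns the chosen class id and two support examples from that class only.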

Tung-I commented 2 years ago

michaelku1's insights into the training process are completely correct. I am amazed that you went through the code so thoroughly. As for Pascal VOC, I do use it to train my model and then test on COCO (the pascal2COCO experiment), but I did not use it as test data. It is rather common to test models on Pascal VOC, but I think the dataset is too easy to verify the efficacy of FSOD (there is often only one major object in an image).

Tung-I / Dual-awareness-Attention-for-Few-shot-Object-Detection

Some questions about data settings #9