yanxp / MetaR-CNN

Meta R-CNN : Towards General Solver for Instance-level Low-shot Learning
https://yanxp.github.io/metarcnn.html

Report data leakage that causes unfair few-shot setting #36

Open Ze-Yang opened 4 years ago

Ze-Yang commented 4 years ago

I think there is a data leakage problem in your code that seriously undermines fair comparison with other methods.

Take COCO 10-shot as an example: you first construct a meta-data set comprising 30 (3x shots) (prn_image, prn_mask) pairs for each class. The image indices of these sampled images are saved in a file named annotations/instances_shot2014.json.

After that, the roidb needs to be constructed to provide query samples for fine-tuning. By the definition of the few-shot setting, you can only access the N-shot data (N instances per class), whether or not you fine-tune. Therefore, the roidb should contain the same instances as the meta-data set; otherwise it exceeds the designated number of shots. However, as far as I can see in your code, you do not save the anno_index of the instances selected for the meta-data set. Instead, you randomly sample the shot instances again from the image list given in annotations/instances_shot2014.json. In that case, I am concerned about how you guarantee that the newly sampled instances are exactly the same as the ones in the meta-data set.
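To make the concern concrete, here is a minimal sketch of one way the two sets could be kept identical: record the annotation ids chosen for the meta-data set and filter the roidb annotations against that saved list, instead of sampling a second time. This is not the repository's code. The function names `sample_support_ann_ids` and `filter_roidb_annotations` are hypothetical, the toy annotations are made up, and the only assumption about the data is the COCO-style `id`, `image_id`, and `category_id` fields.

```python
import json
import random
from collections import defaultdict


def sample_support_ann_ids(annotations, shots, seed=0):
    """Pick up to `shots` annotation ids per category (hypothetical helper)."""
    rng = random.Random(seed)  # local RNG so the draw is reproducible
    by_cat = defaultdict(list)
    for ann in annotations:
        by_cat[ann["category_id"]].append(ann["id"])
    selected = {}
    for cat_id in sorted(by_cat):
        ann_ids = sorted(by_cat[cat_id])
        k = min(shots, len(ann_ids))
        selected[cat_id] = sorted(rng.sample(ann_ids, k))
    return selected


def filter_roidb_annotations(annotations, selected):
    """Keep only annotations whose ids were saved for the meta-data set."""
    allowed = {ann_id for ids in selected.values() for ann_id in ids}
    return [ann for ann in annotations if ann["id"] in allowed]


# Toy COCO-style annotations: 30 instances spread over 3 categories.
annotations = [
    {"id": i, "image_id": i // 2, "category_id": i % 3} for i in range(30)
]

# Sample once, then persist the instance ids (here via a JSON string;
# a real pipeline would write them to a file next to the image list).
selected = sample_support_ann_ids(annotations, shots=3, seed=0)
saved = json.dumps(selected)

# Later, when building the roidb, reload the ids instead of re-sampling.
restored = {int(cat): ids for cat, ids in json.loads(saved).items()}
roidb_annotations = filter_roidb_annotations(annotations, restored)
```

With the ids stored alongside the image list, the roidb cannot contain instances outside the sampled shots, even when an image holds several objects of the same class.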

I hope you can look into this issue and address my concern. Thanks a lot.