Hi, thanks for your interest in our work :)
dic_coco.npy: As we wrote in our paper, dic_coco.npy is the pre-calculated confounder dictionary, which contains the averaged RoI feature of each category in MSCOCO (80 classes). We use a pretrained Faster R-CNN model to generate it, so the code is just the Faster R-CNN that can be found in maskrcnn-benchmark. Or you can use whichever codebase you are familiar with (e.g. mmdetection). After extracting the RoI features of each image in MSCOCO, we simply average the RoI feature vectors of each category to get dic_coco.npy.
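The averaging step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' script: the function name `build_confounder_dictionary`, the `(N, D)` feature layout, and the contiguous class indices in `[0, num_classes)` are all hypothetical, and the RoI features are assumed to have already been extracted by a detector of your choice.

```python
import numpy as np


def build_confounder_dictionary(roi_features, labels, num_classes=80):
    """Average RoI features per category (hypothetical helper).

    roi_features: (N, D) array, one row per RoI collected over the dataset.
    labels: (N,) integer array of contiguous class indices in [0, num_classes).
    Returns a (num_classes, D) array; rows of classes with no RoI stay zero.
    """
    roi_features = np.asarray(roi_features, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)

    # Accumulate the feature sum of every class (unbuffered, so repeated
    # labels are all added).
    sums = np.zeros((num_classes, roi_features.shape[1]), dtype=np.float64)
    np.add.at(sums, labels, roi_features)

    # Divide each class sum by its RoI count, skipping empty classes.
    counts = np.bincount(labels, minlength=num_classes)
    nonzero = counts > 0
    sums[nonzero] /= counts[nonzero, None]
    return sums
```

The result could then be written with `np.save("dic_coco.npy", dictionary)`. Whether the released file uses exactly this row order and dtype is not stated in this thread.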
stat_prob.npy: The stat_prob.npy is calculated from the appearance frequency of each object category in the MSCOCO dataset. That means it can be calculated using only the annotations of MSCOCO train2014. For convenience, I use the cocoapi.
And the key code is:
def p_z(z):
    # z is the object label (a COCO category name)
    # `coco` is assumed to be a cocoapi COCO object loaded from the train2014 annotations
    catIds_ann = coco.getCatIds(catNms=[z])
    annIds = coco.getAnnIds(catIds=catIds_ann, iscrowd=None)
    # how many annotations of z there are in the dataset
    length = len(annIds)
    # 604907 is the number of annotations in train2014
    return length / 604907
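To get the prior for every category at once, the same counting can be done directly on the annotation JSON. The sketch below is a minimal illustration, not the authors' code: the function name `category_prior` is hypothetical, it reads a COCO-format dict (as returned by `json.load` on an instances annotation file), and it orders the output by sorted category id, which is an assumption about how stat_prob.npy is laid out.

```python
from collections import Counter


def category_prior(coco_json):
    """Return per-category annotation frequencies, ordered by sorted category id.

    coco_json: a COCO-format dict with "annotations" and "categories" lists.
    """
    annotations = coco_json["annotations"]
    total = len(annotations)
    if total == 0:
        raise ValueError("annotation list is empty")

    # Count how many annotations belong to each category id.
    counts = Counter(ann["category_id"] for ann in annotations)

    # Categories with no annotations get a frequency of 0.
    cat_ids = sorted(cat["id"] for cat in coco_json["categories"])
    return [counts[cid] / total for cid in cat_ids]
```

Dividing by the total number of annotations mirrors the `length/604907` step in the snippet above. The resulting list can be converted with `np.array(...)` and saved via `np.save` if a `.npy` file is needed.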
Yes, you are right! BTW, I have actually tried extracting the pretrained features for constructing z both with the ground-truth bounding boxes and with the boxes predicted by the original pretrained Faster R-CNN. I found that the difference between the two, and in downstream task performance, is small. The probable reason is that the features are averaged over the whole of MSCOCO and that detection systems nowadays are quite reliable. But I still think using the gt bounding boxes may be a little better.
Thanks! Besides z, the paper also uses the gt boxes to train the VC R-CNN while using boxes predicted by a detection model at test time. Could that lead to a train/test shift?
Hi, actually the VC R-CNN doesn't have a testing procedure (it's a feature extraction procedure). Moreover, the VC R-CNN training procedure is used to learn an image feature embedding. When extracting features, our VC R-CNN can then be regarded as a feature extractor, and any bounding box coordinates are fine.
Could you also share the code for making dic_coco.npy and the prior stat_prob.npy? Thanks
And in order to construct dic_coco.npy with ground-truth bboxes, I should modify modeling/detector/generalized_rcnn.py in maskrcnn-benchmark as follows, right?