Open Bilibilee opened 8 months ago
It's intentional. We needed both methods to use the same set of region proposals so that we could fairly verify that self-distillation is better than noisy region-text pairs. Kindly note that the category ids were not used during CLIPSelf training even though they are in the JSON.
@wusize Hi, how do you obtain the region proposals and their corresponding category ids when the model is trained only on the base categories? I found that coco_proposals.json contains a large number of category ids.
Hi! Please refer to A.4 in the appendix of the paper. You can also have a look at the data preparation of VLDet or RegionCLIP.
@wusize Thanks for your quick reply. I have read Appendix A.4 of the paper and checked the data preparation of VLDet, but neither explains how the region proposals are generated.
I would like to use coco_proposals.json in my own project, so I need to understand how it was generated. Could you explain how to produce coco_proposals.json, or where you downloaded it from?
Many thanks again!
I got it. Thanks!
1. Train an RPN on the base categories of COCO, or take the RPN part of any off-the-shelf open-vocabulary detector trained on COCO.
2. Use the RPN to generate proposals.
3. Extract CLIP image embeddings for these proposals.
4. Parse each COCO caption into a group of nouns or phrases.
5. Extract CLIP text embeddings for these nouns/phrases.
6. Do bipartite matching between the image embeddings and the text embeddings.
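For readers following along, the final matching step can be sketched as below. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the embeddings are toy 2-D vectors standing in for CLIP features, cosine similarity is assumed as the matching score, and the assignment is found by brute force, which is only practical for a handful of phrases. For realistic sizes, `scipy.optimize.linear_sum_assignment` with `maximize=True` would be the usual choice.

```python
import math
from itertools import permutations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_phrases_to_regions(text_embs, region_embs):
    """Return {phrase_index: region_index} maximizing total similarity.

    Each phrase is assigned to a distinct region (one-to-one matching).
    Assumes there are at least as many regions as phrases.
    """
    n_text, n_region = len(text_embs), len(region_embs)
    if n_text > n_region:
        raise ValueError("need at least as many regions as phrases")

    # Similarity matrix: rows are phrases, columns are region proposals.
    sim = [[cosine(t, r) for r in region_embs] for t in text_embs]

    best_score, best_assignment = -math.inf, ()
    # Brute force: try every way of giving each phrase its own region.
    for assignment in permutations(range(n_region), n_text):
        score = sum(sim[i][j] for i, j in enumerate(assignment))
        if score > best_score:
            best_score, best_assignment = score, assignment
    return dict(enumerate(best_assignment))


# Toy example: 2 phrase embeddings, 3 region embeddings.
text_embs = [[1.0, 0.0], [0.0, 1.0]]
region_embs = [[0.1, 0.9], [0.9, 0.2], [0.7, 0.7]]
matches = match_phrases_to_regions(text_embs, region_embs)
print(matches)
```

In this toy case the first phrase is paired with the second region and the second phrase with the first region; the third region stays unmatched, which is one way the number of pseudo-labeled boxes can end up smaller than the number of proposals.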
@wusize Hi, when I checked the generated proposals in coco_pseudo_4764.json, I found many differences between its category ids and those in VLDet. For example:
The number of category ids is smaller than in VLDet, so the last step (6) does not seem to be a simple bipartite matching. Did you apply some filtering operation? I would appreciate any suggestions. Many thanks.
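One way to make this kind of comparison concrete is to count the distinct category ids in each file. The sketch below assumes both files follow the COCO annotation layout (a top-level `annotations` list whose entries carry a `category_id`); that layout and the helper names are assumptions, not something confirmed in this thread.

```python
from collections import Counter


def category_id_counts(coco_dict):
    """Count annotations per category_id in a COCO-style dict."""
    return Counter(ann["category_id"] for ann in coco_dict.get("annotations", []))


def compare_category_ids(a, b):
    """Report which category ids appear in only one of the two dicts."""
    ids_a, ids_b = set(category_id_counts(a)), set(category_id_counts(b))
    return {
        "only_in_a": sorted(ids_a - ids_b),
        "only_in_b": sorted(ids_b - ids_a),
        "shared": sorted(ids_a & ids_b),
    }


# Toy stand-ins for the parsed JSON files; real use would load them with
# json.load(open(path)) on the two annotation files being compared.
pseudo = {"annotations": [{"category_id": 1}, {"category_id": 3}]}
other = {"annotations": [{"category_id": 1}, {"category_id": 2}, {"category_id": 3}]}
report = compare_category_ids(pseudo, other)
print(report)
```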
In this Drive, coco_proposals.json and coco_pseudo_4764.json are completely identical. Is that intentional or a mistake?