jackroos / VL-BERT

Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
MIT License

How is vgbua_res101_precomputed generated? #57

Closed: lmd1993 closed this issue 3 years ago

lmd1993 commented 3 years ago

I have a question about how vgbua_res101_precomputed is generated. Was it produced by taking the ground-truth regions and running Fast R-CNN to convert those regions into feature vectors? Thanks.

jackroos commented 3 years ago

@lmd1993 No. It is generated by running inference with the pre-trained Faster R-CNN (provided in this repo) on COCO images. Both the boxes and the features come from Faster R-CNN; ground-truth regions are not used.
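
To make the data flow concrete, here is a minimal sketch of that kind of precomputation. It is not the repo's actual extraction script: `fake_detector`, `pool_region_features`, and `precompute` are hypothetical names, the detector is a random stand-in for the pre-trained Faster R-CNN, and the "features" are mean pixel values rather than ResNet-101 RoI features. The only point it illustrates is that the boxes are predicted from the image and the features are pooled from those same predicted boxes, with no ground-truth annotations as input.

```python
import numpy as np


def fake_detector(image, max_boxes=4, seed=0):
    """Hypothetical stand-in for a pre-trained Faster R-CNN.

    Returns predicted boxes as (x1, y1, x2, y2). A real detector would
    predict these from image content; here they are random but valid.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    x1 = rng.integers(0, w // 2, size=max_boxes)
    y1 = rng.integers(0, h // 2, size=max_boxes)
    x2 = x1 + rng.integers(1, w // 2, size=max_boxes)
    y2 = y1 + rng.integers(1, h // 2, size=max_boxes)
    return np.stack([x1, y1, x2, y2], axis=1).astype(np.float32)


def pool_region_features(image, boxes):
    """Toy replacement for RoI pooling: mean pixel value inside each box."""
    channels = image.shape[2]
    feats = []
    for x1, y1, x2, y2 in boxes.astype(int):
        region = image[y1:y2, x1:x2].reshape(-1, channels)
        feats.append(region.mean(axis=0))
    return np.stack(feats)


def precompute(image):
    """Boxes and features both come from the detector side.

    Note that no ground-truth regions are passed in anywhere.
    """
    boxes = fake_detector(image)
    features = pool_region_features(image, boxes)
    return {"boxes": boxes, "features": features}


if __name__ == "__main__":
    # Random array standing in for one COCO image (height, width, channels).
    image = np.random.default_rng(1).random((32, 48, 3))
    record = precompute(image)
    print(record["boxes"].shape, record["features"].shape)
```

In the real pipeline, each image's record would hold the detector's predicted boxes together with the corresponding region features, which is what the answer above describes.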