uclanlp / visualbert

Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

Extracting Detectron Features #1

Open sanjayss34 opened 5 years ago

sanjayss34 commented 5 years ago

Hi, thanks for releasing your code soon after your paper and for making your evaluations easy to reproduce! Could you please provide more detail on how you extracted the Detectron features? I don't see a straightforward way to extract the features with the existing code in the Detectron repository. Thanks!

liunian-harold-li commented 5 years ago

Thank you for your interest! I just added the code I used for extracting features for NLVR2. Hope you could find that useful!

MinghuiAn commented 5 years ago

Hi, could you provide the code for extracting features from the text and images? I would really appreciate it. Thanks!

liunian-harold-li commented 5 years ago

Hi, thank you for your interest in our code! Just to be sure, do you mean extracting fixed hidden features from the transformer given an image and text? If so, sorry that I did not implement that particular function. I would suggest taking a look at pytorch_pretrained_bert/modeling.py at line 1326. I think "encoded_layers" should be the hidden features of the transformer. Please feel free to comment should you have any more questions!

alice-cool commented 3 years ago

> Thank you for your interest! I just added the code I used for extracting features for NLVR2. Hope you could find that useful!

I found in your Readme.md that VQA also uses the Detectron features. The download is about 160 GB using wget.

yezhengli-Mr9 commented 3 years ago

> > Thank you for your interest! I just added the code I used for extracting features for NLVR2. Hope you could find that useful!
>
> I found in your Readme.md that VQA also uses the Detectron features. The download is about 160 GB using wget.

Hi @alice-cool, @MinghuiAn, @sanjayss34, how fast is image feature extraction for you?

For comparison, on one GPU (CPU-only is presumably too slow to be practical), LXMERT takes 5-6 hours to extract Faster R-CNN features for the 107,292 NLVR2 images with this Caffe setup.

I am also following #10.
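
To make the figure quoted above easier to compare against other extractors, here is a minimal sketch that converts "107,292 images in 5-6 hours" into an average rate. The helper names are hypothetical and not part of any repository mentioned in this thread; the only inputs are the numbers from the comment.

```python
def images_per_second(num_images: int, hours: float) -> float:
    """Average throughput for a run that processes num_images in the given hours."""
    return num_images / (hours * 3600.0)


def hours_needed(num_images: int, rate: float) -> float:
    """Inverse helper: wall-clock hours to process num_images at rate images/s."""
    return num_images / rate / 3600.0


NLVR2_IMAGES = 107_292  # image count quoted in the comment above

slow = images_per_second(NLVR2_IMAGES, 6.0)  # upper end of the 5-6 hour range
fast = images_per_second(NLVR2_IMAGES, 5.0)  # lower end of the 5-6 hour range
print(f"{slow:.2f} to {fast:.2f} images/s")
```

So the reported run works out to roughly five to six images per second on a single GPU. A different extractor can be compared by timing a small batch and passing the measured rate to `hours_needed`.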

lifebl commented 3 years ago

@sanjayss34 @liunian-harold-li @yezhengli-Mr9 Hello, could you explain in detail how to extract the 1024-dimensional image features? I have some new pictures. When I run extract_image_features.py, it reports the following error:

    /visualbert/detectron/core/test.py", line 145, in im_detect_bbox
        hashes = np.round(inputs['rois'] * cfg.DEDUP_BOXES).dot(v)
    KeyError: 'rois'
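
A note on reading this traceback: the failing line indexes `inputs['rois']`, so the error means the input dictionary was built without a `rois` entry. As far as I recall from upstream Detectron (an assumption worth checking against your own checkout), that deduplication branch is meant for models that receive precomputed proposal boxes and is skipped when `cfg.MODEL.FASTER_RCNN` is enabled, so one thing to verify is whether the model's config file was actually merged before inference. The toy below only reproduces the failure mode in pure Python. `build_inputs` and `hash_rois` are hypothetical stand-ins, not Detectron functions, and the weights and scale are illustrative values.

```python
def build_inputs(image, boxes=None):
    """Hypothetical stand-in for the input-building step.

    The 'rois' key exists only when precomputed boxes are passed in.
    """
    inputs = {"data": image}
    if boxes is not None:
        inputs["rois"] = boxes
    return inputs


def hash_rois(inputs, dedup_scale):
    """Pure-Python imitation of the failing line: scale, round, weighted sum per row."""
    weights = [1, 1e3, 1e6, 1e9, 1e12]  # illustrative, stands in for `v`
    return [
        sum(round(value * dedup_scale) * w for value, w in zip(row, weights))
        for row in inputs["rois"]  # raises KeyError when no boxes were supplied
    ]


# With boxes supplied, the lookup succeeds.
with_boxes = build_inputs("image", boxes=[[0, 16.0, 32.0, 48.0, 64.0]])
hashes = hash_rois(with_boxes, dedup_scale=0.0625)

# Without boxes, the same call fails the same way as the traceback above.
try:
    hash_rois(build_inputs("image"), dedup_scale=0.0625)
    failed_with_key_error = False
except KeyError as err:
    failed_with_key_error = True
    print("KeyError:", err)
```

If the real run hits this path with no boxes, the likelier fix is in the configuration passed to the extraction script rather than in `test.py` itself, though that is a guess without seeing the full command and config.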

bigbrother001 commented 2 years ago

> @sanjayss34 @liunian-harold-li @yezhengli-Mr9 Hello, could you explain in detail how to extract the image features of 1024 dimensions? I have some new pictures now. When I run extract_image_features.py, it reports the following error: /visualbert/detectron/core/test.py", line 145, in im_detect_bbox hashes = np.round(inputs['rois'] * cfg.DEDUP_BOXES).dot(v) KeyError: 'rois'

Hello, could you tell me how to get image features for my own pictures? I don't know how to do it.

bigbrother001 commented 2 years ago

> @sanjayss34 @liunian-harold-li @yezhengli-Mr9 Hello, could you explain in detail how to extract the image features of 1024 dimensions? I have some new pictures now. When I run extract_image_features.py, it reports the following error: /visualbert/detectron/core/test.py", line 145, in im_detect_bbox hashes = np.round(inputs['rois'] * cfg.DEDUP_BOXES).dot(v) KeyError: 'rois'

Is there a file "/visualbert/detectron/core/test.py"? I can't find it in the project.