peteanderson80 / bottom-up-attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
http://panderson.me/up-down-attention/
MIT License

Illustrations for object and attribute predictions #35

Open · mrfarazi opened 6 years ago

mrfarazi commented 6 years ago

Hello @peteanderson80

I was wondering how to generate the figure showing the object and attribute predictions for salient image regions with bounding boxes and labels (like the figure in this repo).

ZhuFengdaaa commented 6 years ago

Asking for the visualization code. +1

peteanderson80 commented 6 years ago

There is some code in our caption model repo for this: https://github.com/peteanderson80/Up-Down-Captioner/blob/master/scripts/demo.ipynb

liiiiiiiiil commented 6 years ago

> There is some code in our caption model repo for this: https://github.com/peteanderson80/Up-Down-Captioner/blob/master/scripts/demo.ipynb

Are the 'objects' and 'attrs' fields included in your tsv file? I tried to read those tags from your tsv file, but got 'None'.

peteanderson80 commented 6 years ago

No, I don't think the object and attribute labels are in the tsv files. In hindsight, we should have included them. The demo above is end-to-end, i.e. it runs Faster R-CNN on the image directly rather than using precomputed features.
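To illustrate why the lookup returns `None`: a small sketch of the tsv layout, assuming the released files use the columns listed in this repo's README (`image_id`, `image_w`, `image_h`, `num_boxes`, `boxes`, `features`, with `boxes` and `features` base64-encoded float32 arrays) and no `objects`/`attrs` columns. The row below is synthetic, built just to show the decoding:

```python
import base64
import csv
import io
import sys

import numpy as np

csv.field_size_limit(sys.maxsize)

# Columns in the released tsv files (per the repo README); note that
# 'objects' and 'attrs' are NOT among them.
FIELDNAMES = ['image_id', 'image_w', 'image_h', 'num_boxes', 'boxes', 'features']

# Build one synthetic row the way the real files are encoded:
# 'boxes' and 'features' are base64-encoded float32 buffers.
boxes = np.array([[10, 20, 110, 120], [30, 40, 130, 140]], dtype=np.float32)
feats = np.zeros((2, 2048), dtype=np.float32)
row = ['1', '300', '200', '2',
       base64.b64encode(boxes.tobytes()).decode('ascii'),
       base64.b64encode(feats.tobytes()).decode('ascii')]
tsv = io.StringIO('\t'.join(row) + '\n')

reader = csv.DictReader(tsv, delimiter='\t', fieldnames=FIELDNAMES)
for item in reader:
    n = int(item['num_boxes'])
    decoded = np.frombuffer(base64.b64decode(item['boxes']),
                            dtype=np.float32).reshape(n, 4)
    print(decoded.shape)        # (2, 4)
    print(item.get('objects'))  # None -- label columns are absent
```

So the bounding boxes and features are recoverable from the tsv, but the class and attribute names are not; for those you need to rerun the detector as in the demo notebook.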

Rushing-Life commented 2 years ago

Hi, could you share a tsv file with the object and attribute tags? Thank you very much for your contribution!