microsoft / Oscar

Oscar and VinVL

How can I generate a caption for any (my own) image? #19

Closed JoonseoKang closed 4 years ago

JoonseoKang commented 4 years ago

In the coco_caption dataset, the train.yaml file indicates that train.img.tsv contains the images, but I couldn't find train.img.tsv.

  1. Where can I find the train (val or test) .img.tsv files?

feature: train.feature.tsv

391895 {"num_boxes": 37, "features": "W6aDPlMKLj6FySc9zdycPyewQj7zsqw/8FjLQE+ABUEspTk+AAAAAEg0Dz8FzHo

  2. Can you explain how you converted the original images into these features?

What I want to do is look at image and caption examples, like Fig. 5 in your paper.

  3. How can I generate a caption for any sample image?
xiyinmsu commented 4 years ago

Currently we only support inference with image features. You can use the bottom-up top-down attention approach to extract features and labels first, and then use our pipeline to generate the captions.
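To connect your own extractor to this pipeline, a minimal sketch of serializing extracted region features into the same TSV layout as train.feature.tsv is shown below. It assumes you already have a (num_boxes, feature_dim) float32 array per image from a bottom-up top-down detector; the file name, feature dimensions, and any accompanying label/caption TSVs are assumptions and should be checked against the repo's data preparation instructions.

```python
import base64
import json

import numpy as np


def encode_feature_row(image_id: str, features: np.ndarray) -> str:
    """Serialize one image's region features into a feature.tsv row."""
    features = features.astype(np.float32)
    payload = {
        "num_boxes": int(features.shape[0]),
        # Mirror the format quoted above: base64-encoded float32 buffer.
        "features": base64.b64encode(features.tobytes()).decode("utf-8"),
    }
    return f"{image_id}\t{json.dumps(payload)}\n"


# Example: write a single-image feature file for inference
# (placeholder shapes; use your detector's actual output).
# feats = np.random.rand(37, 2054).astype(np.float32)
# with open("my_image.feature.tsv", "w") as f:
#     f.write(encode_feature_row("my_image_0", feats))
```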

amil-rp-work commented 4 years ago

Hey @xiyinmsu, could the following repo work for feature extraction: https://github.com/airsplay/py-bottom-up-attention?