How to extract the bounding box or mask from the region described by OFA attention?

OFA-Sys / OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Apache License 2.0


How to extract the bounding box or mask from the region described by OFA attention? #392

Open lucaswamser opened 1 year ago

lucaswamser commented 1 year ago

Hello,

I am using OFA in a project and would like to know how to extract the image region that the model attends to when it generates a caption. For example, if the caption describes two people posing with a dog, how can I obtain the bounding box or mask for that specific area of the photograph?

I would greatly appreciate any help or guidance you can provide.

Thank you in advance.

logicwong commented 1 year ago

@lucaswamser We provide the inference procedure for visual grounding in a Colab notebook. Does that help?
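For readers following the visual grounding route, here is a minimal sketch of the post-processing step, not the repository's own code. It assumes the grounding task answers a text query with four quantized location tokens of the form `<bin_k>`, ordered x0, y0, x1, y1. The function name `bins_to_box` and the defaults `num_bins=1000`, `max_image_size=512`, and `patch_image_size=512` are assumptions for illustration; check them against the task config of the checkpoint you actually load.

```python
import re


def bins_to_box(token_str, orig_w, orig_h,
                num_bins=1000, max_image_size=512, patch_image_size=512):
    """Decode '<bin_a> <bin_b> <bin_c> <bin_d>' into a pixel box (x0, y0, x1, y1).

    Hypothetical helper: the parameter defaults are assumptions, not values
    read from the OFA repository.
    """
    bins = [int(b) for b in re.findall(r"<bin_(\d+)>", token_str)]
    if len(bins) != 4:
        raise ValueError(f"expected 4 location tokens, got {len(bins)}")

    # De-quantize: bin index -> coordinate in the resized (square) input image.
    coords = [b / (num_bins - 1) * max_image_size for b in bins]

    # Undo the resize that mapped the original image to the model input size.
    w_ratio = patch_image_size / orig_w
    h_ratio = patch_image_size / orig_h
    x0, y0, x1, y1 = (coords[0] / w_ratio, coords[1] / h_ratio,
                      coords[2] / w_ratio, coords[3] / h_ratio)

    # Clamp to the image bounds to absorb floating-point overshoot.
    def clamp(v, hi):
        return max(0.0, min(float(hi), v))

    return (clamp(x0, orig_w), clamp(y0, orig_h),
            clamp(x1, orig_w), clamp(y1, orig_h))


if __name__ == "__main__":
    # Made-up token string for a 640x480 image, purely for illustration.
    print(bins_to_box("<bin_100> <bin_200> <bin_600> <bin_800>", 640, 480))
```

A rectangular mask follows by filling the returned box. As far as I can tell the grounding output is a box only, so a pixel-accurate mask would need a separate segmentation step.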