Open Han-YeJi opened 2 years ago
I don't think the attention boxes were ever released. As far as I can tell they were only there to help the annotators focus their annotations, and I didn't see anything in the paper that feeds that information to the model (definitely correct me if I'm wrong). Regardless, if you're looking for a version of RICO Screen2Words with the images, captions, and all associated metadata, there's a Hugging Face dataset version here: https://huggingface.co/datasets/hheiden-roots/RICO-Screen2Words
It doesn't have attention boxes (because I don't think they were released), but it has basically everything else.
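If it helps, here is a rough sketch of how you could load that dataset and check for yourself which columns it has. It assumes the `datasets` library is installed and that the repo loads with its default configuration. The helper names and the keyword list are just mine for illustration, not something the dataset defines, and I'd expect the box check to come back empty.

```python
from typing import Dict, List

# Dataset ID taken from the Hugging Face link above.
DATASET_ID = "hheiden-roots/RICO-Screen2Words"


def load_screen2words():
    """Download the dataset from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the helpers below work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID)


def find_box_columns(column_names: List[str]) -> List[str]:
    """Return column names that look like they might hold attention boxes.

    The keywords are a guess (hypothetical); a match may also be an
    unrelated field such as UI element bounding boxes.
    """
    keywords = ("attn", "attention", "box")
    return [name for name in column_names if any(k in name.lower() for k in keywords)]


def report_box_columns() -> Dict[str, List[str]]:
    """Map each split name to the columns that match the box keywords."""
    ds = load_screen2words()
    return {split: find_box_columns(data.column_names) for split, data in ds.items()}
```

Calling `print(load_screen2words())` lists the splits and their columns, and `report_box_columns()` gives a quick per-split answer on whether anything box-like is present.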
I need labels to run Screen2Words experiments. The labels should contain captions and attention boxes.
Where is the file containing the attention boxes?