MikeWangWZHL / VidIL

PyTorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
MIT License

The linked files and paper descriptions contain different numbers of Object and Attribute Synsets #11

Closed: fake-warrior8 closed this issue 1 year ago

fake-warrior8 commented 1 year ago

Hi, I downloaded the linked files and found two attribute files, attribute_synsets.json and attribute_synset_values.json, which contain 18,720 and 5,959 items respectively, while the paper reports 16,693 items. Similarly, there are three object files, object_synsets.json, object_synset_values.json, and object_alias.txt, which contain 40,154, 5,774, and 3,435 items respectively, while the paper reports 7,414 items. Why is there a difference?
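For anyone who wants to reproduce counts like these, here is a minimal sketch. It assumes each JSON file holds a single top-level object or array and that the text file lists one entry per line; the helper names `count_items` and `count_distinct_values` are hypothetical and not part of the VidIL repo.

```python
import json
from pathlib import Path


def count_items(path):
    """Count top-level items in a .json file, or non-empty lines in any other file."""
    path = Path(path)
    if path.suffix == ".json":
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # len() covers both a JSON object (number of keys) and a JSON array.
        return len(data)
    with path.open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def count_distinct_values(path):
    """Count distinct values in a JSON object (assumes hashable values such as strings)."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    return len(set(data.values()))


if __name__ == "__main__":
    # File names taken from this thread; adjust the directory to wherever they were downloaded.
    for name in ["object_synsets.json", "object_alias.txt"]:
        if Path(name).exists():
            print(name, count_items(name))
```

If a file maps many names to the same synset, the number of keys and the number of distinct values will differ, which is one way two files describing the same vocabulary can report different sizes.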

MikeWangWZHL commented 1 year ago

Thanks for the question! Yes, those are the original files from Visual Genome; you can find the filtered files here: https://github.com/MikeWangWZHL/VidIL/tree/main/visual_token_ontology/vg


The cleaning process is detailed in the Appendix as follows:

[image: screenshot of the Appendix passage describing the cleaning process]