uclanlp / visualbert

Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
528 stars, 104 forks

Flickr30k entities support #6

Closed: fbrad closed this issue 3 years ago

fbrad commented 4 years ago

Hi! Thanks for releasing the code, it is very useful! I wanted to experiment with the attention on the Flickr30k Entities dataset, but I cannot load the entries in the Flickr30kFeatureDataset constructor (I think some .hdf5 and .pkl files are missing). Could you provide more details on how to instantiate this class?

Thanks!

liunian-harold-li commented 4 years ago

Hi, I am working on support for Flickr30K and possibly visualization, but that may take some more time. For reference, the dataset processing largely follows https://github.com/jnhwkim/ban-vqa, and most of the dataset files are generated using their processing scripts.
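
In the meantime, one way to narrow down which inputs are missing is to compare the files on disk against the paths the constructor tries to open. The sketch below is only a generic helper, not part of this repo or of ban-vqa: the function names are made up for illustration, and the list of expected file names is something you would have to read off the Flickr30kFeatureDataset code yourself, since the thread does not list them.

```python
from pathlib import Path
from typing import Iterable, List


def list_feature_files(data_root: str) -> List[str]:
    """Return all .hdf5 and .pkl files under data_root as sorted relative paths."""
    root = Path(data_root)
    found = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in {".hdf5", ".pkl"}
    ]
    return sorted(p.relative_to(root).as_posix() for p in found)


def find_missing(data_root: str, expected: Iterable[str]) -> List[str]:
    """Return the entries of `expected` (paths relative to data_root) that are not files."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).is_file()]


# Example usage (the directory and file names here are hypothetical placeholders,
# not the names the dataset class actually expects):
#
#   print(list_feature_files("data/flickr30k"))
#   print(find_missing("data/flickr30k", ["train.hdf5", "train_imgid2idx.pkl"]))
```

Anything reported by find_missing would then need to be generated, presumably with the ban-vqa processing scripts mentioned above.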