airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
MIT License

VQA dataset #75

Closed: deepaknlp closed this issue 4 years ago

deepaknlp commented 4 years ago

Hi @airsplay,

Thank you for releasing the source code and model. Could you explain how you obtained the following JSON files from the official VQA 2.0 questions and annotations?

[screenshot listing the JSON files]

Thanks,

airsplay commented 4 years ago

We use the pre-processing pipeline from https://github.com/hengyuan-hu/bottom-up-attention-vqa and convert its results into a unified JSON format.
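
For anyone looking for a concrete starting point, here is a minimal sketch of that conversion. It assumes the official `v2_OpenEnded_mscoco_train2014_questions.json` and `v2_mscoco_train2014_annotations.json` files, and the output field names (`question_id`, `img_id`, `sent`, `label`, ...) are inferred from the released data files rather than guaranteed to match the exact script used for the release:

```python
import json
from collections import Counter

# Soft-score rule from hengyuan-hu/bottom-up-attention-vqa
# (compute_softscore.py): an answer given by k of the 10 annotators
# gets min(1.0, k * 0.3), i.e. 0.3 / 0.6 / 0.9 / 1.0.
def get_score(count):
    return min(1.0, count * 0.3)

def convert(questions_path, annotations_path, out_path):
    """Merge official VQA v2.0 questions/annotations into one JSON list.

    Output keys mirror the entries in the released LXMERT data files;
    the img_id prefix assumes the train2014 split (an assumption of
    this sketch, not part of the repo's script).
    """
    questions = json.load(open(questions_path))["questions"]
    annotations = json.load(open(annotations_path))["annotations"]
    anno_by_qid = {a["question_id"]: a for a in annotations}

    entries = []
    for q in questions:
        a = anno_by_qid[q["question_id"]]
        # Count the 10 human answers and turn counts into soft scores.
        counts = Counter(ans["answer"] for ans in a["answers"])
        label = {ans: get_score(c) for ans, c in counts.items()}
        entries.append({
            "question_id": q["question_id"],
            "img_id": "COCO_train2014_%012d" % q["image_id"],
            "sent": q["question"],
            "label": label,
            "answer_type": a["answer_type"],
            "question_type": a["question_type"],
        })

    with open(out_path, "w") as f:
        json.dump(entries, f)

convert("v2_OpenEnded_mscoco_train2014_questions.json",
        "v2_mscoco_train2014_annotations.json",
        "train.json")
```

Note that the full bottom-up-attention-vqa pipeline additionally normalizes answer strings (punctuation, articles, number words) and restricts labels to a fixed answer vocabulary before scoring; the sketch above skips those steps.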

deepaknlp commented 4 years ago

Thank you @airsplay for the quick response, much appreciated.