Closed: he-y closed this issue 4 years ago
For VQA, the code will be released under MMF (see a beta version at https://github.com/facebookresearch/mmf/tree/vqa_winner). For captioning, I think we directly used the captioning repo from BUTD. Since we already provide the features, it should not be too hard to put up a pipeline for captioning.
In the paper, we used Pythia (which is now part of MMF) and MCAN (https://github.com/MILVLG/mcan-vqa) for VQA. For captioning, we used Pythia. It is straightforward to replace the region features with our grid version in such models.
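As a rough illustration of what that swap involves (the helper name and the 7x7x2048 grid layout are assumptions for the sketch, not the repo's actual feature format): region-based models consume a set of per-location feature vectors, e.g. a 36 x 2048 matrix from bottom-up attention, so grid features can be flattened into the same shape and fed in unchanged.

```python
import numpy as np

def grid_to_regions(grid_feats: np.ndarray) -> np.ndarray:
    """Flatten (H, W, D) grid features into an (H*W, D) matrix.

    Hypothetical helper: a model expecting region features (a set of
    D-dimensional vectors, one per region) can consume the H*W grid
    cells as if each cell were a region, with no architecture change.
    """
    h, w, d = grid_feats.shape
    return grid_feats.reshape(h * w, d)

# Example: a 7x7 grid of 2048-d features becomes 49 "regions" of 2048 dims.
regions = grid_to_regions(np.zeros((7, 7, 2048), dtype=np.float32))
```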
End-to-end trained VQA will be provided in MMF.
Thanks for the great work! Will you release the downstream VQA and captioning code? If so, when? Thank you.