facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Questions about VisualBert models #617

Closed ChenyuGAO-CS closed 1 year ago

ChenyuGAO-CS commented 4 years ago

❓ Questions and Help

Hi, I found links to several models in models.yaml. Which one corresponds to the task-specific pre-training (on VQA 2.0) model of VisualBERT, i.e., COCO pretraining followed by VQA pretraining? Also, is there an accuracy report under MMF for the three versions of VisualBERT on VQA 2.0 listed below?

  1. COCO pretrain
  2. COCO pretrain + VQA pretrain
  3. COCO pretrain + VQA pretrain + VQA finetune

apsdehal commented 4 years ago

Hi, we didn't do task-specific pretraining on VisualBERT for our paper https://arxiv.org/abs/2004.08744, since our experiments suggested the gains weren't worth the time it takes. But you can do it on your own in MMF; see the sketch below.
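As a rough sketch, the two stages could be run with the `mmf_run` CLI roughly like this. The config paths, the `masked_vqa2` dataset key, the `visual_bert.pretrained.coco` zoo key, and the saved checkpoint filename are assumptions here — verify them against `projects/visual_bert/configs/` and the model zoo entries in `models.yaml` in your MMF checkout:

```sh
# Sketch only: paths and zoo/dataset keys below are assumptions;
# check projects/visual_bert/configs/ and models.yaml in your checkout.

# 1. Task-specific pretraining: start from the COCO-pretrained
#    VisualBERT checkpoint in the zoo and continue masked
#    pretraining on VQA 2.0.
mmf_run config=projects/visual_bert/configs/masked_vqa2/defaults.yaml \
    model=visual_bert \
    dataset=masked_vqa2 \
    run_type=train \
    checkpoint.resume_zoo=visual_bert.pretrained.coco

# 2. Finetuning: resume from the checkpoint produced above
#    (default save dir is ./save; the exact filename may differ)
#    and train on the VQA 2.0 answering task.
mmf_run config=projects/visual_bert/configs/vqa2/defaults.yaml \
    model=visual_bert \
    dataset=vqa2 \
    run_type=train_val \
    checkpoint.resume_file=./save/visual_bert_final.pth
```

`checkpoint.resume_zoo` loads a checkpoint by its model-zoo key, while `checkpoint.resume_file` points at a local file, which is why the two stages use different options here.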

apsdehal commented 4 years ago

In the model zoo, we have