How to train M4C on InfographicVQA?

facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

https://mmf.sh/


How to train M4C on InfographicVQA? #1092

Pxtri2156 closed this issue 3 years ago

Pxtri2156 commented 3 years ago

❓ Questions and Help

Hi, everyone. I would like to train M4C on InfographicVQA. However, I don't know how to build a dataset class for InfographicVQA. I also wonder whether I need to modify any configuration for training. I have read the guideline here, but I still do not understand how to train M4C on new datasets. I hope someone can help me solve this problem. Thank you for your attention.

apsdehal commented 3 years ago

@ronghanghu Can you help out here? Maybe point to your older comments on how to do this?

ronghanghu commented 3 years ago

Hi, one can run M4C on other datasets by building these datasets into imdbs and extracting their object and OCR features. Basically, other datasets can be used by converting them into a format similar to the TextVQA dataset. One can follow the steps in https://github.com/facebookresearch/mmf/issues/663#issuecomment-883000371.
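As a rough illustration of the conversion idea, here is a minimal sketch of turning one question plus its OCR results into a TextVQA-style record. Every field name (`question_id`, `ocr_tokens`, `ocr_normalized_boxes`, and so on) and both helper functions are assumptions made for illustration, not the confirmed schema; the linked comment in #663 is the place to check the exact fields M4C expects.

```python
def normalize_box(box, width, height):
    """Scale an (x1, y1, x2, y2) pixel box into the [0, 1] range."""
    x1, y1, x2, y2 = box
    return [x1 / width, y1 / height, x2 / width, y2 / height]


def to_imdb_entry(sample, ocr_tokens, ocr_boxes, width, height):
    """Build one TextVQA-style record.

    Hypothetical helper: the keys below are assumed for illustration
    and may not match what the M4C dataset class actually reads.
    """
    return {
        "question_id": sample["question_id"],
        "image_id": sample["image_id"],
        "question": sample["question"],
        "answers": list(sample.get("answers", [])),
        "ocr_tokens": list(ocr_tokens),
        "ocr_normalized_boxes": [
            normalize_box(box, width, height) for box in ocr_boxes
        ],
        "image_width": width,
        "image_height": height,
    }


# Made-up sample, only to show the shape of the output record.
example = to_imdb_entry(
    {
        "question_id": 1,
        "image_id": "img_0",
        "question": "What is the title?",
        "answers": ["energy use"],
    },
    ocr_tokens=["Energy", "Use"],
    ocr_boxes=[(10, 20, 110, 60), (120, 20, 200, 60)],
    width=400,
    height=800,
)
```

A list of such records, one per question, would then be saved in whatever container the TextVQA imdb files use, alongside the separately extracted object and OCR features.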

Pxtri2156 commented 3 years ago

@apsdehal @ronghanghu I really appreciate your help. I trained M4C on InfographicVQA by following the steps in #663.

soonchangAI commented 3 years ago

@ronghanghu Hi, is there a script to generate a vocab file such as fixed_answer_vocab_textvqa_5k.txt for TextVQA?

ronghanghu commented 3 years ago

Hi @soon22, I don't have a specific script to generate this vocab file, but it basically amounts to tokenizing all the answers in the TextVQA dataset with a simple tokenizer and taking the 5000 most frequent tokens, using the following tokenizer: https://github.com/facebookresearch/mmf/blob/3947693aafcc9cc2a16d7c1c5e1479bf0f88ed4b/mmf/utils/text.py#L64-L79
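The procedure described above can be sketched as follows. This is not an official script: `simple_tokenize` is a simplified stand-in for the linked MMF tokenizer (an assumption, so its output may differ on punctuation), and `build_answer_vocab` and `write_vocab` are hypothetical helper names.

```python
from collections import Counter


def simple_tokenize(text):
    """Stand-in tokenizer (assumption): lowercase, drop ',' and '?',
    then split on whitespace. Swap in the MMF tokenizer for real use."""
    text = text.lower().replace(",", "").replace("?", "")
    return text.split()


def build_answer_vocab(answers, max_size=5000):
    """Count tokens over all answers and keep the most frequent ones."""
    counts = Counter()
    for answer in answers:
        counts.update(simple_tokenize(answer))
    return [token for token, _ in counts.most_common(max_size)]


def write_vocab(vocab, path):
    """Write one token per line (assumed vocab file layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for token in vocab:
            f.write(token + "\n")


# Toy example with made-up answers; a real run would pass every
# answer string from the training annotations.
answers = ["Coca Cola", "coca cola", "stop", "Stop sign"]
vocab = build_answer_vocab(answers, max_size=3)
```

If the model expects special tokens (for example padding or unknown-word entries) in the vocab file, they would need to be added separately; comparing against the existing TextVQA vocab file is the safest way to confirm the layout.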