Hey @BierOne - thanks for reaching out!
There’s one key difference between the LXMERT checkpoint in this repository and the one in Hao Tan’s original repo: the original LXMERT model from the paper is pretrained on both VQA v2 and GQA.
For the active learning work here, we didn’t think that was a fair comparison, so we asked Hao for a checkpoint that was not pretrained on these VQA datasets (it still uses the other pretraining data, e.g., image-captioning datasets). This is why the numbers are lower; hope this makes sense!
Thanks for the prompt reply!
I totally understand this difference. However, I also obtained similar results (a 62% validation score) when I used the model pretrained on the VQA datasets. This is really strange to me.
Have you ever tried the model pretrained on the VQA datasets? If so, could you please tell me its validation score? I am not sure whether this problem is due to a bug in my code or in the HuggingFace implementation. Thank you so much!
Ah! My apologies, I understand now. I don’t recall ever trying the checkpoint stored in HuggingFace pretrained on everything — your best bet is to open an issue in either Transformers directly, or Hao’s repo.
If you decide to open an issue on transformers, let me know, and I can see who’s able to take a look!
Thank you! Once I figure it out, I will let you know :)
Hey @siddk! I found the problem!
In fact, there are two important factors in the re-implementation of LXMERT:
1. **Loading the pretrained weights from Hao Tan's original checkpoint** rather than the one on the HuggingFace hub:

   ```python
   import os

   import torch
   from transformers import LxmertConfig, LxmertForQuestionAnswering

   # Build the config from the hub, but load the weights from the
   # original Epoch19_LXRT.pth checkpoint instead of the hub checkpoint.
   lxmert_config = LxmertConfig.from_pretrained(
       "unc-nlp/lxmert-base-uncased", cache_dir=config.lxmert_cache
   )
   lxrt = LxmertForQuestionAnswering.from_pretrained(
       None,
       config=lxmert_config,
       state_dict=torch.load(os.path.join("data/snap/pretrained", "Epoch19_LXRT.pth")),
   )
   ```
2. **The initialization of the answer heads.** For more information about this, you can refer to [here](), which was written by Hao Tan; a sketch of the idea follows below.
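For concreteness, here is a minimal sketch of that answer-head initialization pattern, following the structure of the VQA head in Hao Tan's original lxmert repo. The hidden size, the answer-vocabulary size, and the `init_bert_weights` helper below are illustrative assumptions, not code copied from either repository:

```python
import torch.nn as nn

hid_dim = 768        # assumed: LXMERT-base hidden size
num_answers = 3129   # assumed: standard VQA v2 answer vocabulary size

# Two-layer MLP answer head, mirroring the original lxmert VQA model.
logit_fc = nn.Sequential(
    nn.Linear(hid_dim, hid_dim * 2),
    nn.GELU(),
    nn.LayerNorm(hid_dim * 2, eps=1e-12),
    nn.Linear(hid_dim * 2, num_answers),
)

def init_bert_weights(module):
    """BERT-style init: N(0, 0.02) for Linear weights, zeros for biases."""
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=0.0, std=0.02)
        if module.bias is not None:
            module.bias.data.zero_()

# Re-initialize the head with BERT-style weights instead of PyTorch defaults.
logit_fc.apply(init_bert_weights)
```

The point is that the answer head should be initialized the same way the original repo does it; leaving it at PyTorch's default initialization can noticeably hurt fine-tuning accuracy.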
When I addressed the above two problems, the validation score improved drastically from **62% to 70%**.
Hope this helps!
Hi there!
Thank you so much for sharing this outstanding work.
Following your code, I re-implemented the LXMERT baseline over the past few days. However, my validation score for this model is only 62% (no VQA pretraining, using all VQA v2 training samples), which is relatively low compared with the standard repository. Besides, training also takes more epochs (15) than the default.
Would you please share your validation score and training logs here? I want to figure out why this happens. Thanks!