airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
MIT License

Bad performance of NLVR2. #84

Closed yangxuntu closed 3 years ago

yangxuntu commented 3 years ago

Hi, I ran into the same problem as https://github.com/airsplay/lxmert/issues/1 and also only get about 50 accuracy: Epoch 0: Train 50.31 Epoch 0: Valid 50.86 Epoch 0: Best 50.86

Epoch 1: Train 50.39 Epoch 1: Valid 49.14 Epoch 1: Best 50.86

Epoch 2: Train 50.44 Epoch 2: Valid 49.14 Epoch 2: Best 50.86

Epoch 3: Train 50.57 Epoch 3: Valid 50.86 Epoch 3: Best 50.86

I also tried torch == 1.0.1, but it still did not work. I also wanted to download the data from that link, but the link no longer seems to exist. Could you upload those features again? Thank you very much!

airsplay commented 3 years ago

Which feature do you mean by the "link"? The default feature link `wget --no-check-certificate https://nlp1.cs.unc.edu/data/lxmert_data/nlvr2_imgfeat/train_obj36.zip -P data/nlvr2_imgfeat` works on my side.

yangxuntu commented 3 years ago

The link for the raw features, which you provided in https://github.com/airsplay/lxmert/issues/1. Also, I am downloading the features from Google Drive; I hope these will work. But it is really weird that I get these strange results. Should I use exactly the same environment as yours? Do you have an anaconda environment file?

airsplay commented 3 years ago

In my experience, the most likely reason is that the pre-trained model is not loaded correctly.

Could you check whether you have downloaded the pre-trained model and placed it in the correct location?
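One quick way to check (a minimal sketch; the helper and the key names below are illustrative, not the repo's actual code) is to compare the checkpoint's keys against the model's, since any key that fails to match is silently left randomly initialized:

```python
def check_loaded_keys(model_keys, ckpt_keys):
    """Report parameters missing from the checkpoint (they would stay
    randomly initialized) and checkpoint keys the model does not use."""
    missing = sorted(set(model_keys) - set(ckpt_keys))
    unexpected = sorted(set(ckpt_keys) - set(model_keys))
    return missing, unexpected

# Hypothetical key sets; in practice these would come from
# model.state_dict().keys() and torch.load(path).keys().
missing, unexpected = check_loaded_keys(
    ["bert.layer.0.weight", "logit_fc.weight"],
    ["bert.layer.0.weight"],
)
print(missing)      # ['logit_fc.weight'] -- a fresh task head is expected
print(unexpected)   # []
```

If `missing` contains encoder weights (not just the task head), the pre-trained model is effectively not being used and chance-level accuracy follows.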

yangxuntu commented 3 years ago

Ok. I will check it. Thank you very much!

yangxuntu commented 3 years ago

It's my problem, I did not correctly read all the parameters from the pretrained model. The original code is correct.

haoopan commented 3 years ago

I also ran into this problem. Could you explain it in detail? Thank you!

haoopan commented 3 years ago

Hi, sorry to disturb you. I have a question when running your code on NLVR2: when I remove the pre-trained model and train NLVR2 from scratch, the result is: Epoch 0: Train 50.31 Epoch 0: Valid 50.86 Epoch 0: Best 50.86

Epoch 1: Train 50.39 Epoch 1: Valid 49.14 Epoch 1: Best 50.86

Epoch 2: Train 50.44 Epoch 2: Valid 49.14 Epoch 2: Best 50.86

Epoch 3: Train 50.57 Epoch 3: Valid 50.86 Epoch 3: Best 50.86

So, how can I train NLVR2 without the pre-trained model? Could you please reply soon? I'm in a hurry!

yangxuntu commented 3 years ago

In my case, I had revised parts of the original LXMERT code into my own version. I then found the model could only reach about 50 accuracy because the load function in nlvr2.py neglects the `module.` prefix in the checkpoint keys.
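For reference, a common fix for this kind of key mismatch (a sketch under the assumption that the checkpoint was saved from an `nn.DataParallel`-wrapped model, which prepends `module.` to every parameter name) is to strip that prefix before calling `load_state_dict`. Shown here on plain dicts so it runs without torch:

```python
def strip_module_prefix(state_dict):
    """Remove the 'module.' prefix that nn.DataParallel prepends to
    parameter names when a wrapped model's state_dict is saved."""
    prefix = "module."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }

# Hypothetical checkpoint keys as saved from a DataParallel model:
ckpt = {"module.encoder.weight": 1, "module.encoder.bias": 2}
print(strip_module_prefix(ckpt))  # {'encoder.weight': 1, 'encoder.bias': 2}
```

Without this step, none of the checkpoint keys match the model's parameter names, so every weight stays randomly initialized and accuracy sits at chance.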



yangxuntu commented 3 years ago

This is my code, but I do not know whether it is suitable for your case.



haoopan commented 3 years ago


In my case, I didn't change any code; I just didn't load the pre-trained model, because I wanted to retrain the entire model from scratch.

haoopan commented 3 years ago

When I load the pre-trained model, I get normal results. Is this because of some initialization problem? I'm in a hurry!

yangxuntu commented 3 years ago

Without loading the pre-trained model, it is normal to get about 50 accuracy: NLVR2 is a binary task, so 50 is chance level.



haoopan commented 3 years ago


Can't I train from scratch, without loading a pre-trained model?

yangxuntu commented 3 years ago

You need to load the pre-trained model, or you cannot get the ~74 accuracy.



haoopan commented 3 years ago


Thanks for your reply. But why does it stay at 50.86? That's weird.

haoopan commented 3 years ago

It behaves like an untrained model just guessing.

haoopan commented 3 years ago

And when I changed the model to my own, the result stayed at 50.86 the whole time.

haoopan commented 3 years ago

Strangely enough, when I trained on VQA without a pre-trained model, it could still train and improve.