Hi,

Thanks for pointing that out. We just realized we made a small mistake while cleaning up our code. You are not getting good results because BERT is not loading its pre-trained weights. We are extremely busy with a submission right now; allow us two more weeks and we will fix it and send an email. Apologies for the mistake. In the meantime, if you want to fix it quickly, follow this procedure:

bert.py: apply the analogous rename so that the MAG BERT submodule is assigned to self.bert rather than a custom attribute name; otherwise from_pretrained cannot match the checkpoint weights.

xlnet.py: at line 423, in class MAG_XLNetForSequenceClassification(XLNetPreTrainedModel), change self.multimodal_transformer = MAG_XLNetModel(config, multimodal_config) to self.transformer = MAG_XLNetModel(config, multimodal_config).

Also fix line 476 in the forward function accordingly.
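For reference, here is a minimal, self-contained sketch of why the rename matters. This is not the repo's actual file: XLNetModel stands in for MAG_XLNetModel, and the SketchXLNetClassifier class and its logits_proj head are illustrative. The underlying mechanism is a HuggingFace convention: from_pretrained() copies checkpoint weights into the submodule whose attribute name equals base_model_prefix ("transformer" for XLNet, "bert" for BERT), and any other name is left randomly initialized with only a warning printed.

```python
# Minimal sketch (hypothetical stand-in, not the repo's file): demonstrates
# that from_pretrained() only copies checkpoint weights into the submodule
# named after base_model_prefix ("transformer" for XLNet, "bert" for BERT).
import torch.nn as nn
from transformers import XLNetModel, XLNetPreTrainedModel


class SketchXLNetClassifier(XLNetPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        # Correct: XLNetPreTrainedModel.base_model_prefix == "transformer",
        # so this submodule receives the pretrained weights. Naming it
        # self.multimodal_transformer instead would leave it randomly
        # initialized (with only a warning printed), which is exactly the
        # bug described above. The real class assigns
        # MAG_XLNetModel(config, multimodal_config) here.
        self.transformer = XLNetModel(config)
        self.logits_proj = nn.Linear(config.d_model, 1)  # illustrative head
        self.init_weights()

    def forward(self, input_ids):
        # The forward pass (line 476 in xlnet.py) must use the renamed
        # attribute as well.
        hidden_states = self.transformer(input_ids=input_ids)[0]
        return self.logits_proj(hidden_states[:, -1])


# Loading prints a warning listing any weights NOT found in the checkpoint;
# after the rename, only the new head should appear in that list.
model = SketchXLNetClassifier.from_pretrained("xlnet-base-cased")
```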
I hope it will fix your problem. We will update the branch within two weeks.
Thanks,
Kamrul
On Sun, Sep 27, 2020, 10:33 AM youcaiSUN notifications@github.com wrote:
Hi Wasifur,
Thanks for sharing your code! I ran it with the default hyperparameter settings (e.g., n_epochs=40, train_batch_size=48, learning_rate=1e-5, beta_shift=1.0); however, I couldn't reproduce the results in the paper. I got a binary accuracy of 0.696 with BERT and 0.715 with XLNet. Given the large gap between my results and yours, I believe I am missing something important in the fine-tuning of the multimodal BERT. Could you help me figure this out? The running logs are below.
- bert (seed=3931) epoch:39, train_loss:1.14606777826945, valid_loss:2.588425040245056, test_acc:0.6961832061068702
- xlnet (seed=9733) epoch:39, train_loss:0.7299335289884497, valid_loss:2.482449531555176, test_acc:0.7145038167938931
Thanks very much!
It really works, thank you very much!
Closed this issue with the last pull request; if any problem persists, feel free to re-open it.