cooelf / SemBERT

Semantics-aware BERT for Language Understanding (AAAI 2020)
https://arxiv.org/abs/1909.02209
MIT License

Cannot reproduce the result on SNLI #28

Closed ZacharyChenpk closed 2 years ago

ZacharyChenpk commented 2 years ago

I have downloaded the pre-trained model from https://drive.google.com/open?id=1Yn-WCw1RaMxbDDNZRnoJCIGxMSAOu20_ and the SRL model from https://s3-us-west-2.amazonaws.com/allennlp/models/srl-model-2018.05.25.tar.gz, and tried to reproduce the experiment on the SNLI dataset with the following environment:

- python 3.6
- allennlp 0.8.1
- torch 1.8.0+cu111
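To confirm the environment actually in use, something like this works (a minimal sketch; `pkg_resources` ships with setuptools, so no extra install is assumed):

```python
import sys
import torch
import pkg_resources

# Print interpreter and library versions in the active environment
# to confirm they match the settings listed above.
print("python:", sys.version.split()[0])
print("torch:", torch.__version__, "| cuda:", torch.version.cuda)
print("allennlp:", pkg_resources.get_distribution("allennlp").version)
```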

When the evaluation finished, the test accuracy was only 0.8563 (= 8412 / 9824) and the dev accuracy 0.8557, both far below the results reported in the paper. Even though the different module versions might cause some performance degradation, is a drop of about 0.06 reasonable?

ZacharyChenpk commented 2 years ago

Or is there a trained model available for a newer pytorch/cuda version? My GPU requires a newer driver (so torch 1.0.0 is unusable) but has little memory, which makes training a new model from scratch hard. Would loading a model trained on a matching pytorch/cuda version solve this?
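For what it's worth, checkpoints saved by older torch releases usually load fine in newer ones. A minimal sketch (assuming the checkpoint file is named `pytorch_model.bin`; adjust to whatever the downloaded bundle actually contains):

```python
import torch

# Load an old-format checkpoint on CPU first. map_location="cpu"
# avoids failures when the checkpoint was saved on a device or
# CUDA version that differs from the local one.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
print(f"loaded {len(state_dict)} tensors")
```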

cooelf commented 2 years ago

Sorry for the late reply. This might be due to a version mismatch, as the repo has been continuously updated since the model release.

Could you replace the line at L1024 with the line at L1023 in pytorch_pretrained_bert/modeling.py and try again?

https://github.com/cooelf/SemBERT/blob/master/pytorch_pretrained_bert/modeling.py#L1024

https://github.com/cooelf/SemBERT/blob/master/pytorch_pretrained_bert/modeling.py#L1023
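If it helps, here is a small sketch to print those two lines from a local checkout before editing (it only inspects the file; the variants themselves are shown in the links above):

```python
# Print lines 1023-1024 of the local modeling.py so the two
# variants referenced above can be compared before editing.
path = "pytorch_pretrained_bert/modeling.py"
with open(path) as f:
    lines = f.readlines()
for n in (1023, 1024):
    print(f"L{n}: {lines[n - 1].rstrip()}")
```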