songyouwei / ABSA-PyTorch

Aspect Based Sentiment Analysis, PyTorch Implementations.
MIT License

The accuracy of bert_spc.py is low #27

Closed YHTtTtao closed 5 years ago

YHTtTtao commented 5 years ago

I ran the bert_spc.py model and got an accuracy of 65.8% and an F1 of 36.5%. Why is the accuracy so low? I used the Restaurant dataset. Is it because of the dataset?

songyouwei commented 5 years ago

Note that BERT is very sensitive to hyperparameters on small data sets.

My own experiments show that learning rates of 5e-5, 3e-5, and 2e-5 perform well. BERT-SPC on the Restaurant dataset: Acc = 0.8446, F1 = 0.7698

[screenshot of experiment results attached]
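For illustration, a minimal sketch of this kind of learning-rate sweep, assuming train.py exposes the printed options as argparse flags of roughly these names (the exact spellings, e.g. --lr vs. --learning_rate, vary across repo versions):

```python
# Hypothetical sweep over the learning rates mentioned above.
# Flag names are assumptions; check the argparser in train.py.
import subprocess

for lr in ["5e-5", "3e-5", "2e-5"]:
    subprocess.run(
        ["python", "train.py",
         "--model_name", "bert_spc",
         "--dataset", "restaurant",
         "--lr", lr,          # assumed flag name for the learning rate
         "--seed", "1337"],   # fix the seed so the runs are comparable
        check=True,
    )
```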

songyouwei commented 5 years ago

Please check the latest committed version.

YHTtTtao commented 5 years ago

Ok, thank you, I will try that again.

floAlpha commented 5 years ago

Awesome, the expert solved it in the end.

fhamborg commented 4 years ago

@songyouwei What are the parameters with which you achieve accuracy values over 0.80? I'm using ABSA-PyTorch with aen_bert and the default parameters (defined in the argparser in train.py), but the accuracy only reaches 0.59, where it converges. These are the parameters I use (a short sketch after the list shows how the printed optimizer and initializer entries map back to code):

> training arguments:
>>> model_name: aen_bert
>>> dataset: restaurant
>>> optimizer: <class 'torch.optim.adam.Adam'>
>>> initializer: <function xavier_uniform_ at 0x7f4f29cca7a0>
>>> learning_rate: 2e-05
>>> dropout: 0.1
>>> l2reg: 0.01
>>> num_epoch: 10
>>> batch_size: 16
>>> log_step: 5
>>> embed_dim: 300
>>> hidden_dim: 300
>>> bert_dim: 768
>>> pretrained_bert_name: bert-base-uncased
>>> max_seq_len: 80
>>> polarities_dim: 3
>>> hops: 3
>>> device: cpu
>>> seed: None
>>> valset_ratio: 0
>>> local_context_focus: cdm
>>> SRD: 3
>>> model_class: <class 'models.aen.AEN_BERT'>
>>> dataset_file: {'train': './datasets/semeval14/Restaurants_Train.xml.seg', 'test': './datasets/semeval14/Restaurants_Test_Gold.xml.seg'}
>>> inputs_cols: ['text_raw_bert_indices', 'aspect_bert_indices']
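For readers puzzled by the `<class ...>` and `<function ...>` entries above: train.py prints the already-resolved Python objects, while the CLI takes short names that are looked up in dictionaries. A minimal sketch of that mapping, with assumed dictionary names (check train.py for the real ones):

```python
import torch

# Assumed lookup tables mirroring what train.py does (names illustrative):
optimizers = {"adam": torch.optim.Adam, "sgd": torch.optim.SGD}
initializers = {"xavier_uniform_": torch.nn.init.xavier_uniform_}

optimizer_class = optimizers["adam"]           # -> <class 'torch.optim.adam.Adam'>
initializer = initializers["xavier_uniform_"]  # -> <function xavier_uniform_ at 0x...>
```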
songyouwei commented 4 years ago

@fhamborg Maybe it's because the batch size is small? What is the performance of this set of parameters on bert_spc? I'll check it later.

fhamborg commented 4 years ago

Hi, thanks for getting back! With the same parameters but on bert_spc, the performance I get on the Restaurant set is a bit higher than for aen_bert, but still not in the 80% range:

> val_acc: 0.6455, val_f1: 0.4145
>> test_acc: 0.6670, test_f1: 0.3756

FYI, I attached the full console output: https://gist.github.com/fhamborg/dade525af54a158982967383444fade4

yangheng95 commented 4 years ago

Hello, I just cloned the latest repository and checked the code after the latest PR. I set the parameters to be consistent with yours, and aen_bert's accuracy on Restaurant reaches 81+. Here is my training log: training log.txt

yangheng95 commented 4 years ago

Hi guys, I've also tested all of the BERT-based models modified by my latest PR; here are the logs, and they work really well. I hope this helps: bert_spc training log.txt, lcf_bert training log.txt

fhamborg commented 4 years ago

Hey @yangheng95, thanks for the logs! I still haven't figured out what exactly the difference is; the only more or less plausible explanation I have is the random initialization of a few components in pytorch and transformers. Would you be so kind as to post your bert_spc log with the same parameters as before, but also setting --seed 1337? This would allow me a better comparison. Thank you =)
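For context, fixing the seed in PyTorch typically means seeding every random number generator a training run touches and making cuDNN deterministic. A sketch of what train.py presumably does when --seed is given (assumed; check the repo):

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 1337) -> None:
    # Seed all RNGs a typical PyTorch training run uses.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for reproducibility on GPU.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```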

fhamborg commented 4 years ago

Also, could you post the log when running aen_glove? Thank you very much in advance!

yangheng95 commented 4 years ago

Hello @fhamborg, I trained the bert_spc model with 1337 as the seed and the result is still very good: >> test_acc: 0.8402, test_f1: 0.7692. I think cloning and referring to the latest code after the PR may solve your problem. Due to my busy schedule, I may not have time to adapt and train AEN-GloVe, but you can run it by adding the aen_glove model to train.py just as the other models are added. bert_spc.training.log.seed1337.txt
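A sketch of the kind of registration meant here, assuming train.py keeps model_classes and input_colses lookup dictionaries as it does for the other models (the import path and input columns below are illustrative, not confirmed):

```python
# In train.py (illustrative; mirror how aen_bert is registered):
from models.aen import AEN_GloVe  # hypothetical import path

model_classes = {
    # ... existing entries such as 'aen_bert': AEN_BERT ...
    'aen_glove': AEN_GloVe,
}
input_colses = {
    # ... existing entries ...
    'aen_glove': ['text_raw_indices', 'aspect_indices'],  # assumed input columns
}
```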

fhamborg commented 4 years ago

Hi @yangheng95, thanks for your reply and the verification with seed 1337. I'm using the latest repo, i.e., including the PR to migrate to transformers. However, I tried it on another machine and the results went up to roughly 70% (give or take) for all the approaches.

Also, I managed to train aen_glove, but in contrast to the results reported in the paper, I was only able to get roughly 50% on the validation and test sets. Do you have any idea where the difference for glove could come from?

songyouwei commented 4 years ago

@fhamborg Thank you for reporting this issue. I just looked into this. There might be something wrong with the recent release of the pretrained BERT from https://github.com/huggingface/transformers, named transformers.

I installed it with pip install transformers, replaced the pytorch_transformers imports with transformers, and reproduced this issue.

Try reinstalling and using the previous release, pytorch_transformers, with pip install pytorch-transformers.
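The swap back amounts to reinstalling the package and reverting the import rename; the model call itself is unchanged. A minimal check, assuming only the package name differs:

```python
# Shell: pip uninstall transformers && pip install pytorch-transformers
# from transformers import BertModel          # newer package (problematic here)
from pytorch_transformers import BertModel    # previous release, reported to work

bert = BertModel.from_pretrained('bert-base-uncased')
```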

fhamborg commented 4 years ago

Thanks, you're right, I was using transformers instead of pytorch_transformers. I will check it out now :-)

fhamborg commented 4 years ago

Awesome, with pytorch_transformers I get much higher performance than with transformers, e.g.:

> val_acc: 0.8536, val_f1: 0.7924

Thanks for the hint, @songyouwei! Do you have any idea what might be causing this significant difference depending on whether pytorch_transformers or transformers is used?

shuzhinian commented 6 months ago

Hello @songyouwei, how do I train aen_glove? Do I need to modify the training code?