Note that BERT is very sensitive to hyperparameters on small datasets.
My own experiments show that learning rates of 5e-5, 3e-5, and 2e-5 perform well. BERT-SPC on the Restaurant dataset: Acc: 0.8446, F1: 0.7698
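For anyone wanting to try those learning rates, a minimal sweep might look like the sketch below. It assumes train.py exposes argparse flags matching the parameter names shown in the dumps later in this thread (--model_name, --dataset, --learning_rate); treat it as a sketch, not verbatim repo usage.

```python
# Hypothetical sweep over the learning rates reported to work well;
# flag names are assumptions based on the parameter dump in this thread.
import subprocess

for lr in ("5e-5", "3e-5", "2e-5"):
    subprocess.run(
        ["python", "train.py",
         "--model_name", "bert_spc",
         "--dataset", "restaurant",
         "--learning_rate", lr],
        check=True,  # stop the sweep if a run fails
    )
```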
Please check the latest committed version.
Ok, thank you, I will try that again.
Awesome, the expert solved it in the end.
@songyouwei What are the parameters with which you achieve accuracy values of over 0.80? I'm using this repository with aen_bert and the default parameters (as defined in the argparser in train.py), but the accuracy only goes up to 0.59, where it converges. These are the parameters I use:
> training arguments:
>>> model_name: aen_bert
>>> dataset: restaurant
>>> optimizer: <class 'torch.optim.adam.Adam'>
>>> initializer: <function xavier_uniform_ at 0x7f4f29cca7a0>
>>> learning_rate: 2e-05
>>> dropout: 0.1
>>> l2reg: 0.01
>>> num_epoch: 10
>>> batch_size: 16
>>> log_step: 5
>>> embed_dim: 300
>>> hidden_dim: 300
>>> bert_dim: 768
>>> pretrained_bert_name: bert-base-uncased
>>> max_seq_len: 80
>>> polarities_dim: 3
>>> hops: 3
>>> device: cpu
>>> seed: None
>>> valset_ratio: 0
>>> local_context_focus: cdm
>>> SRD: 3
>>> model_class: <class 'models.aen.AEN_BERT'>
>>> dataset_file: {'train': './datasets/semeval14/Restaurants_Train.xml.seg', 'test': './datasets/semeval14/Restaurants_Test_Gold.xml.seg'}
>>> inputs_cols: ['text_raw_bert_indices', 'aspect_bert_indices']
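For reference, the defaults above presumably come from an argparse block in train.py roughly like the sketch below; the names and values are copied from the dump, but this is not verbatim repo code.

```python
# Sketch of the argparser that would produce the dump above (assumed, not verbatim).
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--model_name', default='aen_bert', type=str)
parser.add_argument('--dataset', default='restaurant', type=str)
parser.add_argument('--learning_rate', default=2e-5, type=float)
parser.add_argument('--dropout', default=0.1, type=float)
parser.add_argument('--l2reg', default=0.01, type=float)
parser.add_argument('--num_epoch', default=10, type=int)
parser.add_argument('--batch_size', default=16, type=int)
parser.add_argument('--max_seq_len', default=80, type=int)
parser.add_argument('--polarities_dim', default=3, type=int)
parser.add_argument('--pretrained_bert_name', default='bert-base-uncased', type=str)
opt = parser.parse_args()
```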
@fhamborg Maybe it's because the batch size is small? What is the performance of this set of parameters on bert_spc? I'll check it later.
Hi, thanks for getting back! With the same parameters, but on bert_spc, the performance I get on the restaurant set is a bit higher than for aen_bert, but still not in the 80%-ish range:
> val_acc: 0.6455, val_f1: 0.4145
>> test_acc: 0.6670, test_f1: 0.3756
FYI, I attached the full console output: https://gist.github.com/fhamborg/dade525af54a158982967383444fade4
Hello, I just cloned the latest repository and checked the code after the latest PR. I set the parameters consistent with yours, and aen_bert's accuracy on restaurant reaches 81+. Here is my training log: training log.txt
Hi guys, I've also tested all of the BERT-based models modified by my latest PR; here are the logs, and they're working really well. I hope it helps. bert_spc training log.txt lcf_bert training log.txt
Hey @yangheng95, thanks for the logs! I still haven't figured out what exactly the difference is; the only more or less plausible explanation I have is the random initialization of a few components in pytorch and transformers. Would you be so kind as to post your bert_spc log with the same parameters as before, but also setting --seed 1337? This would allow me a better comparison. Thank you =)
Also, could you post the log from running aen_glove? Thank you very much in advance!
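As a side note on the seeding question: a typical way a train.py-style script applies a --seed argument is sketched below. This is standard PyTorch practice, not verbatim code from this repo.

```python
# Minimal sketch of full seeding, assuming the usual PyTorch sources of
# randomness (weight init, dropout, data shuffling) are all that matter here.
import random

import numpy as np
import torch

def set_seed(seed: int) -> None:
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy RNG
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # PyTorch GPU RNGs (no-op without CUDA)
    torch.backends.cudnn.deterministic = True  # deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False     # disable autotuner nondeterminism

set_seed(1337)
```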
Hello @fhamborg, I trained the bert_spc model with 1337 as the seed and the result is still very good: >> test_acc: 0.8402, test_f1: 0.7692. I think cloning and referring to the latest code after the PR may solve your problem. Due to a busy schedule, I may not have time to adapt and train AEN-GloVe, but you can run it by adding the aen_glove model to train.py just as the other models are added.
bert_spc.training.log.seed1337.txt
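For anyone following along, registering the GloVe variant the way @yangheng95 describes would presumably look like the sketch below. The dict names (model_classes, input_colses), the class name AEN_GloVe, and the input columns are assumptions modeled on how aen_bert is wired up in train.py, not verbatim repo code.

```python
# Hypothetical edit to train.py; all names below are assumptions.
from models.aen import AEN_GloVe  # assumed class name for the GloVe variant

model_classes = {
    # ... existing entries, e.g. 'aen_bert': AEN_BERT ...
    'aen_glove': AEN_GloVe,
}
input_colses = {
    # non-BERT models consume plain token indices rather than the
    # ['text_raw_bert_indices', 'aspect_bert_indices'] used by aen_bert
    'aen_glove': ['text_raw_indices', 'aspect_indices'],
}
```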
Hi @yangheng95, thanks for your reply and for verifying with seed 1337. I'm using the latest repo, i.e., including the PR migrating to transformers. However, I tried it on another machine and the results only went up to roughly 70% (plus/minus) for all the approaches.
Also, I managed to train aen_glove, but in contrast to the results reported in the paper, I was only able to get roughly 50% on the validation and test sets. Do you have any idea where the difference for GloVe could come from?
@fhamborg Thank you for reporting this issue.
I just looked into this.
There might be something wrong with the recent release of the pretrained BERT library from https://github.com/huggingface/transformers, named transformers. I installed it with pip install transformers, replaced the pytorch_transformers imports with transformers, and reproduced this issue.
Try reinstalling and using the previous release, pytorch_transformers, with pip install pytorch-transformers.
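In code terms, the workaround amounts to pinning the older package and importing from it; a minimal sketch (the exact import sites in this repo may differ):

```python
# First: pip install pytorch-transformers
# Then swap the imports back wherever the model is built, e.g.:
from pytorch_transformers import BertModel  # instead of: from transformers import BertModel

bert = BertModel.from_pretrained('bert-base-uncased')
```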
Thanks, you're right, I was using transformers instead of pytorch_transformers. I will check it out now :-)
Awesome, on pytorch_transformers I get much higher performance than on transformers, e.g.:
> val_acc: 0.8536, val_f1: 0.7924
Thanks for the hint, @songyouwei! Do you have any idea what might be causing this significant difference between pytorch_transformers and transformers?
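When comparing runs across machines, a quick sanity check of which of the two packages (and which version) is actually installed in an environment can help; a small hypothetical helper:

```python
# Print the installed version of each candidate package, if present.
import importlib

for pkg in ('pytorch_transformers', 'transformers'):
    try:
        mod = importlib.import_module(pkg)
        print(pkg, getattr(mod, '__version__', 'unknown'))
    except ImportError:
        print(pkg, 'not installed')
```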
Hello @songyouwei, how do I train aen_glove? Do I need to modify the training code?
I ran the bert_spc model and got an accuracy of 65.8% and an F1 of 36.5% on the restaurant data. Why is the accuracy not higher? Is it because of the dataset?