syuoni / eznlp

Easy Natural Language Processing
Apache License 2.0

The hyper-parameters for reproducing results for Conll2003 and OntoNotes 5 #26

Closed · yhcc closed 2 years ago

yhcc commented 2 years ago

Hi, after reading your paper "Boundary Smoothing for Named Entity Recognition", I was glad to see that your method further improves performance. I would like to reproduce these results. Is there any detailed guide to rerunning the experiments with the eznlp framework, especially the specific hyper-parameters used?

syuoni commented 2 years ago

Hi,

To reproduce our results, you may first set up the environment following the README and this link, and then run:

$ python scripts/entity_recognition.py @scripts/options/with_bert.opt \
    --dataset conll2003 \
    --doc_level \
    --num_epochs 50 \
    --lr 2e-3 \
    --finetune_lr 2e-5 \
    --batch_size 48 \
    --ck_decoder boundary_selection \
    --sb_epsilon 0.3 \
    --sb_size 1 \
    --bert_drop_rate 0.2 \
    --use_interm2 \
    --bert_arch RoBERTa_base

You may change sb_epsilon to 0.2 for OntoNotes 5; the other hyper-parameters are the same.
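In case it helps, here is a rough Python sketch of what sb_epsilon and sb_size control. This is not eznlp's actual code, and the exact allocation scheme may differ from the paper's; it assumes ε probability mass is moved from the gold span and split equally among neighboring candidate spans within boundary distance sb_size:

def smooth_span_targets(gold_start, gold_end, seq_len, sb_epsilon=0.3, sb_size=1):
    """Soft targets {(start, end): prob} for one gold span (inclusive token indices)."""
    neighbors = []
    for start in range(max(0, gold_start - sb_size), min(seq_len - 1, gold_start + sb_size) + 1):
        for end in range(max(0, gold_end - sb_size), min(seq_len - 1, gold_end + sb_size) + 1):
            dist = abs(start - gold_start) + abs(end - gold_end)
            if 0 < dist <= sb_size and start <= end:
                neighbors.append((start, end))
    # Gold span keeps 1 - epsilon; epsilon is split equally among neighbors.
    targets = {(gold_start, gold_end): 1.0 - sb_epsilon}
    for span in neighbors:
        targets[span] = sb_epsilon / len(neighbors)
    return targets

# smooth_span_targets(2, 4, seq_len=10) keeps 0.7 on the gold span (2, 4)
# and spreads 0.075 each to (1, 4), (2, 3), (2, 5), (3, 4).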

You should easily get an F1 score of 93.5+ on CoNLL 2003, and 91.5+ on OntoNotes 5.

Further improvements may be achieved by using smaller batch sizes or tuning the learning rates.

yhcc commented 2 years ago

May I ask which GPU you used to run this configuration?

syuoni commented 2 years ago

Sure. We used a V100.

If your device does not have enough memory, you may use gradient accumulation to simulate a larger batch size, e.g.,

$ python scripts/entity_recognition.py @scripts/options/with_bert.opt \
    --dataset conll2003 \
    --batch_size 4 \
    --num_grad_acc_steps 12 \
    [other options]
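Under the hood, gradient accumulation follows the standard pattern: scale each small-batch loss by 1/num_grad_acc_steps, accumulate gradients over that many batches, then take a single optimizer step. A self-contained PyTorch sketch with a stand-in model and data (this is the general pattern, not eznlp's actual training loop):

import torch
from torch import nn, optim

model = nn.Linear(10, 1)                       # stand-in model
optimizer = optim.SGD(model.parameters(), lr=2e-3)
dataloader = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(48)]  # small batches of size 4

num_grad_acc_steps = 12                        # 4 x 12 = effective batch size 48
optimizer.zero_grad()
for step, (x, y) in enumerate(dataloader):
    loss = nn.functional.mse_loss(model(x), y) / num_grad_acc_steps  # scale so grads average over the group
    loss.backward()                            # gradients accumulate across backward() calls
    if (step + 1) % num_grad_acc_steps == 0:
        optimizer.step()                       # one update per 12 small batches
        optimizer.zero_grad()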
ldsheng1998 commented 11 months ago

Hello, I am currently experimenting on CoNLL 2003. When using the --doc_level option to merge sentence-level sequences into document-level sequences, what max_len did you set at the document level? I tried setting max_len to 510 and kept the other hyper-parameters as you gave above, but I only got an F1 score of 92.95% on the test data. I suspect this may be related to the max_len setting.
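For reference, my understanding of the document-level merging is roughly the following (a hypothetical Python sketch, not eznlp's implementation; a budget of 510 subtokens leaves room for the [CLS] and [SEP] special tokens under BERT's 512-token limit):

def merge_to_doc_level(sentences, subtoken_lens, max_len=510):
    """Greedily pack consecutive sentences of one document into chunks
    whose total subtoken count stays within max_len."""
    chunks, current, current_len = [], [], 0
    for sent, n in zip(sentences, subtoken_lens):
        if current and current_len + n > max_len:
            chunks.append(current)       # start a new chunk once the budget is exceeded
            current, current_len = [], 0
        current.append(sent)
        current_len += n
    if current:
        chunks.append(current)
    return chunks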