dsindex / ntagger

reference pytorch code for named entity tagging

Wrong results while training the Model #3

Closed geo47 closed 3 years ago

geo47 commented 3 years ago

Hello,

The BERT-BiLSTM-CRF model ran for about 28 epochs (including several patience epochs) and stopped at a best F1 of 0.760492, while your reported results show a score above 90.

INFO:__main__:EarlyStopping Status: _step / patience = 7 / 7, value = 0.760492
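(For context, the `_step / patience` counter in that log line is patience-based early stopping: training halts once the monitored F1 has failed to improve for `patience` consecutive evaluations. A minimal, hypothetical re-implementation of the idea, not the repo's actual code:)

```python
class EarlyStopping:
    """Stop training when a monitored metric stops improving."""

    def __init__(self, patience=7):
        self.patience = patience
        self._step = 0            # evaluations since the last improvement
        self.best = float("-inf")

    def step(self, value):
        """Record one evaluation; return True when training should stop."""
        if value > self.best:
            self.best = value
            self._step = 0
        else:
            self._step += 1
        return self._step >= self.patience
```

So a run that stops at `_step / patience = 7 / 7` simply means seven evaluations in a row did not beat the best F1 so far, regardless of how many epochs remain.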

I wonder if I am missing parameters...

--batch_size=16
--eval_batch_size=32
--save_path=pytorch-model-bert-en.pt
--bert_output_dir=bert-checkpoint-en
--epoch=30
--bert_use_pos
--bert_use_feature_based
--use_crf

I am using the default bert config for both preprocessing and training.

bert-config-json:

{
    "emb_class": "bert",
    "enc_class": "bilstm",
    "n_ctx": 180,
    "pad_token": "<pad>",
    "pad_token_id": 0,
    "unk_token": "<unk>",
    "unk_token_id": 1,
    "dsa_num_attentions": 4,
    "dsa_dim": 300,
    "dsa_r": 2,
    "pos_emb_dim": 100,
    "pad_pos": "<pad>",
    "pad_pos_id": 0,
    "dropout": 0.1,
    "lstm_hidden_dim": 200,
    "lstm_num_layers": 2,
    "lstm_dropout": 0.0,
    "mha_num_attentions": 8,
    "pad_label": "<pad>",
    "pad_label_id": 0,
    "default_label": "O"
}
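(Side note: since the config is plain JSON, it is easy to inspect programmatically before training. A small sketch using only the fields quoted above; the exact way the training script consumes them is an assumption:)

```python
import json

# A subset of configs/config-bert.json, values copied from this thread.
config = json.loads("""
{
  "emb_class": "bert",
  "enc_class": "bilstm",
  "n_ctx": 180,
  "lstm_hidden_dim": 200,
  "lstm_num_layers": 2,
  "dropout": 0.1
}
""")

# n_ctx caps the token sequence length: longer sentences are truncated,
# shorter ones padded, so it directly affects memory use and coverage.
print(config["emb_class"], config["enc_class"], config["n_ctx"])
```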
dsindex commented 3 years ago

@geo47

Did you use bert-base-cased or bert-large-cased? I ran my experiments with the settings below, in particular a larger learning rate and more epochs.

$ python preprocess.py --config=configs/config-bert.json --data_dir=data/conll2003 --bert_model_name_or_path=./embeddings/bert-large-cased

$ python train.py --config=configs/config-bert.json --data_dir=data/conll2003 --save_path=pytorch-model-bert.pt --bert_model_name_or_path=./embeddings/bert-large-cased --bert_output_dir=bert-checkpoint --batch_size=16 --lr=3e-4 --epoch=64 --use_crf --bert_use_feature_based
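(Note the combination in that command: `--bert_use_feature_based` together with `--lr=3e-4`. If the flag works the usual way for feature-based setups, freezing the BERT encoder so only the BiLSTM-CRF head is trained, then the much larger learning rate makes sense for the randomly initialized head. A stdlib-only sketch of the freezing idea, with hypothetical stand-in classes instead of real `torch.nn` modules:)

```python
class Param:
    """Minimal stand-in for a torch.nn.Parameter."""
    def __init__(self):
        self.requires_grad = True

class Encoder:
    """Hypothetical stand-in for the BERT encoder module."""
    def __init__(self, n=4):
        self._params = [Param() for _ in range(n)]

    def parameters(self):
        return iter(self._params)

def set_feature_based(encoder, feature_based):
    # In PyTorch, freezing means requires_grad = False: the optimizer
    # never updates the encoder, and only the task head keeps learning.
    for p in encoder.parameters():
        p.requires_grad = not feature_based

encoder = Encoder()
set_feature_based(encoder, True)   # feature-based: encoder frozen
```

With the encoder frozen, BERT acts as a fixed feature extractor, which is also why feature-based runs tend to need more epochs to reach their best F1.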
(Screenshot: evaluation results, 2021-02-22)
geo47 commented 3 years ago

@dsindex

Oh, sorry, I did not mention this:

--data_dir=data/conll2003 --bert_model_name_or_path=bert-base-cased

But even with bert-base-cased, your reported results show:

BERT-base(cased), BiLSTM-CRF | 90.17 |   | word | 43.4804 /


Thanks for your help

geo47 commented 3 years ago

@dsindex


I am sorry, I think I made a mistake. Please hold on; I am training it again and will let you know :-)

geo47 commented 3 years ago

Hello @dsindex

Alright, I got the correct results :-)

BERT-base(cased), BiLSTM-CRF || value = 0.913481 || word / pos || epoch=30 / bert_use_feature_based

Thanks :-)