jiesutd / NCRFpp

NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS tagging, segmentation). It includes character LSTM/CNN, word LSTM/CNN, and softmax/CRF components.
Apache License 2.0
1.88k stars, 447 forks

Cannot reproduce results on the ResumeNER dataset #113

Closed. shuizhonghaitong closed this issue 5 years ago

shuizhonghaitong commented 5 years ago

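Hello! I ran experiments on the resume dataset from your paper "Chinese NER Using Lattice LSTM". I only modified the configuration file and trained for 100 epochs, but at every evaluation during training the F1 score on both dev and test stays at 10%~20%, which is far from satisfactory. My configuration file is as follows: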

### use # to comment out the configure item

### I/O ###
train_dir=data/ResumeNER/train.char.bmes
dev_dir=data/ResumeNER/dev.char.bmes
test_dir=data/ResumeNER/test.char.bmes
model_dir=result/lstmcrf
word_emb_dir=data/gigaword_chn.all.a2b.uni.ite50.vec.txt

#raw_dir=
#decode_dir=
#dset_dir=
#load_model_dir=
#char_emb_dir=

norm_word_emb=True
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=50
char_emb_dim=30

### NetworkConfiguration ###
use_crf=True
use_char=False
word_seq_feature=LSTM
char_seq_feature=CNN
#feature=[POS] emb_size=20
#feature=[Cap] emb_size=20
#nbest=1

### TrainingSetting ###
status=train
optimizer=SGD
iteration=100
batch_size=10
ave_batch_loss=False

### Hyperparameters ###
cnn_layer=4
char_hidden_dim=50
hidden_dim=200
dropout=0.5
lstm_layer=1
bilstm=True
learning_rate=0.015
lr_decay=0.05
momentum=0
l2=1e-8
#gpu
#clip=

I would appreciate your help. Thank you very much!

jiesutd commented 5 years ago

Have you found the cause?

shuizhonghaitong commented 5 years ago

Yes, I found it: the problem was that I had not removed the first line (11327 50) of gigaword_chn.all.a2b.uni.ite50.vec.txt. Sorry for bothering you.
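For readers hitting the same problem: a first line such as "11327 50" is the header used by the word2vec text format (vocabulary size, then vector dimension). The following is a minimal sketch, not part of NCRF++, of one way to drop that header before training. The function names `has_word2vec_header` and `strip_header` are hypothetical, and the heuristic assumes a header is a line of exactly two non-negative integers.

```python
def has_word2vec_header(first_line):
    """Return True if the line looks like a 'vocab_size dim' header.

    Assumption: a header consists of exactly two integer tokens,
    whereas a real embedding line is a token followed by floats.
    """
    tokens = first_line.split()
    return len(tokens) == 2 and all(t.isdigit() for t in tokens)


def strip_header(src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path, dropping a leading header line if present.

    Returns True if a header line was removed, False otherwise.
    """
    with open(src_path, "r", encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        first = src.readline()
        removed = has_word2vec_header(first)
        if not removed:
            # The first line is ordinary content, so keep it.
            dst.write(first)
        # Copy the remaining lines unchanged.
        for line in src:
            dst.write(line)
    return removed
```

One possible usage is to call `strip_header("data/gigaword_chn.all.a2b.uni.ite50.vec.txt", "data/gigaword_chn.noheader.vec.txt")` (the output file name is made up for illustration) and then point `word_emb_dir` at the new file. Writing to a separate file keeps the original embedding file intact.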
