liuwei1206 closed this issue 6 years ago
Please show me your log file.
CuDNN: True
GPU available: True
Status: train
Seg: True
Train file: data/weibo/train.char.bmes
Dev file: data/weibo/dev.char.bmes
Test file: data/weibo/test.char.bmes
Raw file: None
Char emb: data/gigaword_chn.all.a2b.uni.ite50.vec
Bichar emb: None
Gaz file:
Model saved to: data/model/weibo/
build word pretrain emb...
Embedding: pretrain word:11327, prefect match:3281, case_match:0, oov:76, oov%:0.022632519356759976
Training model...
DATA SUMMARY START:
Tag scheme: BMES
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Use bigram: False
Word alphabet size: 3358
Biword alphabet size: 42650
Char alphabet size: 3357
Gaz alphabet size: 2
Label alphabet size: 29
Word embedding size: 50
Biword embedding size: 50
Char embedding size: 30
Gaz embedding size: 50
Norm word emb: True
Norm biword emb: True
Norm gaz emb: False
Norm gaz dropout: 0.5
Train instance number: 1349
Dev instance number: 270
Test instance number: 270
Raw instance number: 0
Hyperpara iteration: 100
Hyperpara batch size: 1
Hyperpara lr: 0.015
Hyperpara lr_decay: 0.05
Hyperpara HP_clip: 5.0
Hyperpara momentum: 0
Hyperpara hidden_dim: 100
Hyperpara dropout: 0.5
Hyperpara lstm_layer: 1
Hyperpara bilstm: True
Hyperpara GPU: True
Hyperpara use_gaz: True
Hyperpara fix gaz emb: False
Hyperpara use_char: False
DATA SUMMARY END.
In fact, I modified your code, but that should not affect the char-based model: I only used the char embedding you provided as input, without biword or any other information.
Please show me your complete log file. Also note that the Weibo dataset reports three scores: NE, NM, and Overall. The 0.5277 is the Overall result, so please confirm you are using the right Weibo data.
BTW, the Weibo dataset has two versions: the initial version and the revised version. Please use the revised version.
In fact, that is the entire log file; since I modified some of your code, the log is shorter than the original one. For the Weibo experiment I used the revised version, i.e. the one with the position numbers removed from the Chinese characters (see the sketch below). But I am not sure my data is the same as yours, which is why I emailed you. The result I reported is the Overall result, and I also converted the tag scheme from BIO to BMES, but I still cannot reach 0.5277. My final result is dev: 0.5439, test: 0.4752.
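As a minimal sketch of what "removing the position number" means, assuming the raw golden-horse Weibo files append a single segmentation-position digit to each character (e.g. a line such as `科0 B-ORG.NOM`); the helper names here are hypothetical, not part of the repository:

```python
def strip_position_digit(token):
    # Assumed raw format: each character carries a trailing digit marking its
    # position in the word segmentation, e.g. "科0" -> "科".
    return token[:-1] if token and token[-1].isdigit() else token

def clean_line(line):
    # Assumed line format: "<char><digit> <tag>"; keep only "<char> <tag>".
    if not line.strip():
        return line
    parts = line.split()
    return strip_position_digit(parts[0]) + " " + parts[-1]
```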
Please use my original code to run the baseline experiment. I cannot debug your revised code from this short log file.
The Weibo dataset can be downloaded directly from the author's GitHub. Due to copyright reasons, I cannot share the MSRA data with you.
Yes, I will run the original code for the baseline experiment. Thank you for your continued help; I will try some other ways to get the data.
Hi, I used the original code but still cannot reach 0.5277. The only change I made was replacing the Lattice LSTM with PyTorch's built-in BiLSTM in bilstm.py. The log file follows:
CuDNN: True
GPU available: True
Status: train
Seg: True
Train file: data/weibo/train.char.bmes
Dev file: data/weibo/dev.char.bmes
Test file: data/weibo/test.char.bmes
Raw file: None
Char emb: data/gigaword_chn.all.a2b.uni.ite50.vec
Bichar emb: None
Gaz file: data/ctb.50d.vec
Model saved to: data/model/saved_model.lstmcrf
Load gaz file: data/ctb.50d.vec total size: 704368
gaz alphabet size: 10798
gaz alphabet size: 12235
gaz alphabet size: 13671
build word pretrain emb...
Embedding: pretrain word:11327, prefect match:3281, case_match:0, oov:76, oov%:0.022632519356759976
build biword pretrain emb...
Embedding: pretrain word:0, prefect match:0, case_match:0, oov:42649, oov%:0.9999765533411489
build gaz pretrain emb...
Embedding: pretrain word:704368, prefect match:13669, case_match:0, oov:1, oov%:7.31475385853266e-05
Training model...
DATA SUMMARY START:
Tag scheme: BMES
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Use bigram: False
Word alphabet size: 3358
Biword alphabet size: 42650
Char alphabet size: 3357
Gaz alphabet size: 13671
Label alphabet size: 29
Word embedding size: 50
Biword embedding size: 50
Char embedding size: 30
Gaz embedding size: 50
Norm word emb: True
Norm biword emb: True
Norm gaz emb: False
Norm gaz dropout: 0.5
Train instance number: 1350
Dev instance number: 270
Test instance number: 270
Raw instance number: 0
Hyperpara iteration: 100
Hyperpara batch size: 8
Hyperpara lr: 0.015
Hyperpara lr_decay: 0.05
Hyperpara HP_clip: 5.0
Hyperpara momentum: 0
Hyperpara hidden_dim: 200
Hyperpara dropout: 0.5
Hyperpara lstm_layer: 1
Hyperpara bilstm: True
Hyperpara GPU: True
Hyperpara use_gaz: False
Hyperpara fix gaz emb: False
Hyperpara use_char: False
DATA SUMMARY END.
Data setting saved to file: data/model/saved_model.lstmcrf.dset
build batched lstmcrf...
build batched bilstm...
build batched crf...
finished built model.
But the result is better than with my modified code; it is now 0.4970.
The Weibo data is very small, so you can try different random seeds. Otherwise, you can run experiments on other, larger datasets.
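A minimal sketch of fixing the random seed across runs (standard Python/NumPy/PyTorch calls, not code taken from this repository); varying the seed over several runs and averaging the F1 helps on a dataset this small:

```python
import random
import numpy as np
import torch

def set_seed(seed):
    # Fix every relevant RNG so a single run is repeatable; call with
    # different values across runs and average the resulting F1 scores.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
```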
Do not change the code; just use the Lattice LSTM code and set the gaz file to None. That is the baseline.
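As a rough illustration only, the char-only baseline amounts to giving the model no gazetteer. The variable and attribute names below are assumptions inferred from the paths and the "Hyperpara use_gaz" line in the logs above, not verified against main.py:

```python
# Hypothetical configuration for the char-only baseline in main.py:
char_emb = "data/gigaword_chn.all.a2b.uni.ite50.vec"
bichar_emb = None
gaz_file = None          # no word lexicon -> plain char baseline

data.HP_use_gaz = False  # attribute name inferred from the log summary
```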
You say the data is too small. Does that mean my data is different from yours, or that the dataset really is so small that different random seeds are needed? And do I need to set bigram=False? I only want to run baseline experiments that rely on the char embedding alone.
I achieved 0.5277. The solution was setting batch_size=1; I think this may be useful for someone else. Thank you for your patience, really thank you!
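For anyone else hitting this, the fix is a single hyperparameter. The attribute name below is only inferred from the "Hyperpara batch size" line in the data summary, so treat it as an assumption about where the setting lives:

```python
# Hypothetical location of the setting; the log prints "Hyperpara batch size".
data.HP_batch_size = 1   # instead of 8; batch size matters a lot on Weibo
```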
Great! Congratulations, and thank you for sharing.
What exactly did you change besides setting batch_size=1? I ran into the same problem, but the batch_size in the first log you posted was already 1. Then you made some changes (I am not sure what), ending up with use_gaz=False and batch_size=1. In the end your solution was to set batch_size=1, which seems identical to your initial parameter setting. I only changed a tiny bit of code, just to handle the differences between PyTorch 0.3.1 and 0.3.0. Could you share the details of your solution?
Hello! I used a weiboNER dataset found online, but my results already differ from yours at the OOV statistics, so the copy I found may not be the right one. Also, the copy I found is annotated in BIO, while yours is BMES. Could you share yours with me? My email is 2431225222@qq.com. The differences are as follows:

Load gaz file: data/ctb.50d.vec total size: 704368
gaz alphabet size: 10798
gaz alphabet size: 12235
gaz alphabet size: 13671
build word pretrain emb...
Embedding: pretrain word:11327, prefect match:3281, case_match:0, oov:90, oov%:0.0266903914591
build biword pretrain emb...
Embedding: pretrain word:0, prefect match:0, case_match:0, oov:42673, oov%:0.999976566528
build gaz pretrain emb...
Embedding: pretrain word:704368, prefect match:13669, case_match:0, oov:1, oov%:7.31475385853e-05
Training model...

Moreover, my final F1 is only about 0.4. Can such a small difference in the dataset really cause such a large gap in the results?
@yqqqqqq For the Weibo dataset, there are an old version and an updated version. Please use the updated version. Details here: https://www.cs.jhu.edu/~npeng/papers/golden_horse_supplement.pdf
You can convert your BIO->BMES using this script: https://github.com/jiesutd/NCRFpp/blob/master/utils/tagSchemeConverter.py
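If you prefer not to pull the script, the conversion logic itself is small. A minimal sketch (assuming well-formed BIO input, one tag per token; this is not the linked script itself):

```python
def bio_to_bmes(tags):
    """Convert one sentence's BIO tags to BMES.

    A lone B-X becomes S-X; a B-X ... I-X run becomes B-X, M-X..., E-X.
    "O" tags are left unchanged.
    """
    bmes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bmes.append("O")
            continue
        prefix, label = tag.split("-", 1)
        next_inside = i + 1 < len(tags) and tags[i + 1] == "I-" + label
        if prefix == "B":
            bmes.append(("B-" if next_inside else "S-") + label)
        else:  # prefix == "I"
            bmes.append(("M-" if next_inside else "E-") + label)
    return bmes

# Example: ["B-PER", "I-PER", "O", "B-LOC"] -> ["B-PER", "E-PER", "O", "S-LOC"]
```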
Based on your code, I used only the char embedding to reproduce the char-based Weibo and MSRA experiments, and I find that neither result reaches the number cited in the paper: the Weibo test F1 is only 0.475 versus 0.5277 in the paper, and the MSRA test F1 is only 85.75 versus 88.81. So I would like to ask the author what might be causing this. I have been debugging for a long time without much improvement.