jiesutd / LatticeLSTM

Chinese NER using Lattice LSTM. Code for ACL 2018 paper.

Weibo results still fall short of the paper after multiple attempts #84

Closed gloria0108 closed 5 years ago

gloria0108 commented 5 years ago

Hello, author: I ran the lattice model on the Weibo dataset (BMEOS scheme) with your original code (parameters unchanged), and the F1 scores on the NE, NM, and ALL test sets all fell short of the results reported in your paper (51.77 vs 53.04, 60.00 vs 62.25, and 56.00 vs 58.79, respectively). I looked through earlier issues and saw your reply that results on Weibo are unstable because the dataset is small, so I tried 8 different random seeds on the Weibo ALL dataset; the best test result on weibo.all reached 57.03, still below the 58.79 reported in your paper. What else can I try to reach the reported results? Were the paper's results obtained with the parameters in the code, or do I still need to tune hyperparameters? Thank you very much!

jiesutd commented 5 years ago

Hi, in theory, with the parameters unchanged, trying a few more seeds should reproduce the results. Another user independently reproduced them; see https://github.com/jiesutd/LatticeLSTM/issues/35 Note that there are two versions of the Weibo data, an old one and an updated one; please make sure you are using the updated version.
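The multi-seed attempt discussed above can be sketched as follows. Everything here is illustrative, not the repo's actual code: the helper names, the seed values, and the placeholder evaluation are assumptions (the released main.py hard-codes a single seed near the top of the file), and a real sweep would run the full training script per seed and also seed torch.

```python
import random
import numpy as np

def set_seed(seed):
    # Fix the Python and NumPy RNGs. A real run would additionally call
    # torch.manual_seed(seed) and torch.cuda.manual_seed_all(seed).
    random.seed(seed)
    np.random.seed(seed)

def train_and_eval():
    # Placeholder standing in for a full Lattice LSTM training run;
    # returns a fake test F1 so the sketch stays self-contained.
    return 50.0 + 10.0 * np.random.rand()

# Hypothetical sweep over eight seeds, mirroring the attempt described above.
best_seed, best_f1 = None, -1.0
for seed in (11, 42, 123, 271, 314, 512, 777, 1024):
    set_seed(seed)
    f1 = train_and_eval()
    if f1 > best_f1:
        best_seed, best_f1 = seed, f1
print("best seed: %d, best F1: %.2f" % (best_seed, best_f1))
```

Because each trial is re-seeded, any single seed's result is exactly reproducible, which is what makes this kind of seed sweep a fair way to probe the variance on a small dataset like Weibo.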

jiesutd commented 5 years ago

I just noticed the link above is for reproducing the baseline. Please send me your run log and I will check whether anything is wrong.

gloria0108 commented 5 years ago

The log is as follows, thank you.

CuDNN: True
GPU available: True
Status: train
Seg: True
Train file: data-NER/WeiboNER/train.all.bmes
Dev file: data-NER/WeiboNER/dev.all.bmes
Test file: data-NER/WeiboNER/test.all.bmes
Raw file: None
Char emb: data/gigaword_chn.all.a2b.uni.ite50.vec
Bichar emb: None
Gaz file: data/ctb.50d.vec
Model saved to: weibomodel.all/saved_model
Load gaz file: data/ctb.50d.vec total size: 704368
gaz alphabet size: 10798
gaz alphabet size: 12235
gaz alphabet size: 13671
build word pretrain emb...
Embedding: pretrain word:11327, prefect match:3281, case_match:0, oov:75, oov%:0.0223413762288
build biword pretrain emb...
Embedding: pretrain word:0, prefect match:0, case_match:0, oov:42646, oov%:0.999976551692
build gaz pretrain emb...
Embedding: pretrain word:704368, prefect match:13669, case_match:0, oov:1, oov%:7.31475385853e-05
Training model...
DATA SUMMARY START:
Tag scheme: BMES
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Use bigram: False
Word alphabet size: 3357
Biword alphabet size: 42647
Char alphabet size: 3357
Gaz alphabet size: 13671
Label alphabet size: 29
Word embedding size: 50
Biword embedding size: 50
Char embedding size: 30
Gaz embedding size: 50
Norm word emb: True
Norm biword emb: True
Norm gaz emb: False
Norm gaz dropout: 0.5
Train instance number: 1350
Dev instance number: 270
Test instance number: 270
Raw instance number: 0
Hyperpara iteration: 100
Hyperpara batch size: 1
Hyperpara lr: 0.015
Hyperpara lr_decay: 0.05
Hyperpara HP_clip: 5.0
Hyperpara momentum: 0
Hyperpara hidden_dim: 200
Hyperpara dropout: 0.5
Hyperpara lstm_layer: 1
Hyperpara bilstm: True
Hyperpara GPU: True
Hyperpara use_gaz: True
Hyperpara fix gaz emb: False
Hyperpara use_char: False
DATA SUMMARY END.
Data setting saved to file: weibomodel.all/saved_model.dset
build batched lstmcrf...
build batched bilstm...
build LatticeLSTM... forward , Fix emb: False gaz drop: 0.5
load pretrain word emb... (13671, 50)
build LatticeLSTM... backward , Fix emb: False gaz drop: 0.5
load pretrain word emb... (13671, 50)
build batched crf...
finished built model.

jiesutd commented 5 years ago

Please provide the full log file, including the output produced during training.

gloria0108 commented 5 years ago

The full log file is attached below. Thank you. log.weibo.all.txt

jiesutd commented 5 years ago

The log looks fine, which is a bit strange. Let me dig up the original log from my backup drive when I get back and compare.

gloria0108 commented 5 years ago

OK, thank you very much!

jiesutd commented 5 years ago

Hi, I checked my original log and found that on the Weibo data my lattice dropout was 0.1 rather than 0.5; the Weibo data is so small that a large dropout hurts performance. Try setting the following parameter to 0.1: https://github.com/jiesutd/LatticeLSTM/blob/24d17f4270f11d2f75046789d8b67eaa2b907dce/main.py#L426
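Concretely, the fix amounts to lowering the gaz dropout from 0.5 to 0.1 at the linked line of main.py. A minimal sketch, assuming the setting lives on an attribute like `HP_gaz_dropout` (that name is inferred from the "Norm gaz dropout: 0.5" line in the log above, not confirmed against the repo; check the linked line for the exact name):

```python
class Data:
    # Minimal stand-in for the repo's Data configuration object;
    # the attribute name HP_gaz_dropout is an assumption.
    def __init__(self):
        self.HP_gaz_dropout = 0.5  # default shipped in the released code

data = Data()
data.HP_gaz_dropout = 0.1  # the author's original Weibo setting: small data, lighter dropout
```

The rationale is a standard one: dropout is a regularizer, and with only 1,350 training sentences an aggressive 0.5 rate discards too much signal, so a lighter 0.1 rate fits the small Weibo corpus better.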

gloria0108 commented 5 years ago

OK, I will give it a try, thanks!

jiesutd commented 5 years ago

You're welcome; feel free to share your results.

gloria0108 commented 5 years ago

After changing the dropout and rerunning, the Weibo NE/NM/overall results are 54.60, 62.37, and 57.87 (the paper reports 53.04, 62.25, and 58.79). NE and NM both came out higher than the reported results, while overall is a bit lower, which is probably down to the random seed. Thanks again for your patient help!

jiesutd commented 5 years ago

Thanks for the feedback.

Study-ym commented 4 years ago

> (quoting gloria0108's original post above)

A quick question: how are NE, NM, and ALL distinguished, and which data does each correspond to?