hankcs / HanLP


Demo script cannot reproduce SOTA results #1773

Closed: shixinglingguihua closed this issue 2 years ago

shixinglingguihua commented 2 years ago

Describe the bug: running the SOTA training script does not reproduce the published results.

Code to reproduce the issue: see the script under "Describe the current behavior" below.

Describe the current behavior
1. Download the SOTA script HanLP/plugins/hanlp_demo/hanlp_demo/zh/train_sota_bert_pku.py
2. Open a Python shell and run the code from the script:

from hanlp.common.dataset import SortingSamplerBuilder
from hanlp.components.tokenizers.transformer import TransformerTaggingTokenizer
from hanlp.datasets.tokenization.sighan2005.pku import SIGHAN2005_PKU_TRAIN_ALL, SIGHAN2005_PKU_TEST
from tests import cdroot

cdroot()
tokenizer = TransformerTaggingTokenizer()
save_dir = 'data/model/cws/sighan2005_pku_bert_base_96.66'
tokenizer.fit(
    SIGHAN2005_PKU_TRAIN_ALL,
    SIGHAN2005_PKU_TEST,  # Conventionally, no dev set is used. See Tian et al. (2020).
    save_dir,
    'bert-base-chinese',
    max_seq_len=300,
    char_level=True,
    hard_constraint=True,
    sampler_builder=SortingSamplerBuilder(batch_size=32),
    epochs=10,
    adam_epsilon=1e-6,
    warmup_steps=0.1,
    weight_decay=0.01,
    word_dropout=0.1,
    seed=1609422632,
)
tokenizer.evaluate(SIGHAN2005_PKU_TEST, save_dir)

Expected behavior: the reported SOTA score (F1 96.66% on the SIGHAN2005 PKU test set) should be reproduced.


Other info / logs

Epoch 1 / 10:
623/623 loss: 1416.3794 P: 59.23% R: 65.39% F1: 62.16% ET: 2 m 6 s
63/63 loss: 451.0509 P: 89.72% R: 89.54% F1: 89.63% ET: 4 s
2 m 9 s / 21 m 33 s ETA: 19 m 24 s (saved)
Epoch 2 / 10:
623/623 loss: 286.6173 P: 31.52% R: 63.34% F1: 42.09% ET: 2 m 8 s
63/63 loss: 486.3771 P: 0.37% R: 8.27% F1: 0.71% ET: 4 s
4 m 22 s / 21 m 50 s ETA: 17 m 28 s (1)
Epoch 3 / 10:
623/623 loss: 211.5965 P: 21.55% R: 61.44% F1: 31.91% ET: 2 m 8 s
63/63 loss: 470.1510 P: 0.39% R: 8.66% F1: 0.75% ET: 4 s
6 m 34 s / 21 m 53 s ETA: 15 m 19 s (2)
Epoch 4 / 10:
623/623 loss: 173.5003 P: 16.42% R: 59.68% F1: 25.75% ET: 2 m 9 s
63/63 loss: 469.0070 P: 0.37% R: 8.32% F1: 0.71% ET: 4 s
8 m 47 s / 21 m 57 s ETA: 13 m 10 s (3)
Epoch 5 / 10:
623/623 loss: 149.7656 P: 13.30% R: 58.04% F1: 21.63% ET: 2 m 9 s
63/63 loss: 482.1117 P: 0.38% R: 8.61% F1: 0.73% ET: 4 s
10 m 59 s / 21 m 59 s ETA: 10 m 59 s (4)
Epoch 6 / 10:
623/623 loss: 131.2736 P: 11.19% R: 56.51% F1: 18.69% ET: 2 m 9 s
63/63 loss: 530.2560 P: 0.36% R: 8.13% F1: 0.69% ET: 4 s
13 m 12 s / 22 m 0 s ETA: 8 m 48 s (5)
early stop
Max score of dev is P: 89.72% R: 89.54% F1: 89.63% at epoch 1
Average time of each epoch is 2 m 12 s
13 m 12 s elapsed
P: 89.72% R: 89.54% F1: 89.63%

tokenizer.evaluate(SIGHAN2005_PKU_TEST, save_dir)
Pruned 0 (0.0%) samples out of 2004.
63/63 loss: 451.0509 P: 89.72% R: 89.54% F1: 89.63% ET: 4 s
speed: 531 samples/second
(P: 89.72% R: 89.54% F1: 89.63%, (451.0508732871404, P: 89.72% R: 89.54% F1: 89.63%))

hankcs commented 2 years ago

First response: this script targets the earlier release 2.1.0-alpha.0. The trained SOTA model and its training log are publicly available for download: https://od.hankcs.com/hanlp/data/tok/sighan2005_pku_bert_base_zh_20201231_141130.zip

After installing pip install hanlp==2.1.0-alpha.0, I can currently only reach P: 96.91% R: 96.09% F1: 96.50%; this is presumably related to the versions of third-party libraries such as transformers. Investigating.
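
For anyone who only needs the released model rather than a retrained one, here is a minimal sketch of loading that zip directly; it assumes hanlp.load in 2.1.x accepts a URL to a model archive the same way it accepts a pretrained identifier (illustrative, not part of the original thread):

import hanlp

# Download, cache, and load the released PKU tokenizer from the zip above,
# then segment a sample sentence. URL loading is assumed to behave like an
# identifier lookup here.
tok = hanlp.load('https://od.hankcs.com/hanlp/data/tok/sighan2005_pku_bert_base_zh_20201231_141130.zip')
print(tok('商品和服务'))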

hankcs commented 2 years ago

Successfully reproduced. It is indeed a matter of third-party library versions; the following versions need to be installed:

!pip install transformers==3.5.1
!pip install torch==1.6.0
!pip install hanlp==2.1.0-alpha.0
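
Before retraining, it may help to fail fast when the environment has drifted from these pins. A minimal sanity check, assuming hanlp exposes its version as hanlp.version.__version__ (the torch and transformers attributes are standard):

import torch
import transformers
import hanlp.version

# Abort early if the installed versions differ from the pinned ones above.
assert torch.__version__.startswith('1.6'), torch.__version__
assert transformers.__version__ == '3.5.1', transformers.__version__
print('hanlp', hanlp.version.__version__)  # expected: 2.1.0-alpha.0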

Running the script then reproduces the result. The log:

Model built with 102270724/102270724 trainable/total parameters.
Using GPUs: [1]
19922/2004 samples in trn/dev set.
Epoch 1 / 3:
623/623 loss: 1016.5334 P: 84.75% R: 83.30% F1: 84.02% ET: 1 m 58 s
  63/63 loss: 348.9937 P: 96.49% R: 95.60% F1: 96.04% ET: 4 s
2 m 2 s / 6 m 6 s ETA: 4 m 4 s (saved)
Epoch 2 / 3:
623/623 loss: 244.4627 P: 90.65% R: 89.83% F1: 90.24% ET: 2 m 0 s
  63/63 loss: 355.4353 P: 96.61% R: 96.33% F1: 96.47% ET: 4 s
4 m 6 s / 6 m 10 s ETA: 2 m 3 s (saved)
Epoch 3 / 3:
623/623 loss: 187.5929 P: 92.85% R: 92.28% F1: 92.57% ET: 1 m 57 s
  63/63 loss: 341.4750 P: 96.93% R: 96.39% F1: 96.66% ET: 4 s
6 m 8 s / 6 m 8 s ETA: 0 s (saved)
Max score of dev is P: 96.93% R: 96.39% F1: 96.66% at epoch 3
Average time of each epoch is 2 m 3 s
6 m 8 s elapsed
63/63 loss: 341.4750 P: 96.93% R: 96.39% F1: 96.66% ET: 4 s
speed: 568 samples/second
Model saved in data/model/cws/sighan2005_pku_bert_base_96.66

You can also reproduce this experiment on Colab: https://colab.research.google.com/drive/12w6qmHg0xyrvnRHOE7oTehRRD_5ZCBlI?usp=sharing