Training accuracy problem - Githubissues

macanv / BERT-BiLSTM-CRF-NER

TensorFlow solution for the NER task using a BiLSTM-CRF model with Google BERT fine-tuning, plus private server services

https://github.com/macanv/BERT-BiLSMT-CRF-NER

Training accuracy problem #96

Closed: ArtistScript closed this issue 5 years ago

ArtistScript commented 5 years ago

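Hi, could you help me figure out why the accuracy after training is so low? I tried batch sizes from 16 to 32, and the accuracy was low every time. The training data is taken unmodified from the data directory of https://github.com/zjy-ucas/ChineseNER.

Training parameters:

bert-base-ner-train \
    -data_dir data \
    -output_dir output \
    -init_checkpoint chinese_L-12_H-768_A-12/bert_model.ckpt.index \
    -bert_config_file chinese_L-12_H-768_A-12/bert_config.json \
    -vocab_file chinese_L-12_H-768_A-12/vocab.txt \
    -batch_size 32 \
    -num_train_epochs=8.0 \
    -dropout_rate=0.5 \
    -max_seq_length=128

Training results: one odd thing is that the two "phrases" counts in the output are different, whereas in most other people's results that I have seen the two counts are close.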

ArtistScript commented 5 years ago

Could anyone please help with this? TT

ArtistScript commented 5 years ago

Is there nobody who can answer this?

macanv commented 5 years ago

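train_helper.py defines specific, correct default values that show how each parameter should be used; you only need to change them to point to your own directories. In the training arguments you passed, however, init_checkpoint is wrong.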
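
The reply does not say what exactly is wrong with init_checkpoint. One plausible reading, which is an assumption and not something the maintainer confirms, is that the argument names the bert_model.ckpt.index file, whereas TensorFlow checkpoints are normally referenced by their prefix (here bert_model.ckpt). A minimal sketch of turning such a file path back into a prefix, using a hypothetical helper named checkpoint_prefix that is not part of the repository:

```python
import re

# Suffixes TensorFlow appends to a checkpoint prefix when it writes files.
_CKPT_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path: str) -> str:
    """Return the checkpoint prefix for a path that may name a single checkpoint file.

    Hypothetical helper for illustration; it only rewrites the string and
    does not check that the checkpoint exists on disk.
    """
    return _CKPT_SUFFIX.sub("", path)


if __name__ == "__main__":
    given = "chinese_L-12_H-768_A-12/bert_model.ckpt.index"
    print(checkpoint_prefix(given))  # chinese_L-12_H-768_A-12/bert_model.ckpt
```

Under that assumption, the value to pass to -init_checkpoint would be chinese_L-12_H-768_A-12/bert_model.ckpt, with the other arguments unchanged.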

macanv commented 5 years ago

Also, as the part you underlined in red in the results shows, there is a problem with your data. Please check it.
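
The thread does not say what the data problem is. As a starting point for checking, here is a rough sketch that scans CoNLL-style lines for malformed rows and inconsistent tags. It assumes one "token tag" pair per line, blank lines between sentences, and a BIO tag scheme; the function name check_bio_lines and the sample rows are made up for illustration.

```python
from collections import Counter


def check_bio_lines(lines):
    """Scan CoNLL-style lines and return (tag_counts, problems).

    Assumes "token tag" per line, a blank line between sentences, and BIO tags.
    problems is a list of (line_number, message) tuples.
    """
    tag_counts = Counter()
    problems = []
    prev_tag = "O"
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            prev_tag = "O"  # a blank line ends the sentence
            continue
        parts = line.split()
        if len(parts) != 2:
            problems.append((lineno, "expected exactly 'token tag'"))
            prev_tag = "O"
            continue
        tag = parts[1]
        tag_counts[tag] += 1
        if tag.startswith("I-"):
            entity = tag[2:]
            # An I- tag should continue a B- or I- tag of the same entity type.
            if prev_tag not in ("B-" + entity, "I-" + entity):
                problems.append((lineno, "I- tag does not continue an entity"))
        elif tag != "O" and not tag.startswith("B-"):
            problems.append((lineno, "tag is outside the assumed BIO scheme"))
        prev_tag = tag
    return tag_counts, problems


if __name__ == "__main__":
    sample = ["海 B-LOC", "南 I-LOC", "的 O", "", "钓 I-LOC", "bad"]
    counts, problems = check_bio_lines(sample)
    print(dict(counts))
    for lineno, message in problems:
        print(lineno, message)
```

To run it on a real file, pass an open file object, for example open("data/train.txt", encoding="utf-8"); that path is a guess at the layout of the data directory. A large number of flagged lines, or a tag distribution that looks nothing like the original dataset, would suggest the files were altered or are being parsed differently than expected.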

lichunnan commented 5 years ago

I am using the same data as you, and my accuracy is also very low. Did you manage to solve the problem?

ArtistScript commented 5 years ago

No, not yet.