LaVineChan opened this issue 4 years ago
@LaVineChan Post the data and parameter settings you ran.
I ran the default cluener dataset with unchanged parameters, starting it directly with bash scripts/run_ner_span.sh. Could this be caused by early stopping?
Has this problem been solved? I ran into it too; the classification accuracy is very low.
Not solved, and the maintainers never replied to me either.
@LaVineChan The model files are clearly wrong: this task is Chinese, not English, and the Chinese model should not have the uncased suffix.
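For context (a minimal toy sketch, not the repo's code, with made-up vocabularies): a WordPiece vocabulary that lacks CJK entries maps every Chinese character to [UNK], so an English -uncased checkpoint will still train on cluener but the model sees almost no signal:

```python
# Toy character-level lookup against two stand-in vocabularies.
# Any character missing from the vocab becomes [UNK], which is what
# happens when Chinese text meets an English-only WordPiece vocab.
def char_tokenize(text, vocab):
    return [ch if ch in vocab else "[UNK]" for ch in text]

english_vocab = {"t", "h", "e"}           # stand-in for an uncased English vocab
chinese_vocab = {"中", "文", "模", "型"}  # stand-in for bert-base-chinese

print(char_tokenize("中文模型", english_vocab))  # ['[UNK]', '[UNK]', '[UNK]', '[UNK]']
print(char_tokenize("中文模型", chinese_vocab))  # ['中', '文', '模', '型']
```

With every input token collapsed to [UNK], low precision/recall and an early plateau are exactly what you would expect.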
Has anyone else hit this problem? Also, where do those tokenizer files need to be downloaded from?
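Note that in the log below, added_tokens.json, special_tokens_map.json, and tokenizer_config.json only produce "We won't load it" warnings; they are optional, and only vocab.txt (plus config.json and pytorch_model.bin) is strictly required for a local BERT directory. A quick sanity check might look like this (the file lists here reflect that assumption):

```python
import os

def check_model_dir(model_dir):
    """Return (missing_required, missing_optional) for a local BERT model dir."""
    required = ["config.json", "pytorch_model.bin", "vocab.txt"]
    optional = ["added_tokens.json", "special_tokens_map.json", "tokenizer_config.json"]
    missing_req = [f for f in required if not os.path.exists(os.path.join(model_dir, f))]
    missing_opt = [f for f in optional if not os.path.exists(os.path.join(model_dir, f))]
    return missing_req, missing_opt
```

If missing_required is empty, the loader should get past the tokenizer stage; missing optional files are fine.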
[root@bogon BERT-NER-Pytorch-master]# bash scripts/run_ner_crf.sh
12/01/2020 14:56:42 - WARNING - root - Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
12/01/2020 14:56:42 - INFO - models.transformers.configuration_utils - loading configuration file /home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base/config.json
12/01/2020 14:56:42 - INFO - models.transformers.configuration_utils - Model config {
  "attention_probs_dropout_prob": 0.1,
  "dropout": 0.1,
  "emb_size": 768,
  "feedforward_size": 3072,
  "finetuning_task": null,
  "heads_num": 12,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "layers_num": 12,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_labels": 34,
  "output_attentions": false,
  "output_hidden_states": false,
  "output_past": true,
  "pruned_heads": {},
  "torchscript": false,
  "type_vocab_size": 2,
  "use_bfloat16": false,
  "vocab_size": -1
}
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - Model name '/home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased). Assuming '/home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base' is a path or url to a directory containing tokenizer files.
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - Didn't find file /home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base/added_tokens.json. We won't load it.
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - Didn't find file /home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base/special_tokens_map.json. We won't load it.
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - Didn't find file /home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base/tokenizer_config.json. We won't load it.
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - loading file /home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base/vocab.txt
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - loading file None
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - loading file None
12/01/2020 14:56:42 - INFO - models.transformers.tokenization_utils - loading file None
12/01/2020 14:56:42 - INFO - models.transformers.modeling_utils - loading weights file /home/Smile_L/BERT-NER-Pytorch-master/prev_trained_model/bert-base/pytorch_model.bin
Traceback (most recent call last):
File "run_ner_crf.py", line 496, in
Running run_ner_span, training ended after just two epochs and the accuracy is very low?? Did I set the run parameters wrong?

07/25/2020 03:30:00 - INFO - root - Eval results
07/25/2020 03:30:00 - INFO - root - acc: 0.5564 - recall: 0.1478 - f1: 0.2335 - loss: 0.2043
07/25/2020 03:30:00 - INFO - root - Entity results
07/25/2020 03:30:00 - INFO - root - ***** address results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.5385 - recall: 0.0563 - f1: 0.1019
07/25/2020 03:30:00 - INFO - root - ***** book results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.6000 - recall: 0.0974 - f1: 0.1676
07/25/2020 03:30:00 - INFO - root - ***** company results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.4541 - recall: 0.2354 - f1: 0.3101
07/25/2020 03:30:00 - INFO - root - ***** game results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.6018 - recall: 0.6814 - f1: 0.6391
07/25/2020 03:30:00 - INFO - root - ***** government results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.5000 - recall: 0.0405 - f1: 0.0749
07/25/2020 03:30:00 - INFO - root - ***** movie results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.6875 - recall: 0.2185 - f1: 0.3317
07/25/2020 03:30:00 - INFO - root - ***** name results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.4324 - recall: 0.0688 - f1: 0.1187
07/25/2020 03:30:00 - INFO - root - ***** organization results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.6769 - recall: 0.1199 - f1: 0.2037
07/25/2020 03:30:00 - INFO - root - ***** position results *****
07/25/2020 03:30:00 - INFO - root - acc: 1.0000 - recall: 0.0139 - f1: 0.0273
07/25/2020 03:30:00 - INFO - root - ***** scene results *****
07/25/2020 03:30:00 - INFO - root - acc: 0.3333 - recall: 0.0144 - f1: 0.0275

Then it simply exited on its own.
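As a quick sanity check on the numbers above (assuming "acc" in this log is precision), each reported f1 is consistent with the usual harmonic mean, so the metrics themselves look correctly computed; the problem is the low recall, not the evaluation:

```python
# Harmonic mean of precision and recall; values taken from the log above.
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(f1(0.5564, 0.1478))  # close to the logged overall f1 of 0.2335
print(f1(0.6018, 0.6814))  # close to the logged "game" f1 of 0.6391
```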