taishan1994 / pytorch_chinese_biaffine_ner

Chinese named entity recognition using biaffine

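Scores are extremely low when I train on my own data and almost nothing is recognized. Why is that? Is it because I changed the maximum length to 512? #1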

Open wangyuguo1 opened 1 day ago

wangyuguo1 commented 1 day ago

【train】Epoch: 10/10 Step: 668/670 loss: 0.00103
【train】Epoch: 10/10 Step: 669/670 loss: 0.00121
【train】Epoch: 10/10 Step: 670/670 loss: 0.00136
[eval] precision=0.0341 recall=0.5020 f1_score=0.0639

                                   precision    recall  f1-score   support

风险类别 (risk category)                0.02      0.19      0.03       293
污染来源 (pollution source)             0.01      0.40      0.02       241
分析方法 (analysis method)              0.02      0.38      0.03       480
区域土壤类型 (regional soil type)       0.01      0.33      0.02       414
研究区域 (study area)                   0.08      0.50      0.14       225
污染物 (pollutant)                      0.04      0.59      0.08      2010

micro-f1                                0.03      0.50      0.06      3663
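The aggregate numbers in the [eval] line can be read as a diagnostic. A minimal sketch, assuming the reported precision and recall are micro-averaged over the 3663 gold entities (the function name `f1` and the variable names are illustrative, not taken from the repository): it re-derives the F1 score and estimates how many spans the model must have predicted to produce that precision.

```python
# Aggregate metrics copied from the [eval] line of the log.
precision = 0.0341
recall = 0.5020
support = 3663  # number of gold entities (the "support" column total)


def f1(p, r):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Micro-averaged recall = correct / gold, so correct = recall * gold.
true_positives = recall * support
# Micro-averaged precision = correct / predicted, so predicted = correct / precision.
predicted = true_positives / precision

print(f"f1               = {f1(precision, recall):.4f}")
print(f"correct spans    = {true_positives:.0f} (approx.)")
print(f"predicted spans  = {predicted:.0f} (approx.)")
print(f"predicted / gold = {predicted / support:.1f}")
```

Under that assumption the model emits roughly fifteen predicted spans for every gold entity, so the low F1 comes from heavy over-prediction rather than from finding nothing. Because the inputs are rounded to four digits, the counts are estimates only.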

import os

if __name__ == '__main__':
    class Args:
        data_name = "tur"
        data_dir = 'D:/NER/pytorch_chinese_biaffine_ner-main/pytorch_chinese_biaffine_ner-main/data/{}/'.format(data_name)

        train_path = os.path.join(data_dir, "train.json")
        dev_path = os.path.join(data_dir, "dev.json")
        test_path = os.path.join(data_dir, "test.json")
        bert_dir = "D:/NER/pytorch_chinese_biaffine_ner-main/pytorch_chinese_biaffine_ner-main/model_hub/chinese-bert-wwm-ext"
        save_dir = "D:/NER/pytorch_chinese_biaffine_ner-main/pytorch_chinese_biaffine_ner-main/checkpoints/{}/model.pt".format(data_name)
        ffnn_size = 256
        max_seq_len = 512
        train_epoch = 10
        train_batch_size = 12
        eval_batch_size = 12
        eval_step = 100
        lr = 3e-5
        other_lr = 2e-3
        adam_epsilon = 1e-8
        warmup_proportion = 0.1
        max_grad_norm = 1
        weight_decay = 0.01
        num_cls = 7  # 9
        bias = True

taishan1994 commented 1 day ago

How much data do you have?

wangyuguo1 commented 1 day ago

Not much, about 1,000 samples, but the other models I have tried recognize entities reasonably well, with F1 mostly in the 80s.

taishan1994 commented 22 hours ago

Models of this kind probably need somewhat more data and longer training.
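On the original question about max_seq_len = 512: in the general biaffine formulation, every (start, end) token pair receives a score per class, so the score grid grows quadratically with the padded length. The thread does not show how this repository masks padded positions or start > end pairs during loss computation and decoding, so the following is only a sketch of how one might count valid candidate spans. The helper `span_mask` and the sentence length `real_tokens = 60` are hypothetical and not part of the repository.

```python
import numpy as np


def span_mask(attention_mask):
    """Boolean [seq_len, seq_len] matrix, True where (start, end) is a valid span.

    A span is treated as valid when both tokens are real (not padding)
    and start <= end.
    """
    real = np.asarray(attention_mask, dtype=bool)
    both_real = real[:, None] & real[None, :]  # start and end are both real tokens
    return np.triu(both_real)                  # keep only start <= end


max_seq_len = 512   # value from the posted Args
num_cls = 7         # value from the posted Args
real_tokens = 60    # hypothetical sentence length, for illustration only

attention_mask = [1] * real_tokens + [0] * (max_seq_len - real_tokens)
mask = span_mask(attention_mask)

scored_cells = max_seq_len * max_seq_len   # cells in the padded score grid
valid_spans = int(mask.sum())              # real_tokens * (real_tokens + 1) / 2

print(f"grid cells per sentence : {scored_cells}")
print(f"scores per sentence     : {scored_cells * num_cls}")
print(f"valid candidate spans   : {valid_spans}")
print(f"share of grid that is valid: {valid_spans / scored_cells:.2%}")
```

If cells outside such a mask were included in decoding, they could surface as spurious predictions. That would be consistent with the very low precision reported above, but it is not confirmed by anything in the thread. Comparing the number of decoded spans inside and outside the mask on a few dev sentences would be one way to check.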