po-oya closed this issue 4 years ago
I guess your pretrained embedding file is in the wrong format. The log reports a single pretrained word and an OOV rate of almost 100%:
pretrain word:1, prefect match:0, case_match:0, oov:18204, oov%:0.9999450700357044
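A quick way to test this guess is to inspect the embedding file itself. The sketch below is only an illustration under stated assumptions: it assumes a plain-text embedding file with one word per line followed by its vector, and a loader that takes the vector size from the first non-empty line and skips every line that disagrees with it. Under those assumptions, a word2vec-style header line such as `<count> <dim>` would leave exactly one usable entry, which would be consistent with `pretrain word:1`. The function names and sample lines are hypothetical and are not part of NCRF++.

```python
def inspect_embedding_lines(lines):
    """Return (dim_from_first_line, usable_count, total_count).

    Assumption: the loader infers the vector size from the first
    non-empty line and ignores lines with a different size.
    """
    dim = None
    usable = 0
    total = 0
    for raw in lines:
        tokens = raw.strip().split()
        if not tokens:
            continue  # ignore blank lines
        total += 1
        if dim is None:
            dim = len(tokens) - 1  # first token is the word itself
        if len(tokens) - 1 == dim:
            usable += 1
    return dim, usable, total


def looks_like_header(first_line):
    """True if the line is two integers, e.g. a '<count> <dim>' header."""
    tokens = first_line.strip().split()
    return len(tokens) == 2 and all(t.isdigit() for t in tokens)


# Hypothetical sample: a header line followed by three 4-dimensional vectors.
sample = [
    "3 4",
    "the 0.1 0.2 0.3 0.4",
    "cat 0.5 0.6 0.7 0.8",
    "dog 0.9 1.0 1.1 1.2",
]
print(looks_like_header(sample[0]))         # True
print(inspect_embedding_lines(sample))      # (1, 1, 4)
print(inspect_embedding_lines(sample[1:]))  # (4, 3, 3)
```

If the first line of your file looks like such a header, removing it and reloading is a cheap experiment. If it does not, check the separator and the encoding instead.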
Your training loss is also too large, which makes training unstable and prevents it from converging to good parameters. If this is not caused by the first problem, you need to tune your hyperparameters or try different embeddings.
Hi, this seems to have been reported by other developers in https://github.com/jiesutd/NCRFpp/issues/100, https://github.com/jiesutd/NCRFpp/issues/82, https://github.com/jiesutd/NCRFpp/issues/80, https://github.com/jiesutd/NCRFpp/issues/60 and https://github.com/jiesutd/NCRFpp/issues/22. I have checked all of those explanations.
I am trying to build an NER system with this Persian corpus. I used train_fold1.txt as the training set, test_fold1.txt as the dev set, and test_fold2.txt as the test set. The corpus is in IOB format by default. I used tagSchemeConverter.py for conversion to BIO format. The corpus includes various labels, and there are some B-X tags among them too.
The problem is that the model cannot predict the correct labels of entities; everything seems to be predicted as O. Also, there are lots of tokens with O tags, but removing many of the samples in which all tokens are labelled as O didn't help. Below are my log files, and these are the training logs for some epochs ...
This is my configuration:
In some epochs a very small positive F1 score appeared. I thought that using different configurations might solve the problem, but none of them worked. It would be a great help if you could share your ideas. Thanks
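Since the tag conversion step is one place where labels can silently go wrong, here is a minimal sketch of what an IOB to BIO conversion is expected to do, so the converted files can be checked by eye. It assumes the common definitions: in IOB, an entity may start with an I- tag, and B- appears only where two entities of the same type are adjacent; in BIO, every entity starts with B-. This is not the code from tagSchemeConverter.py, and the function name and the PER/LOC/ORG labels are illustrative only.

```python
def iob_to_bio(tags):
    """Convert the tags of one sentence from IOB to BIO.

    An I-X tag is rewritten to B-X when it opens a new entity, that is,
    when the previous tag is O or belongs to a different entity type.
    """
    converted = []
    prev = "O"  # treat the sentence start like an O tag
    for tag in tags:
        new_tag = tag
        if tag.startswith("I-"):
            label = tag[2:]
            if prev == "O" or prev[2:] != label:
                new_tag = "B-" + label
        converted.append(new_tag)
        prev = tag  # compare against the original tag sequence
    return converted


# Hypothetical sentence with illustrative labels.
iob = ["I-PER", "I-PER", "O", "I-LOC", "B-LOC", "I-ORG"]
print(iob_to_bio(iob))
# ['B-PER', 'I-PER', 'O', 'B-LOC', 'B-LOC', 'B-ORG']
```

Running a few sentences from the converted training file through a check like this, and confirming that no entity starts with an I- tag, would rule out the conversion as a cause before looking further at hyperparameters.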