Closed by ravishchawla 4 years ago
Thanks for asking!
You can probably adjust the bias value in the CRF for this. A reference (similar technique, different application) can be found at: https://arxiv.org/abs/1904.09331
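To make the idea concrete, here is a minimal sketch in plain Python. It is not this repo's API: `viterbi_decode`, `o_index`, and `o_bias` are hypothetical names, and the emission and transition scores are made-up toy values. It assumes the CRF exposes per-token tag scores and a transition matrix, and it adds a bias to the score of the "O" (no entity) tag before decoding. A negative bias makes entity tags relatively more attractive, so more spans should be extracted.

```python
def viterbi_decode(emissions, transitions, o_index=0, o_bias=0.0):
    """Return the highest-scoring tag sequence as a list of tag indices.

    emissions:   per-token score lists, shape [seq_len][num_tags]
    transitions: transitions[i][j] is the score of moving from tag i to tag j
    o_index:     index of the "O" tag (assumed to be 0 here)
    o_bias:      value added to the "O" score at every token; negative
                 values push the decoder towards entity tags
    """
    num_tags = len(transitions)
    # Apply the bias to a copy so the caller's scores are left untouched.
    biased = [
        [score + (o_bias if tag == o_index else 0.0) for tag, score in enumerate(row)]
        for row in emissions
    ]

    scores = list(biased[0])
    backpointers = []
    for row in biased[1:]:
        new_scores, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for arriving at tag j.
            best_prev = max(range(num_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + row[j])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)

    # Walk the backpointers from the best final tag to recover the path.
    best = max(range(num_tags), key=lambda t: scores[t])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Toy example with two tags: 0 = "O", 1 = "ENT".
emissions = [[2.0, 1.0], [1.5, 1.0], [2.0, 0.5]]
transitions = [[0.0, 0.0], [0.0, 0.0]]

print(viterbi_decode(emissions, transitions))                # no bias
print(viterbi_decode(emissions, transitions, o_bias=-0.75))  # "O" penalised
```

In a real model you would apply the same shift to the corresponding entry of the trained CRF's scores rather than to hand-written lists, and pick the bias value on a held-out set.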
PS: this repo is outdated. You can try the vanillaNER repo for developing new models.
I read through the paper and looked through the code in train_wc, as well as the arguments that can be passed during initialization. One issue I am facing is that, after fine-tuning the model on my own dataset, the number of keywords output varies significantly.
Some texts get no keywords even though they contain entities that should be found. Other texts get between 5 and 10 keywords. I am not trying to tune the maximum number of keywords, because I believe that filtering can be done in post-processing using the confidence scores.
I am interested in knowing if there is a way to tune the minimum number of keywords found, or lower the score threshold so more keywords are found in general.
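For illustration, the post-processing side of this could look roughly like the sketch below. It assumes the model can be made to return candidate keywords with confidence scores as `(keyword, score)` pairs; `select_keywords`, `threshold`, and `min_keywords` are hypothetical names, not part of the repo. It keeps everything at or above the threshold and, if that leaves too few, falls back to the top-scoring candidates.

```python
def select_keywords(candidates, threshold=0.5, min_keywords=0):
    """Filter (keyword, score) pairs by confidence, with a minimum count.

    Keeps every candidate whose score is at least `threshold`. If fewer than
    `min_keywords` survive, returns the `min_keywords` highest-scoring
    candidates instead (or all of them, if there are not enough).
    """
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    kept = [pair for pair in ranked if pair[1] >= threshold]
    if len(kept) < min_keywords:
        kept = ranked[:min_keywords]
    return kept


# Made-up candidates for a single text.
candidates = [("bias", 0.4), ("crf", 0.9), ("ner", 0.2)]
print(select_keywords(candidates, threshold=0.5, min_keywords=2))
```

This only helps if the decoder produces low-confidence candidates in the first place, which is the part I am asking about.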