curiosity-ai / catalyst

🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
MIT License

Incorrect Tagging by AveragePerceptronEntityRecognizer #109

Open jlcanale opened 6 months ago

jlcanale commented 6 months ago

The AveragePerceptronEntityRecognizer sometimes tags words that it shouldn't.

For example in the text:

"In Pennsylvania, divorce proceedings usually start with a domestic relations master who initially hears the case. Should either party find the master's decision unsatisfactory, they have the right to challenge this decision by appealing to a judge for further review."

"Should", the first word of the second sentence, is being recognized as a Person. Is there a way to get a confidence score for each recognized entity, so that we can filter out false positives like this after the document has been processed?