LHNCBC / metamaplite

A near real-time named-entity recognizer
https://metamap.nlm.nih.gov/MetaMapLite.shtml
Other
58 stars 14 forks

Added configuration flag to control minimum token length in EntityLookup5 #33

Closed stevenbedrick closed 1 year ago

stevenbedrick commented 1 year ago

The findLongestMatchingString() method has a hard-coded limit: it will not consider strings shorter than 3 characters as possible matching entities. This certainly makes sense for many scenarios, but there are times when one might actually need to match an entity that is two characters long (for example, the abbreviation "CK" for "creatine kinase"), so I have added a configuration option to make this limit customizable.
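For readers who want the general idea without opening the diff, here is a minimal, self-contained sketch of a configurable minimum length. It is not the code from this pull request: the class name, the property key, and the helper methods are all hypothetical and chosen for illustration only. The only details taken from the description above are the default of 3 characters and the "CK" example.

```java
import java.util.Properties;

public class MinLengthSketch {
    // Hypothetical property key, for illustration only; not the actual
    // MetaMapLite configuration name.
    static final String MIN_LENGTH_KEY = "example.entitylookup.minimum.term.length";

    // Default mirrors the hard-coded behaviour described above:
    // strings shorter than 3 characters are not considered.
    static final int DEFAULT_MIN_LENGTH = 3;

    // Read the limit from the configuration, falling back to the default
    // when the key is absent.
    static int minLength(Properties props) {
        String fallback = Integer.toString(DEFAULT_MIN_LENGTH);
        return Integer.parseInt(props.getProperty(MIN_LENGTH_KEY, fallback));
    }

    // A term is a possible match only if it is at least minLength characters long.
    static boolean isCandidate(String term, int minLength) {
        return term.length() >= minLength;
    }

    public static void main(String[] args) {
        Properties props = new Properties();

        // With no override, the two-character abbreviation is skipped.
        System.out.println(isCandidate("CK", minLength(props)));

        // Lowering the limit to 2 lets it through.
        props.setProperty(MIN_LENGTH_KEY, "2");
        System.out.println(isCandidate("CK", minLength(props)));
    }
}
```

Keeping the default at 3 means existing configurations behave as before, and only users who explicitly lower the value will see two-character matches.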