NCATS-Gamma / omnicorp

MIT License

annotation false positive #16

Open balhoff opened 5 years ago

balhoff commented 5 years ago

See https://github.com/NCATS-Gamma/robokop/issues/390

gaurav commented 5 years ago

I've found the code in SciGraph that trims English stop words from the start and end of matches. "a" is indeed on Lucene's list of English stop words.

I'm not sure if there's an easier fix for this than adding a configuration setting to our fork of SciGraph. We could try subclassing EntityProcessorImpl and overriding the getAnnotations() method with our own version, but that seems unnecessarily complicated.
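
For illustration, here is a minimal sketch of the trimming behaviour described above and of what a configuration setting might look like. This is not SciGraph's actual code: the class name `StopWordTrimSketch`, the `trimStopWords` flag, and the "vitamin A" input are all hypothetical, and the stop word set is only a small subset of Lucene's default English list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordTrimSketch {
    // Illustrative subset of Lucene's default English stop words (assumption: not the full list).
    public static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    /**
     * Drops stop words from both ends of a token list; interior tokens are kept.
     * trimStopWords is a hypothetical setting: when false, the tokens are returned unchanged.
     */
    public static List<String> trim(List<String> tokens, boolean trimStopWords) {
        if (!trimStopWords) {
            return new ArrayList<>(tokens);
        }
        int start = 0;
        int end = tokens.size();
        // Advance past leading stop words.
        while (start < end && isStopWord(tokens.get(start))) {
            start++;
        }
        // Back up over trailing stop words.
        while (end > start && isStopWord(tokens.get(end - 1))) {
            end--;
        }
        return new ArrayList<>(tokens.subList(start, end));
    }

    private static boolean isStopWord(String token) {
        return STOP_WORDS.contains(token.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical match text, chosen only to show a trailing "A" being dropped.
        List<String> match = Arrays.asList("vitamin", "A");
        System.out.println(trim(match, true));   // prints [vitamin]
        System.out.println(trim(match, false));  // prints [vitamin, A]
    }
}
```

With trimming enabled, a trailing "A" is treated as a stop word and removed, so the remaining text can match a broader term. A flag like this would let us switch the trimming off without overriding getAnnotations().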