Georgetown-IR-Lab / QuickUMLS

System for Medical Concept Extraction and Linking
MIT License

Fix empty final ngram #31

Closed burgersmoke closed 5 years ago

burgersmoke commented 5 years ago

When running with ignore_syntax=True, I was often getting division-by-zero errors. Tracking this back, I found that the method that breaks the text up into ngrams was not skipping tokens under the minimum length, as is done in _make_ngrams. Adding that check prevents the division-by-zero error.
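To illustrate the kind of check described above, here is a minimal standalone sketch. It is not the QuickUMLS code: the names `token_windows` and `coverage`, the `MIN_TOKEN_LENGTH` value, and the scoring formula are all hypothetical, chosen only to show how an empty trailing token can produce an empty ngram and how a length check on the boundary tokens avoids it.

```python
# Hypothetical sketch; names and the threshold are assumptions, not QuickUMLS internals.
MIN_TOKEN_LENGTH = 3  # assumed minimum token length


def token_windows(tokens, window, min_length=MIN_TOKEN_LENGTH):
    """Return ngrams of up to `window` tokens, skipping any ngram whose
    first or last token is shorter than `min_length`."""
    ngrams = []
    for start in range(len(tokens)):
        if len(tokens[start]) < min_length:
            continue  # do not start an ngram on a too-short (or empty) token
        last_end = min(start + window, len(tokens))
        for end in range(start + 1, last_end + 1):
            if len(tokens[end - 1]) < min_length:
                continue  # do not end an ngram on a too-short (or empty) token
            ngrams.append(" ".join(tokens[start:end]))
    return ngrams


def coverage(ngram, term):
    """Toy score: fraction of the ngram's distinct characters found in `term`.
    Raises ZeroDivisionError when `ngram` is empty."""
    ngram_chars = set(ngram)
    return len(ngram_chars & set(term)) / len(ngram_chars)


# An empty final token, e.g. left over from trailing whitespace.
tokens = ["chest", "pain", ""]

unfiltered = token_windows(tokens, window=2, min_length=0)
filtered = token_windows(tokens, window=2)
```

With `min_length=0` the empty final token yields an empty ngram, and passing it to `coverage` raises `ZeroDivisionError`. With the length check in place, only `chest`, `chest pain`, and `pain` are produced, so the score is never computed over an empty string.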

soldni commented 5 years ago

I would prefer that any example usage be kept in the documentation (i.e., the README.md file) rather than in Python files. Could you amend this pull request to remove sample.py? Thanks!

burgersmoke commented 5 years ago

Absolutely. Good catch on that. I'll amend this.