nlpie / biomedicus

BioMedICUS: A biomedical and clinical NLP engine.
https://nlpie.github.io/biomedicus/
Apache License 2.0

Latest vocab lacks Novel 2019 Coronavirus terms & concepts #64

Open mikedemick opened 4 years ago

mikedemick commented 4 years ago

Describe the bug: The most current data download, biomedicus-3.0b4-umls-license-required-data, does not contain the Novel 2019 Coronavirus / Covid-19 terms that were added to UMLS Metathesaurus vocabularies, such as SNOMED and MSH, in early 2020. Without these updates the tool cannot support analysis of Covid-19-related EHR histories.

To Reproduce: Run extraction on documents containing references to Covid-19 / Wuhan Virus / Novel 2019 Coronavirus, etc. These and related terms are not detected with the available dataset.

Expected behavior: The dataset should be amended to include the newer terms summarized here: https://metamap.nlm.nih.gov/Covid19Terms.shtml.

Terminal Output: N/A

Environment: N/A

benknoll-umn commented 4 years ago

I'm updating our vocabularies to 2020AA and will have a release out this week.

benknoll-umn commented 4 years ago

New version got pushed on Wednesday, go ahead and give it a try.

mikedemick commented 4 years ago

Thanks, really appreciate it, especially the quick turnaround. I'm still doing some testing and noticing that some terms, such as 'wuhan virus', are found, whereas others, such as 'covid-19', don't seem to get a proper match. I'll continue to work with this version. Thanks again.

benknoll-umn commented 4 years ago

From looking at 2019 AA, it may not include the MSH terms listed in that file. I'll confirm, and if they aren't there I'll manually add them to the UMLS data before building the concepts dictionary.
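
For anyone who wants to check a UMLS release themselves before the dictionary is rebuilt, a rough sketch along these lines might help. It is not part of BioMedICUS and does not use its API; it only assumes the standard pipe-delimited MRCONSO.RRF layout (CUI in column 0, language in column 1, source abbreviation in column 11, term string in column 14). The function name `find_terms` and the sample rows are made up for illustration and are not real UMLS data.

```python
import io

# Column positions in MRCONSO.RRF (pipe-delimited UMLS concept names file).
CUI, LAT, SAB, STR = 0, 1, 11, 14


def find_terms(lines, terms, sources=None):
    """Map each search term to the set of (CUI, SAB, STR) rows mentioning it.

    Hypothetical helper: a case-insensitive substring match over English rows,
    optionally restricted to the given source vocabularies (e.g. {"MSH"}).
    """
    wanted = [t.lower() for t in terms]
    hits = {t: set() for t in wanted}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        # Skip malformed rows and non-English entries.
        if len(fields) <= STR or fields[LAT] != "ENG":
            continue
        if sources and fields[SAB] not in sources:
            continue
        text = fields[STR].lower()
        for term in wanted:
            if term in text:
                hits[term].add((fields[CUI], fields[SAB], fields[STR]))
    return hits


# Made-up rows in MRCONSO layout, for illustration only (not real UMLS data).
sample = io.StringIO(
    "C0000001|ENG|P|L1|PF|S1|Y|A1||||MSH|MH|D000001|Wuhan virus example|0|N||\n"
    "C0000002|ENG|P|L2|PF|S2|Y|A2||||SNOMEDCT_US|PT|123|Unrelated disorder|0|N||\n"
)
result = find_terms(sample, ["wuhan virus", "covid-19"])
missing = sorted(term for term, rows in result.items() if not rows)
print(missing)
```

To run it against a real release, replace `sample` with an open file handle on your local MRCONSO.RRF (the path depends on your UMLS install). Any term that ends up in `missing` has no English string in that release, which would explain why it never gets a match.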