Closed. Imipenem closed this issue 2 years ago.
Hi @Imipenem, unfortunately the public model we have for negation (i.e. Status) is not the best, as it was trained on a relatively small dataset. I've tested the same example on one of our in-hospital models and everything works, but we cannot make that model public as it contains confidential information.
I can only suggest that you train your own model for negation using MedCATtrainer, or wait until we publish one of the better models once we get permission for it (for now I'm not able to estimate when this could be).
Example of the output for your text with one of our internal models:
{'entities': {0: {'pretty_name': 'Diabetes mellitus (disorder)',
                  'cui': '73211009',
                  'type_ids': ['T-11'],
                  'types': ['disorder'],
                  'source_value': 'diabetes',
                  'detected_name': 'diabete',
                  'acc': 0.39457637369632725,
                  'context_similarity': 0.39457637369632725,
                  'start': 21,
                  'end': 29,
                  'icd10': [],
                  'ontologies': ['SNOMED'],
                  'snomed': [],
                  'id': 0,
                  'meta_anns': {'Presence': {'value': 'True',
                                             'confidence': 1.0,
                                             'name': 'Presence'},
                                'Time': {'value': 'Recent',
                                         'confidence': 0.9901728630065918,
                                         'name': 'Time'},
                                'Subject': {'value': 'Patient',
                                            'confidence': 0.973953902721405,
                                            'name': 'Subject'}}},
              1: {'pretty_name': 'Hypertensive disorder, systemic arterial (disorder)',
                  'cui': '38341003',
                  'type_ids': ['T-11'],
                  'types': ['disorder'],
                  'source_value': 'hypertension',
                  'detected_name': 'hypertension',
                  'acc': 0.5329114772595984,
                  'context_similarity': 0.5329114772595984,
                  'start': 38,
                  'end': 50,
                  'icd10': [],
                  'ontologies': ['SNOMED'],
                  'snomed': [],
                  'id': 1,
                  'meta_anns': {'Presence': {'value': 'False',
                                             'confidence': 1.0,
                                             'name': 'Presence'},
                                'Time': {'value': 'Recent',
                                         'confidence': 0.998069167137146,
                                         'name': 'Time'},
                                'Subject': {'value': 'Patient',
                                            'confidence': 0.9986799955368042,
                                            'name': 'Subject'}}},
              2: {'pretty_name': 'Psychotic disorder (disorder)',
                  'cui': '69322001',
                  'type_ids': ['T-11'],
                  'types': ['disorder'],
                  'source_value': 'psychosis',
                  'detected_name': 'psychosis',
                  'acc': 0.3700194746255875,
                  'context_similarity': 0.3700194746255875,
                  'start': 52,
                  'end': 61,
                  'icd10': [],
                  'ontologies': ['SNOMED'],
                  'snomed': [],
                  'id': 2,
                  'meta_anns': {'Presence': {'value': 'False',
                                             'confidence': 0.7612127065658569,
                                             'name': 'Presence'},
                                'Time': {'value': 'Recent',
                                         'confidence': 0.9930446147918701,
                                         'name': 'Time'},
                                'Subject': {'value': 'Patient',
                                            'confidence': 0.9984433650970459,
                                            'name': 'Subject'}}},
              3: {'pretty_name': 'Glaucoma (disorder)',
                  'cui': '23986001',
                  'type_ids': ['T-11'],
                  'types': ['disorder'],
                  'source_value': 'glaucoma',
                  'detected_name': 'glaucoma',
                  'acc': 0.7935367539525032,
                  'context_similarity': 0.7935367539525032,
                  'start': 66,
                  'end': 74,
                  'icd10': [],
                  'ontologies': ['SNOMED'],
                  'snomed': [],
                  'id': 3,
                  'meta_anns': {'Presence': {'value': 'False',
                                             'confidence': 0.9999976754188538,
                                             'name': 'Presence'},
                                'Time': {'value': 'Recent',
                                         'confidence': 0.884349524974823,
                                         'name': 'Time'},
                                'Subject': {'value': 'Patient',
                                            'confidence': 0.9915496706962585,
                                            'name': 'Subject'}}}},
 'tokens': []}
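For reading an output like this, a minimal sketch of splitting the detected entities by their meta annotation might look as follows. It works on a plain dict of the shape shown above, so it does not need MedCAT installed. The helper name `split_by_meta` and the trimmed `sample` dict are made up for illustration. The defaults (`Presence` / `'True'`) match the internal model used here; with the public model the task would presumably be `Status` with the value `Affirmed`, so treat those names as assumptions and adjust them to your model.

```python
from typing import Dict, List, Tuple


def split_by_meta(
    entities: Dict[int, dict],
    task: str = "Presence",          # assumption: meta task name of the internal model above
    positive_value: str = "True",    # assumption: value meaning "present / affirmed"
) -> Tuple[List[str], List[str]]:
    """Return (positive, other) lists of source_value strings.

    Entities that carry no annotation for `task` are skipped.
    """
    positive: List[str] = []
    other: List[str] = []
    for ent in entities.values():
        meta = ent.get("meta_anns", {}).get(task)
        if meta is None:
            continue  # this entity has no annotation for the requested task
        if meta["value"] == positive_value:
            positive.append(ent["source_value"])
        else:
            other.append(ent["source_value"])
    return positive, other


# Hypothetical sample: the output above, trimmed to the fields used here.
sample = {
    "entities": {
        0: {"source_value": "diabetes",
            "meta_anns": {"Presence": {"value": "True", "confidence": 1.0, "name": "Presence"}}},
        1: {"source_value": "hypertension",
            "meta_anns": {"Presence": {"value": "False", "confidence": 1.0, "name": "Presence"}}},
        2: {"source_value": "psychosis",
            "meta_anns": {"Presence": {"value": "False", "confidence": 0.7612127065658569, "name": "Presence"}}},
        3: {"source_value": "glaucoma",
            "meta_anns": {"Presence": {"value": "False", "confidence": 0.9999976754188538, "name": "Presence"}}},
    },
    "tokens": [],
}

present, absent = split_by_meta(sample["entities"])
print("present:", present)
print("absent:", absent)
```

With the internal-model output above, only diabetes ends up in the first list; hypertension, psychosis and glaucoma all land in the second.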
Thanks for your answer, guess this is the way to go then.
I've read in the docs that if one has access to UMLS or SNOMED CT, one can get access to the CDB and vocab for those.
Would this improve the results as well?
That will improve the results with respect to NER+L, because you will have all of UMLS/SNOMED, while the public NER+L models only cover a subset. But the Meta models (Status) will stay the same.
Thanks for the clarification.
Hey,
First of all, thanks for the great package.
I'm using medcat 1.2.8 and I noticed the following issue:
Example:
This results in:
As one can see, medcat correctly detects that there is a diabetes diagnosis but no hypertension diagnosis. However, the "denies" context seems to get lost or ignored in the enumeration after hypertension, so psychosis and glaucoma are labeled as "Affirmed", although they should also be "Other" (i.e. negated).
Is this a known issue? Are there any approaches to solving it?
Many thanks in advance ;)