MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

Disease names are anonymised #1548

Open JoakimEdin opened 1 year ago

JoakimEdin commented 1 year ago

Description

I believe this is a difficult error to fix, as it stems from the algorithm used to de-identify the clinical notes. I've seen multiple examples of disease names being anonymized because they contain a person's name, for example Lewy body dementia and Creutzfeldt-Jakob disease. This adds noise for automated medical coding, a task many people use the data for. Would it be possible to fix this issue?

Example:

hadm_id: 23052089 from MIMIC-IV

ACUTE/ACTIVE ISSUES:

disease Body Dementia

Visual Hallucinations

The patient appears to have acute on chronic progression of his ___ disease. Unclear if this is disease progression or

alistairewj commented 1 year ago

I do have some ideas for how to improve this. Nothing is immediately available, but it's good to have reports so we can build up our set of test cases, thanks. I didn't expect Lewy Body Dementia to be de-identified, but it does make sense in hindsight!
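
For anyone who wants to contribute more test cases, here is a minimal sketch of how candidate over-redactions might be flagged in note text. It assumes the redaction placeholder is three underscores, as in the example above. The function name `find_redacted_eponyms` and the `EPONYM_TAILS` list are hypothetical and are not part of the MIMIC de-identification pipeline. Hits are only candidates: a placeholder before "disease" may also be a correctly redacted name, so each one needs manual review.

```python
import re

# Hypothetical list of word sequences that commonly follow an eponym in a
# disease name. Illustrative only, not taken from the MIMIC tooling.
EPONYM_TAILS = ["body dementia", "disease", "syndrome"]

# Assumption: redacted spans appear as "___", optionally hyphenated
# ("___-___") for two-name eponyms such as Creutzfeldt-Jakob.
_PATTERN = re.compile(
    r"_{3}(?:\s*-\s*_{3})?\s+(?:"
    + "|".join(re.escape(tail) for tail in EPONYM_TAILS)
    + r")\b",
    re.IGNORECASE,
)


def find_redacted_eponyms(note_text):
    """Return the matched text of each placeholder followed by a disease-like tail."""
    return [match.group(0) for match in _PATTERN.finditer(note_text)]


if __name__ == "__main__":
    sample = "acute on chronic progression of his ___ disease. Unclear"
    for hit in find_redacted_eponyms(sample):
        print(hit)
```

Running this over the notes for a set of `hadm_id` values would give a list of snippets that could be reported here alongside the admission identifier.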