MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

MIMIC-IV Notes: Censoring of echo observations as PHI #1507

Open alexander-lacki opened 1 year ago

alexander-lacki commented 1 year ago

Description

Hi, thank you very much for providing and maintaining this database!

I would like to point out an issue that appears to be present only in MIMIC-IV and was not a problem in MIMIC-III. Our project works with the free-text clinical notes. In MIMIC-III, these notes included information about atrial dilation that could easily be string-matched. The notes were written in the form:

LEFT ATRIUM: Mild LA enlargement
LEFT ATRIUM: Moderate LA elongation

In MIMIC-IV, it appears that such information was flagged as PHI and replaced with three underscores in the discharge reports. At least, that is my understanding. What is now present in the notes is:

LEFT ATRIUM: Mild
LEFT ATRIUM: Moderate

Is this a known issue? Is there a possibility you may bring this information back?
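
For anyone who wants to check how many of their notes are affected, here is a minimal sketch of the kind of string matching described above. It assumes each LEFT ATRIUM finding sits on its own line and that the severity is one of Mild, Moderate, or Severe; the pattern and the function name `extract_la_findings` are made up for illustration and are not part of mimic-code.

```python
import re

# Hypothetical pattern: "LEFT ATRIUM:" followed by a severity word and
# whatever text remains on that line (the finding, if it survived).
LA_PATTERN = re.compile(
    r"LEFT ATRIUM:\s*(?P<severity>Mild|Moderate|Severe)\b[ \t]*(?P<finding>[^\n]*)",
    re.IGNORECASE,
)


def extract_la_findings(note_text):
    """Return (severity, finding) pairs; finding is None when it is missing."""
    results = []
    for match in LA_PATTERN.finditer(note_text):
        finding = match.group("finding").strip()
        # An empty remainder or a bare run of underscores counts as censored.
        if not finding or set(finding) <= {"_"}:
            finding = None
        results.append((match.group("severity").capitalize(), finding))
    return results


if __name__ == "__main__":
    old_style = "LEFT ATRIUM: Mild LA enlargement"
    new_style = "LEFT ATRIUM: Moderate ___"
    print(extract_la_findings(old_style))
    print(extract_la_findings(new_style))
```

Pairs with a `None` finding are the ones where the severity is still present but the finding itself appears to have been removed.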

alistairewj commented 1 year ago

We overhauled the deidentification process, so there will be changes, mostly for the better. Raising issues like these is useful, as our approach includes the ability to suppress false positives based on domain knowledge. So we could add a few context rules for the cases you've identified.

burgersmoke commented 1 year ago

Would it be helpful to report more instances that might be false positives? I've noticed that some chief complaints are missing crucial information, such as these:

CC: ___ pain
CC: ___, Wound Eval
CC: ___

alistairewj commented 1 year ago

Yes, and including the note_id will make it easier to add these to the set of test cases we have. Thank you!

EDIT: for chiefcomplaint, the stay_id would work since it's the primary key of that table
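
A rough sketch of how such a report could be assembled: collect the identifier and text of every row that contains the `___` placeholder, so each suspected false positive can be listed with its stay_id (or note_id for notes). The helper `find_placeholder_rows` and the sample rows are hypothetical; the stay_id values below are invented and do not come from the database.

```python
PLACEHOLDER = "___"


def find_placeholder_rows(rows, id_key, text_key):
    """Return (identifier, text) for rows whose text contains the placeholder."""
    hits = []
    for row in rows:
        text = row.get(text_key)
        # Skip rows with no text at all (e.g. NULL values loaded as None).
        if text and PLACEHOLDER in text:
            hits.append((row[id_key], text))
    return hits


if __name__ == "__main__":
    # Made-up rows standing in for a table keyed by stay_id.
    sample_rows = [
        {"stay_id": 1, "chiefcomplaint": "___ pain"},
        {"stay_id": 2, "chiefcomplaint": "Chest pain"},
        {"stay_id": 3, "chiefcomplaint": "___, Wound Eval"},
        {"stay_id": 4, "chiefcomplaint": None},
    ]
    for stay_id, text in find_placeholder_rows(sample_rows, "stay_id", "chiefcomplaint"):
        print(stay_id, text)
```

Not every hit is a false positive, since the placeholder also replaces genuine PHI, so the output is only a list of candidates to review by hand before reporting.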