Details of deidentification process

MIT-LCP / mimic-iii-paper

Repository for the paper describing MIMIC-III

From @li-lcp

We fine-tuned our de-identification algorithm, previously described in Neamatullah et al., for the current dataset through an iterative process of manual review and development. In each iteration, the regular expression filters were calibrated and the look-up dictionaries were expanded until every instance of protected health information (PHI) identified during review was removed. Through this process we have given scrupulous attention to locating and removing all PHI, so that the remaining data can be considered de-identified. Nevertheless, because of the richness and detail of the database, the de-identified dataset is released only to legitimate researchers under a data use agreement.
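To make the review loop concrete, here is a minimal Python sketch of a scrubber that combines regular expression filters with a look-up dictionary, plus a check for PHI that survived scrubbing. It is an illustration only and not the actual MIMIC-III de-identification code: the patterns, the `[**LABEL**]` placeholder format, the dictionary contents, and the function names `scrub` and `leftover_phi` are all assumptions made for this example.

```python
import re

# Hypothetical filters for illustration; the real filters are far more extensive.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

# Hypothetical look-up dictionary of known names, stored in lowercase.
NAME_DICTIONARY = {"smith", "jones"}


def scrub(text, names=NAME_DICTIONARY, patterns=PATTERNS):
    """Replace dictionary names and regex matches with placeholder tags."""

    def replace_name(match):
        word = match.group(0)
        return "[**NAME**]" if word.lower() in names else word

    # Dictionary pass first, so placeholder labels are never re-examined as words.
    text = re.sub(r"[A-Za-z]+", replace_name, text)
    # Then apply each regular expression filter in turn.
    for label, pattern in patterns.items():
        text = pattern.sub(f"[**{label}**]", text)
    return text


def leftover_phi(scrubbed, known_phi):
    """Return the known PHI strings (e.g. from manual review) still present."""
    return [phi for phi in known_phi if phi in scrubbed]
```

One iteration might then look like this: a reviewer flags `John` as missed, the dictionary is expanded, and the note is scrubbed again until `leftover_phi` returns an empty list.

```python
note = "Pt John Smith seen 3/14/2012, call 617-555-0123."
reviewed = ["John", "Smith"]  # PHI identified by a (hypothetical) manual review

names = set(NAME_DICTIONARY)
missed = leftover_phi(scrub(note, names), reviewed)  # "John" is not in the dictionary yet
names.update(word.lower() for word in missed)        # expand the look-up dictionary
clean = scrub(note, names)
```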

Details of deidentification process #16