Closed by tompollard 8 years ago
From @li-lcp
We have fine-tuned our de-identification (de-id) algorithm, previously described in Neamatullah et al., to the current dataset through an iterative process of manual review and development. In each iteration, the regular expression filters were calibrated and the look-up dictionaries were expanded until all known PHI identified in the review had been removed. Throughout this process we gave scrupulous attention to locating and removing all PHI, so that the remaining data can be considered de-identified. Nevertheless, because of the richness and detail of the database, the de-identified data set is released only to legitimate researchers under a data use agreement.
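To make the loop described above concrete, here is a minimal sketch of one review iteration: apply a dictionary look-up and a set of regular expression filters, then expand the dictionary when a reviewer finds PHI that was missed. This is not the algorithm from Neamatullah et al.; the patterns, the `NAME_DICTIONARY` contents, the `deidentify` function, and the `[**LABEL**]` placeholder format are all assumptions made for illustration.

```python
import re

# Hypothetical filters, for illustration only; a real system needs far more.
REGEX_FILTERS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Hypothetical look-up dictionary of known names (stored lowercase).
NAME_DICTIONARY = {"smith", "jones"}


def deidentify(text, names):
    """Mask dictionary names first, then apply each regex filter."""

    def mask_name(match):
        word = match.group(0)
        return "[**NAME**]" if word.lower() in names else word

    # Dictionary pass: check every alphabetic token against the name list.
    text = re.sub(r"[A-Za-z]+", mask_name, text)

    # Regex pass: replace each pattern match with a labelled placeholder.
    for label, pattern in REGEX_FILTERS.items():
        text = pattern.sub(f"[**{label}**]", text)
    return text


note = "Mr. Smith and Dr. Garcia met on 3/14/2012, call 617-555-0100."

# Iteration 1: "Garcia" is not in the dictionary, so it slips through.
first_pass = deidentify(note, NAME_DICTIONARY)

# Manual review finds the missed name; the dictionary is expanded and rerun.
expanded = NAME_DICTIONARY | {"garcia"}
second_pass = deidentify(note, expanded)
```

In this sketch `first_pass` still contains "Garcia", while `second_pass` does not, which mirrors the repeat-until-no-known-PHI-remains process in the statement. Names are masked before the regex filters so that the placeholder text is never re-scanned by the dictionary pass.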