Facilitate automated redaction of personally identifying information

Many datasets contain, or may contain, personally identifying information (PII). Some commercial software can find and auto-redact this information, but these products are all bolt-ons: they don't fit into automated workflows for publishing open data.

We need solutions to this problem that can be slotted directly into the data pipeline. Some can be fully automated; others will require human review.
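
As a rough illustration of what a pipeline-friendly step might look like, here is a minimal Python sketch. It is not a proposed implementation: the pattern sets, the `RedactionResult` type, and the `redact_value` / `redact_rows` functions are all hypothetical names made up for this example, and the regular expressions cover only a few US-style formats. It assumes high-confidence matches can be redacted automatically, while ambiguous ones are only flagged so a person can review them.

```python
import re
from dataclasses import dataclass, field

# Hypothetical high-confidence patterns: matches are redacted automatically.
# Real PII detection would need far broader coverage than this.
AUTO_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical ambiguous patterns: matches are left in place and flagged
# for human review instead of being redacted.
REVIEW_PATTERNS = {
    "POSSIBLE_PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


@dataclass
class RedactionResult:
    text: str                                         # value after automatic redaction
    needs_review: list = field(default_factory=list)  # labels a human should check


def redact_value(value: str) -> RedactionResult:
    """Redact high-confidence PII and flag ambiguous matches."""
    for label, pattern in AUTO_PATTERNS.items():
        value = pattern.sub(f"[REDACTED {label}]", value)
    flags = [label for label, pattern in REVIEW_PATTERNS.items()
             if pattern.search(value)]
    return RedactionResult(value, flags)


def redact_rows(rows):
    """Yield (redacted_row, flags) for each dict-like row in a dataset.

    Values are converted to strings before scanning. `flags` maps a
    column name to the review labels raised for that column.
    """
    for row in rows:
        clean, flags = {}, {}
        for key, val in row.items():
            result = redact_value(str(val))
            clean[key] = result.text
            if result.needs_review:
                flags[key] = result.needs_review
        yield clean, flags
```

In a pipeline, rows that come back with empty `flags` could continue to publication, while rows with flags could be held in a review queue. Where to draw the line between the two pattern sets is a policy decision this sketch does not settle.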

