opendata / Open-Data-Needs

An ongoing effort to catalog the holes in the open data ecosystem. [RETIRED]

Facilitate automated redaction of personally identifying information #12

Open waldoj opened 10 years ago

waldoj commented 10 years ago

A great many datasets either contain or may contain personally identifying information (PII). There are some commercial tools for finding and auto-redacting this information, but they're all bolt-ons: they don't fit into automated workflows for publishing open data.

We need solutions to this problem that can be slotted into the data pipeline. Some can be fully automated and some will require human review.
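
To make the idea concrete, here is a minimal sketch in Python of what a slot-in pipeline step might look like. Everything in it is an assumption for illustration, not an existing tool: the function names `redact_cell` and `redact_csv`, the replacement tokens, and the three regular expressions, which cover only email addresses and US Social Security numbers. Unambiguous matches are redacted automatically, while ambiguous ones (bare nine-digit numbers) are left alone and their rows are flagged for human review.

```python
import csv
import io
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# A bare nine-digit run might be an SSN or something else entirely,
# so it is flagged for a person to check rather than redacted.
MAYBE_SSN = re.compile(r"\b\d{9}\b")


def redact_cell(value):
    """Return (redacted_value, needs_review) for one cell (hypothetical helper)."""
    value = EMAIL.sub("[REDACTED EMAIL]", value)
    value = SSN.sub("[REDACTED SSN]", value)
    return value, bool(MAYBE_SSN.search(value))


def redact_csv(src, dst):
    """Copy CSV rows from src to dst, redacting clear matches.

    Returns the 1-based row numbers that need human review.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    flagged = []
    for row_number, row in enumerate(reader, start=1):
        results = [redact_cell(cell) for cell in row]
        writer.writerow([text for text, _ in results])
        if any(review for _, review in results):
            flagged.append(row_number)
    return flagged


# Small in-memory demonstration with made-up data.
source = io.StringIO(
    "name,contact,id\n"
    "Ann,ann@example.com,123-45-6789\n"
    "Bob,none,123456789\n"
)
out = io.StringIO()
flagged = redact_csv(source, out)
print(out.getvalue(), end="")
print("rows needing review:", flagged)
```

Because `redact_csv` reads from and writes to any file-like objects, a step like this could sit between an export job and a publishing job, with the flagged row numbers routed to a review queue instead of blocking the whole dataset.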