Lotte-W / Digital-Preservation-Headaches

Digital Preservation Headaches
Preparing for Access: Detecting Sensitive Info in Born-digital #1

Open Lotte-W opened 2 years ago

Lotte-W commented 2 years ago

How to automatically detect sensitive (private) information in born-digital documents/spreadsheets/email/.../metadata?

Asbjoedt commented 2 years ago

I don't think this can be done automatically, except for national ID numbers, which typically follow a fixed format. We currently don't use any tool for this, though.
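
As a rough illustration of the one case that might be automated, here is a minimal sketch of pattern-based detection. It assumes, purely as an example, a Danish CPR-style number (six digits for DDMMYY, an optional hyphen, then four digits); the thread does not name a specific scheme, and the function name `find_id_candidates` is hypothetical. Matches are only candidates and would still need manual review.

```python
import re

# ASSUMPTION: a Danish CPR-style layout, DDMMYY-SSSS with an optional hyphen.
# The lookarounds stop the pattern from matching inside longer digit runs.
ID_PATTERN = re.compile(r"(?<!\d)(\d{6})-?(\d{4})(?!\d)")


def find_id_candidates(text: str) -> list[str]:
    """Return substrings that look like ID numbers with a plausible date part."""
    candidates = []
    for match in ID_PATTERN.finditer(text):
        date_part = match.group(1)
        day = int(date_part[0:2])
        month = int(date_part[2:4])
        # Cheap plausibility check: discard digit runs whose first four
        # digits cannot be a day and a month.
        if 1 <= day <= 31 and 1 <= month <= 12:
            candidates.append(match.group(0))
    return candidates


if __name__ == "__main__":
    sample = "Applicant 010190-1234 called; case ref 9913901234."
    print(find_id_candidates(sample))
```

The date check is deliberately loose (it does not reject 31 February, for instance), so it reduces false positives without guaranteeing a real ID number.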

It is very much a manual, and vital, process that serves two purposes:

  1. Protect sensitive information
  2. Allow information to be shared if it is not sensitive

It is difficult to administer restrictions and run workflows on individual files for this purpose. Consider administering at the data package level instead, even though that gives less detail. This of course means that some non-sensitive data will be restricted because it sits in a package that also contains sensitive data, but it makes the manual workflow easier to manage.
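
To make the package-level idea concrete, here is a minimal sketch of the rule that a package is restricted as soon as any file in it has been marked sensitive. The class and field names (`FileRecord`, `DataPackage`, `sensitive`, `restricted`) are hypothetical and only for illustration; the `sensitive` flag is assumed to come from manual review, not from a tool.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    path: str
    sensitive: bool  # ASSUMPTION: set by a human reviewer


@dataclass
class DataPackage:
    name: str
    files: list[FileRecord] = field(default_factory=list)

    @property
    def restricted(self) -> bool:
        # A single sensitive file restricts the whole package,
        # including any non-sensitive files it contains.
        return any(f.sensitive for f in self.files)


if __name__ == "__main__":
    package = DataPackage(
        name="case-files",
        files=[
            FileRecord("minutes.docx", sensitive=False),
            FileRecord("applicants.xlsx", sensitive=True),
        ],
    )
    print(package.name, "restricted:", package.restricted)
```

The trade-off described above shows up directly: `minutes.docx` is not sensitive, yet it is withheld along with the rest of the package.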