bgilbert / anonymize-slide

Delete the label from a whole-slide image
GNU General Public License v2.0

Remove Metadata such as scan date, image ID #2

Open jetic83 opened 6 years ago

jetic83 commented 6 years ago

Is there a way to delete certain fields in the scans, such as comments, scan date, or filename? This metadata can be considered PHI, or at least non-anonymized data.

a-dev-walker commented 4 years ago

Has anyone followed up on this issue? It would make anonymize-slide much more useful. As of now, the PHI remaining in the metadata is proving to be a problem.

markemus commented 4 years ago

It can be done. Examples from a PR:

fh.directories[1].entries[XMLPACKET].overwrite_entry(our_xmp)
fh.directories[1].entries[IMAGE_DESCRIPTION].overwrite_entry(our_image_desc)

These overwrite the XMLPACKET and IMAGE_DESCRIPTION tags on directory #1. To "delete" a field, overwrite it with a safe value. Make sure the replacement is exactly the same length as the original; if it is shorter, the tail of the original data may still be present in the file.
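To illustrate the same-length point, here is a minimal sketch of padding a replacement value and writing it over the original bytes in place. It does not use anonymize-slide's classes or parse TIFF at all: `same_length_replacement` and `overwrite_in_place` are hypothetical helper names, and the offset and length are assumed to be already known (in a real file they would come from the tag's directory entry).

```python
import io


def same_length_replacement(original: bytes, safe: bytes, pad: bytes = b" ") -> bytes:
    """Return `safe` padded with `pad` so it is exactly len(original) bytes."""
    if len(pad) != 1:
        raise ValueError("pad must be a single byte")
    if len(safe) > len(original):
        raise ValueError("replacement is longer than the original value")
    return safe + pad * (len(original) - len(safe))


def overwrite_in_place(fh, offset: int, length: int, safe: bytes) -> None:
    """Overwrite `length` bytes at `offset` in an open binary file object."""
    fh.seek(offset)
    original = fh.read(length)
    if len(original) != length:
        raise ValueError("could not read the full original value")
    fh.seek(offset)
    fh.write(same_length_replacement(original, safe))


# Demo on an in-memory buffer standing in for a slide file (made-up contents).
header = b"HEADER"
secret = b"Date=2017-03-01|Filename=patient_42"
trailer = b"TRAILER"
buf = io.BytesIO(header + secret + trailer)

overwrite_in_place(buf, offset=len(header), length=len(secret), safe=b"anonymized")
result = buf.getvalue()
```

Because the replacement is padded to the original length, the surrounding bytes keep their offsets and no fragment of the old value survives past the end of the new one.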

Tomatenbiss commented 1 year ago

@jetic83, @a-dev-walker: Within the EMPAIA project, we have now developed our own solution for anonymizing WSIs (in various formats), including all the sensitive metadata. It is currently available on GitLab. The accompanying paper is in review; the preprint can already be viewed on arXiv.