bgilbert / anonymize-slide

Delete the label from a whole-slide image
GNU General Public License v2.0
57 stars 45 forks

Also removes the macro image from Aperio slides #1

Open koltenpearson opened 6 years ago

koltenpearson commented 6 years ago

At times the macro image can include a snippet of the label.

jetic83 commented 6 years ago

Thank you, @koltenpearson, for that commit. I was also looking for a way to delete the macro, since it sometimes contains scanned parts of PHI.

But while the label is just a white image after deletion, the macro image is a weird, distorted black-and-white image (example attached). Do you or @bgilbert know why?

The deletion of the label uses an expected_prefix (LZW_CLEARCODE), but the deletion of the macro passes None as expected_prefix. Maybe that is the reason?

image
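
For readers following the expected_prefix question, here is a minimal sketch of how such a guard might behave. It is not the script's actual code: `zero_strip` is a hypothetical helper, and the value of `LZW_CLEARCODE` is an assumption based on TIFF LZW streams starting with the 9-bit clear code 256 (first byte 0x80). If the macro is stored with a different compression than the label (for example JPEG, whose data starts with FF D8), a check like this would reject it, which may be why the prefix had to be dropped.

```python
import io

# Assumption: a TIFF LZW strip begins with the 9-bit clear code 256,
# so its first byte is 0x80. The name mirrors the one used in this thread.
LZW_CLEARCODE = b"\x80"


def zero_strip(fh, offset, length, expected_prefix=None):
    """Hypothetical helper: overwrite one strip with zeros.

    If expected_prefix is given, refuse to write unless the strip
    starts with exactly those bytes.
    """
    if expected_prefix is not None:
        fh.seek(offset)
        prefix = fh.read(len(expected_prefix))
        if prefix != expected_prefix:
            raise IOError("Unexpected data in image strip")
    fh.seek(offset)
    fh.write(b"\0" * length)


# A strip that starts like LZW data passes the guard and is zeroed.
lzw_file = io.BytesIO(b"HEAD" + b"\x80\x01\x02\x03" + b"TAIL")
zero_strip(lzw_file, offset=4, length=4, expected_prefix=LZW_CLEARCODE)

# A strip that starts like JPEG data (FF D8) is rejected and left untouched.
jpeg_file = io.BytesIO(b"HEAD" + b"\xff\xd8\xff\xe0" + b"TAIL")
try:
    zero_strip(jpeg_file, offset=4, length=4, expected_prefix=LZW_CLEARCODE)
    guard_rejected_jpeg = False
except IOError:
    guard_rejected_jpeg = True
```

With `expected_prefix=None` the guard is skipped entirely, so the bytes are overwritten whatever they contain. That removes a safety check but does not by itself explain a distorted result.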

koltenpearson commented 6 years ago

I tried leaving in the LZW_CLEARCODE at first, but it did not work. It seemed like expected_prefix was something to check for before performing the delete. Unfortunately, I do not have a complete understanding of how the file format works, so maybe @bgilbert could give a better explanation of what is going on. After running the script, I was unable to see either the label or the macro in the associated_images map using OpenSlide (Python bindings), so I assumed they were gone. The image you show makes me worry that something might be recoverable after all.
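
The check described above could be sketched roughly as follows. The helper names `remaining_sensitive_images` and `check_slide` are made up for illustration, and the associated image names "label" and "macro" are assumed to be the ones OpenSlide reports for Aperio slides. `check_slide` needs the third-party openslide-python package; the first helper works on any mapping.

```python
def remaining_sensitive_images(associated_images, names=("label", "macro")):
    """Return which of the given associated-image names are still listed."""
    return [name for name in names if name in associated_images]


def check_slide(path):
    """Open a slide with OpenSlide and report label/macro images still listed.

    Requires openslide-python; imported lazily so the helper above
    can be used without it.
    """
    import openslide

    slide = openslide.OpenSlide(path)
    try:
        return remaining_sensitive_images(slide.associated_images)
    finally:
        slide.close()


# Stand-in mappings in place of a real slide, for illustration only.
before = {"thumbnail": object(), "label": object(), "macro": object()}
after = {"thumbnail": object()}
still_listed_before = remaining_sensitive_images(before)
still_listed_after = remaining_sensitive_images(after)
```

An empty result only shows that the images are no longer listed. It does not show that the underlying pixel data was overwritten, so the concern about recoverability would need a separate look at the file contents.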

AndrewNorgan commented 5 years ago

@bgilbert Removing the Macro (in addition to the Label) would be very helpful for anonymizing. Is this the right approach?