SMI / dicompixelanon

DICOM Pixel Anonymisation

Detect scanned forms #28

Open howff opened 1 year ago

howff commented 1 year ago

We need a way to detect scanned forms, which are not clinical images. They typically contain a lot of text. Although OCR detects and redacts printed text, there is a risk that some of the text is handwritten (including signatures) and is therefore not redacted.
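As a rough illustration of the "lot of text" observation, here is a minimal sketch of a heuristic that flags an image as a possible form when OCR boxes are numerous and cover a large share of the frame. It is not the rule set from the linked issue or the trained model; `TextBox`, `text_coverage`, `looks_like_form` and both thresholds are hypothetical names and values chosen for the example, and the OCR boxes are assumed to come from whatever OCR step is already in use.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class TextBox:
    """Axis-aligned OCR bounding box in pixels (hypothetical type)."""
    left: int
    top: int
    width: int
    height: int


def text_coverage(boxes: Iterable[TextBox], image_width: int, image_height: int) -> float:
    """Return the fraction of the image covered by the union of the boxes."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")

    # Clip each box to the image and drop boxes that end up empty.
    rects: List[Tuple[int, int, int, int]] = []
    for b in boxes:
        x0, y0 = max(0, b.left), max(0, b.top)
        x1 = min(image_width, b.left + b.width)
        y1 = min(image_height, b.top + b.height)
        if x1 > x0 and y1 > y0:
            rects.append((x0, y0, x1, y1))

    # Union area: cut the image into vertical strips at every box edge,
    # then merge the y-intervals of the boxes spanning each strip.
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    area = 0
    for xa, xb in zip(xs, xs[1:]):
        spans = sorted((r[1], r[3]) for r in rects if r[0] <= xa and r[2] >= xb)
        covered = 0
        current_end = None
        for y0, y1 in spans:
            if current_end is None or y0 > current_end:
                covered += y1 - y0
                current_end = y1
            elif y1 > current_end:
                covered += y1 - current_end
                current_end = y1
        area += covered * (xb - xa)

    return area / (image_width * image_height)


def looks_like_form(
    boxes: List[TextBox],
    image_width: int,
    image_height: int,
    min_coverage: float = 0.2,  # assumed threshold, would need tuning
    min_boxes: int = 20,        # assumed threshold, would need tuning
) -> bool:
    """Flag an image as a possible scanned form based on OCR text density."""
    if len(boxes) < min_boxes:
        return False
    return text_coverage(boxes, image_width, image_height) >= min_coverage
```

A density check like this only sees text that OCR actually found, so it would not catch a page that is mostly handwriting. It could at most serve as a cheap pre-filter for images to review.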

During analysis of CR and DX images, some rules were found: https://git.ecdf.ed.ac.uk/SMI/service/-/issues/188

During MG analysis we discovered that forms are not so easy to distinguish, because of a lack of consistency. More investigation is required.

An ML model has been trained to detect scanned forms, but it will need to be retrained or fine-tuned on the new MG data.

https://github.com/SMI/dicompixelanon/blob/main/src/testing/learn_forms.md