pvcy / presidio

MIT License

EntitySource Title recognizer class and implementations #10

Closed willsthompson closed 3 years ago

willsthompson commented 4 years ago

Some PII is not easily recognized by value; e.g., a person's name would need NLP or a very long list, and either approach would probably still return false negatives. For PII whose values are too burdensome to identify, we should still inspect metadata such as the column (or text field) title.

~Create a new TitleRecognizer abstract class that has the equivalent functionality of privacy-api ColumnNameScanner patterns that don't have a corresponding ValueScanner pattern.~

Pvcy/presidio now supports title-only recognizers by explicitly tagging title matches in the recognizer implementation and checking for those tags when results are created.
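
For anyone reading this later, here is a rough sketch of the tag-and-check idea in plain Python. It is not the actual pvcy/presidio code: `TITLE_MATCH_TAG`, `Match`, `Result`, `TitleOnlyNameRecognizer`, and `create_results` are all hypothetical names, and treating a title match as covering the whole value is my assumption about what the check at result creation might do.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical tag name; the real implementation may use something else.
TITLE_MATCH_TAG = "title_match"


@dataclass
class Match:
    """A raw recognizer hit, before results are created."""
    entity_type: str
    start: int
    end: int
    score: float
    tags: Set[str] = field(default_factory=set)


@dataclass
class Result:
    """A final result; `source` records whether the title or the value matched."""
    entity_type: str
    start: int
    end: int
    score: float
    source: str


class TitleOnlyNameRecognizer:
    """Hypothetical recognizer that flags PERSON from the column title alone."""

    def analyze_title(self, title: str) -> List[Match]:
        # Split on non-alphanumerics so "first_name" and "Customer Name"
        # both yield a "name" token, while "filename" does not.
        tokens = [t for t in re.split(r"[^a-z0-9]+", title.lower()) if t]
        if "name" not in tokens:
            return []
        # Tag the match explicitly so later stages know it came from the title.
        return [Match("PERSON", 0, len(title), 0.6, {TITLE_MATCH_TAG})]


def create_results(matches: List[Match], value: str) -> List[Result]:
    """Check the tag when building results (assumed behavior, see above)."""
    results = []
    for m in matches:
        if TITLE_MATCH_TAG in m.tags:
            # A title match says nothing about where the PII sits in the
            # value, so this sketch marks the entire value.
            results.append(Result(m.entity_type, 0, len(value), m.score, "title"))
        else:
            results.append(Result(m.entity_type, m.start, m.end, m.score, "value"))
    return results
```

Usage would look something like `create_results(TitleOnlyNameRecognizer().analyze_title("first_name"), "Ada Lovelace")`, which yields one PERSON result spanning the whole value with `source="title"`.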

Migrate: