willsthompson closed 3 years ago
I finished a very cursory pass at these tests. Most recognizers need more tests, especially expected non-matching cases and variations with titles. Multiple recognizers need improvement for more robust detection.
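A minimal sketch of what fuller coverage for one recognizer could look like, pairing expected matches with expected non-matches and title variations. The regex and the `find_titled_names` helper are hypothetical stand-ins for illustration, not one of the actual recognizers in this repo.

```python
import re

# Hypothetical stand-in recognizer: a title followed by one or two capitalized words.
TITLE_NAME = re.compile(
    r"\b(?:Dr|Mr|Mrs|Ms|Prof)\.?\s+[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?"
)


def find_titled_names(text):
    """Return (start, end) spans for every titled name found in text."""
    return [(m.start(), m.end()) for m in TITLE_NAME.finditer(text)]


# Each case is (input text, whether a match is expected).
CASES = [
    ("Dr. Jane Smith", True),      # title with period
    ("Mr Smith", True),            # title without period
    ("Prof. Ada Lovelace", True),  # another title variation
    ("dr. jane smith", False),     # lowercase: expected non-match
    ("Drive carefully", False),    # word that merely starts with a title
    ("Jane Smith", False),         # no title at all
]


def run_cases(cases=CASES):
    """Return the inputs whose match result differs from the expectation."""
    failures = []
    for text, should_match in cases:
        matched = bool(find_titled_names(text))
        if matched != should_match:
            failures.append(text)
    return failures
```

Keeping the cases in a table makes it cheap to add a non-matching case whenever a false positive turns up.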
In the next phase of testing, these should probably be pushed down a level and tested at the output of engine.analyze(), instead of pii_report. The pii_report testing should be separated/isolated to define its independent behavior, which may not be much, assuming there are tests covering filter_intersecting_results, is_categorical, and ordered type detection/handling. The pii_report tests may be better suited as a schema/packaging validation.
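As a rough sketch of testing one level down, the helper below compares analyzer results directly instead of going through pii_report. `FakeResult` and `FakeEngine` are hypothetical stand-ins so the example runs on its own; they assume results expose `entity_type`, `start`, `end`, and `score`, and that `analyze()` takes `text` and `language`. In a real test they would be replaced by the project's engine.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for one analyzer result."""
    entity_type: str
    start: int
    end: int
    score: float


class FakeEngine:
    """Hypothetical stand-in for the engine; it only finds one fixed number."""

    def analyze(self, text, language="en"):
        idx = text.find("555-0100")
        if idx < 0:
            return []
        return [FakeResult("PHONE_NUMBER", idx, idx + len("555-0100"), 0.85)]


def spans(results, min_score=0.0):
    """Reduce results to sorted (entity_type, start, end) tuples for comparison."""
    return sorted(
        (r.entity_type, r.start, r.end) for r in results if r.score >= min_score
    )


engine = FakeEngine()
found = spans(engine.analyze(text="Call 555-0100 now", language="en"))
missed = spans(engine.analyze(text="Nothing to see here", language="en"))
```

Comparing plain tuples keeps these tests independent of how pii_report later filters or packages the results.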
Write tests for the new recognizers created in issue #6
Open question: Should the new recognizers live in the presidio repo or in the privacy-api repo? I'm leaning toward privacy-api to prevent the presidio repo from unnecessarily diverging further from the public master. It would also be easier to contribute our updates back to the public repo without including every (or any) custom recognizer. If recognizers are moved, also clean up presidio's history.