philterd / phileas

The open source PII and PHI redaction engine
https://www.philterd.ai
Apache License 2.0

Adjust span start based on previous replacements #145

Closed jzonthemtn closed 1 month ago

jzonthemtn commented 1 month ago

The fix for #126 introduced a regression: after a replacement is applied, the start and end offsets of the spans that still need to be replaced are not updated to account for the change in text length.

The cause is that the code updates identifiedSpans rather than appliedSpans.

Example - the following text:

George Washington was president and his ssn was {{{REDACTED-ssn}}} and he lived at {{{REDACTED-zip-code}}}.

would be incorrectly redacted as:

George Washington was president and his ssn was {{{REDACTED-ssn}}} and he li{{{REDACTED-zip-code}}}t 90210.
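The misplaced zip code token in the output above is what you get when a later span is applied at its original offsets after an earlier replacement has changed the length of the text. The following is a minimal sketch of that effect and of one way to correct it, not the Phileas implementation: the `Span` class, the method names, and the running-shift approach are all hypothetical, and `123-45-6789` is a placeholder SSN chosen for illustration, since the issue does not show the original value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SpanShiftSketch {

    // Hypothetical stand-in for a span: offsets into the ORIGINAL text
    // (end is exclusive) plus the token that replaces that range.
    static final class Span {
        final int start;
        final int end;
        final String replacement;

        Span(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    // Returns a copy of the spans ordered by start offset.
    static List<Span> sortedByStart(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(Comparator.comparingInt(s -> s.start));
        return copy;
    }

    // Buggy variant: every span is applied at its original offsets,
    // ignoring how earlier replacements changed the text length.
    static String redactWithoutShift(String text, List<Span> spans) {
        StringBuilder sb = new StringBuilder(text);
        for (Span s : sortedByStart(spans)) {
            sb.replace(s.start, s.end, s.replacement);
        }
        return sb.toString();
    }

    // Corrected variant: a running shift records the net length change
    // so far and is added to each later span's offsets.
    static String redactWithShift(String text, List<Span> spans) {
        StringBuilder sb = new StringBuilder(text);
        int shift = 0;
        for (Span s : sortedByStart(spans)) {
            sb.replace(s.start + shift, s.end + shift, s.replacement);
            shift += s.replacement.length() - (s.end - s.start);
        }
        return sb.toString();
    }

    // Builds a span covering the first occurrence of value in text.
    static Span spanFor(String text, String value, String replacement) {
        int start = text.indexOf(value);
        return new Span(start, start + value.length(), replacement);
    }

    public static void main(String[] args) {
        // Placeholder input; the SSN value is an assumption.
        String input = "George Washington was president and his ssn was "
                + "123-45-6789 and he lived at 90210.";
        List<Span> spans = new ArrayList<>();
        spans.add(spanFor(input, "123-45-6789", "{{{REDACTED-ssn}}}"));
        spans.add(spanFor(input, "90210", "{{{REDACTED-zip-code}}}"));

        System.out.println(redactWithoutShift(input, spans));
        System.out.println(redactWithShift(input, spans));
    }
}
```

With this placeholder input, the SSN token is 7 characters longer than the value it replaces, so the unshifted variant overwrites "ved a" instead of "90210" and reproduces the broken output shown above. The shifted variant redacts both values in place.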

This needs a unit test.

jzonthemtn commented 1 month ago

Backported to 2.7.1.