Open troy256 opened 5 months ago
:: is a valid IPv6 address. The solution might be to split the regex into two and drop the score for ::. Anyone up for fixing it?
Even though :: is a valid IPv6 address, it's not personally identifiable and is effectively anonymous. So maybe skip over it?
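To make the "skip over it" idea concrete, here is a minimal standalone sketch using only the Python standard library. It is not Presidio's recognizer or its regex: the CANDIDATE pattern and the find_ipv6 helper are hypothetical names made up for illustration. The assumption is that a candidate must not be glued to identifier characters (which rules out Foo::Bar and std::vector), must parse as an IPv6 address, and must not be the unspecified address ::.

```python
import ipaddress
import re

# Hypothetical candidate pattern (not Presidio's): a run of hex digits and
# colons that is not directly attached to identifier characters, dots, or
# further colons on either side.
CANDIDATE = re.compile(r"(?<![\w:.])[0-9A-Fa-f:]{2,}(?![\w:.])")


def find_ipv6(text: str) -> list[str]:
    """Return IPv6-looking tokens in text, skipping the unspecified address."""
    found = []
    for match in CANDIDATE.finditer(text):
        token = match.group(0)
        try:
            address = ipaddress.IPv6Address(token)
        except ValueError:
            # Plain hex-looking words such as "be" or "add" end up here.
            continue
        if address.is_unspecified:
            # "::" is valid IPv6 but identifies nobody, so it is not reported.
            continue
        found.append(token)
    return found


if __name__ == "__main__":
    print(find_ipv6("use Foo::Bar; listen on :: or 2001:db8::1"))
```

This is only a heuristic. A token such as ::abc standing on its own still parses as an address, so a real fix would probably also need context or a lower score, as suggested above.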
A simple workaround would be to add :: as an allow_list term.
Can that be done with configuration, or is that a code change?
Configuration: https://microsoft.github.io/presidio/tutorial/13_allow_list/
Describe the bug
We are indirectly using this library as part of PII detection for text coming from a GenAI-based coding assistant. However, it detects every instance of "::" as PII, because IPv6 addresses can contain this sequence. This string is regularly used in Perl, as well as in C++ and PHP. E.g. -

Expected behavior
Ideally, the IPv6 detection would be smart enough to tell programming-language use of "::" apart from an actual IPv6 address.

Additional context
Very similar to issue #907.