Open didmar opened 1 year ago
Hi, yes this is indeed something that could be improved and a PR would be great. I believe it is a duplicate of #739
Hi @didmar have you started working on this? If not, I'd like to give it a go.
Hi @marcjulianschwarz, I ended up creating my own PatternRecognizer class to address this and make other tweaks for my use case, so I don't think it would be interesting as a PR.
For reference, I simply changed one line in the __analyze_patterns method, like so:
...
for match in matches:
    # Modified here to use the captured group
    start, end = match.span(1)
...
Also check out the duplicate #739, which suggests a more general way to handle this.
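
To make the effect of that one-line change concrete, here is a minimal sketch using only Python's standard re module (no PatternRecognizer involved). The sample text and regex are the ones from the issue description; the variable names are purely illustrative.

```python
import re

text = "Password: 1234"
match = re.search(r"Password: (\w+)", text)

# span() with no argument covers the entire match.
full_span = match.span()
# span(1) covers only the first captured group.
group_span = match.span(1)

print(full_span, repr(text[full_span[0]:full_span[1]]))
print(group_span, repr(text[group_span[0]:group_span[1]]))
```

Note that hard-coding group 1 assumes every pattern defines at least one capture group; match.span(1) raises IndexError for a regex without one.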
Is your feature request related to a problem? Please describe.
In PatternRecognizer, it is not possible to use the span of a specific capture group in a pattern instead of the span of the entire match. For example, with the regex
"Password: (\w+)"
PatternRecognizer will anonymize all of "Password: 1234", but I would like it to anonymize only "1234".
Describe the solution you'd like
A way to specify which group to use, e.g.:
start, end = match.span(pattern.use_group) if pattern.use_group else match.span()
I can do a PR if needed.
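
The proposal could look roughly like the following sketch. Everything here is an assumption made for illustration: GroupPattern, find_spans, and mask are hypothetical names, not part of the library's API, and the code relies only on the standard re module. The span selection line is the one suggested in the request.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GroupPattern:
    """Hypothetical stand-in for a pattern with an optional use_group field."""
    name: str
    regex: str
    use_group: Optional[int] = None  # None means "use the whole match"


def find_spans(pattern: GroupPattern, text: str) -> List[Tuple[int, int]]:
    """Return (start, end) spans, honoring use_group when it is set."""
    spans = []
    for match in re.finditer(pattern.regex, text):
        # Same expression as in the feature request.
        start, end = match.span(pattern.use_group) if pattern.use_group else match.span()
        if start == -1:
            # The chosen group exists but did not take part in this match.
            continue
        spans.append((start, end))
    return spans


def mask(text: str, spans: List[Tuple[int, int]], placeholder: str = "<PASSWORD>") -> str:
    """Replace each span with a placeholder, right to left so offsets stay valid."""
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + placeholder + text[end:]
    return text


text = "Password: 1234"
grouped = GroupPattern("password", r"Password: (\w+)", use_group=1)
whole = GroupPattern("password", r"Password: (\w+)")

print(mask(text, find_spans(grouped, text)))
print(mask(text, find_spans(whole, text)))
```

Leaving use_group unset keeps the current whole-match behavior, so existing patterns would not need to change. A use_group of 0 is falsy and falls through to match.span(), which is equivalent to match.span(0).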