Closed Siddhartha90 closed 9 months ago
Hi @Siddhartha90, I have verified your code with the mentioned versions, and it works as expected: it successfully redacts the phone number. Please check your spaCy model. Has it been modified or retrained? You can also verify this on the demo site: https://huggingface.co/spaces/presidio/presidio_demo
@Siddhartha90 can you please provide a reproducible example? Both samples seem identical.
Closing for now, please re-open if needed.
@VMD7 @omri374 Apologies, I updated the description with the correct example. I was unable to reopen the issue, so I created a new one: https://github.com/microsoft/presidio/issues/1301
Hi @Siddhartha90, thanks for your response. The problem is not with the anonymizer or analyzer engine; it lies in the model used to predict the entity. You can try somewhat larger models such as "stanford-deidentifier-base" or "deid_roberta_i2b2", experiment with different models, and pick the one that fits your needs. If you want to use only a spaCy model, you can train or fine-tune it on examples like these, and you should then get the expected results.
Describe the bug
test data 630-596-1111
redacts correctly, giving back
test data
but
test data630-596-1111
does not redact the phone number.
To Reproduce
Expected behavior
Phone number gets scrubbed.
I'm on
presidio-analyzer==2.2.351
presidio-anonymizer==2.2.351
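
For the glued input in the description (`test data630-596-1111`), one possible stopgap that does not depend on which model is loaded is to pre-process the text so the number is separated from the preceding word before it is analyzed. The sketch below is only an illustration under assumptions: it uses the Python standard library, it is not part of Presidio, the helper name `separate_glued_phone_numbers` is hypothetical, and it only handles the `NNN-NNN-NNNN` shape shown in this issue.

```python
import re

# Zero-width position that sits between a Latin letter and a phone-shaped
# group (NNN-NNN-NNNN). The pattern covers only the format used in this
# issue; it is an assumption, not a general phone number grammar.
_GLUED_PHONE = re.compile(r"(?<=[A-Za-z])(?=\d{3}-\d{3}-\d{4}\b)")


def separate_glued_phone_numbers(text: str) -> str:
    """Insert a space between a letter and a directly following phone-shaped number.

    Hypothetical helper: text that already has a separator is returned unchanged.
    """
    return _GLUED_PHONE.sub(" ", text)


if __name__ == "__main__":
    for sample in ("test data 630-596-1111", "test data630-596-1111"):
        # Both samples end up in the form that the issue reports as redacting correctly.
        print(repr(sample), "->", repr(separate_glued_phone_numbers(sample)))
```

The pre-processed string could then be passed to the analyzer and anonymizer in place of the raw text. Note that this changes the text itself (an extra space is added), so character offsets in any results refer to the modified string, not the original.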