microsoft / presidio

Context aware, pluggable and customizable data protection and de-identification SDK for text and images
https://microsoft.github.io/presidio
MIT License

Using protocol buffers to add new parameters #255

Closed LaraSchvartzman closed 4 years ago

LaraSchvartzman commented 4 years ago

Hi! I'm using and editing the Presidio analyzer as a Python module to fit my requirements. One thing I'd like to do is add a parameter to the process, for example passing the country to the AnalyzerEngine().analyze method (in addition to the existing parameters such as text, language, entities, etc.). I changed the code for this purpose, carefully adding a new parameter to the analyze method of entity_recognizer, local_recognizer, pattern_recognizer and a few other scripts. It still doesn't work, because some of the automatically generated _pb2_grpc scripts (produced with Google's protocol buffers) control which variables are supported. Is there a way to regenerate these scripts from the existing (edited) code so that the new parameters are supported? Thank you so much for the attention!

omri374 commented 4 years ago

Hi @LaraSchvartzman, the presidio-genproto repo has some documentation on how to modify the proto files and generate new ones: https://github.com/microsoft/presidio-genproto#changing-presidios-api

Once you have generated the new pb2 files, you would have to copy them into the analyzer folder.
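For illustration only, the regenerate-and-copy step could look roughly like the sketch below. It assumes the `grpcio-tools` package is installed and calls its `grpc_tools.protoc` module directly; the presidio-genproto repo may use its own build commands, so its documentation takes precedence. The directory names and the helper functions (`build_protoc_args`, `regenerate_stubs`, `copy_stubs`) are hypothetical and not part of Presidio.

```python
import shutil
import subprocess
import sys
from pathlib import Path
from typing import List


def build_protoc_args(proto_dir: Path, out_dir: Path) -> List[str]:
    """Build a grpc_tools.protoc command line covering every .proto file in proto_dir."""
    proto_files = sorted(str(p) for p in proto_dir.glob("*.proto"))
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"--proto_path={proto_dir}",
        f"--python_out={out_dir}",        # writes the *_pb2.py message modules
        f"--grpc_python_out={out_dir}",   # writes the *_pb2_grpc.py service stubs
        *proto_files,
    ]


def regenerate_stubs(proto_dir: Path, out_dir: Path) -> None:
    """Run protoc on the edited .proto files (requires grpcio-tools)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_protoc_args(proto_dir, out_dir), check=True)


def copy_stubs(generated_dir: Path, analyzer_dir: Path) -> List[str]:
    """Copy generated *_pb2*.py files into the analyzer folder, overwriting old ones."""
    copied = []
    for stub in sorted(generated_dir.glob("*_pb2*.py")):
        shutil.copy2(stub, analyzer_dir / stub.name)
        copied.append(stub.name)
    return copied


if __name__ == "__main__":
    # Hypothetical paths: adjust them to your local checkouts.
    protos = Path("presidio-genproto/src")
    generated = Path("generated")
    analyzer = Path("presidio-analyzer/analyzer")
    regenerate_stubs(protos, generated)
    print(copy_stubs(generated, analyzer))
```

The new field (for example, a country) has to be added to the relevant message in the `.proto` file before regenerating; otherwise the regenerated stubs will still not accept it.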

LaraSchvartzman commented 4 years ago

Thanks @omri374 !!! It's exactly what I needed.