How to change the default 0.85 score for `SpacyRecognizer`?

microsoft / presidio

Context aware, pluggable and customizable data protection and de-identification SDK for text and images

MIT License


from presidio_analyzer import AnalyzerEngine
from presidio_analyzer.predefined_recognizers import SpacyRecognizer

custom_recognizer = SpacyRecognizer(ner_strength=0.25)

analyzer = AnalyzerEngine()
analyzer.registry.add_recognizer(custom_recognizer)

results = analyzer.analyze(
    text="Alice and Bob",
    language="en",
    return_decision_process=True,
    score_threshold=0.1
)

print(results)
print("------")
print([res.__dict__ for res in results])

Hi, please see the following code snippet:

from presidio_analyzer import AnalyzerEngine
from presidio_analyzer.nlp_engine import SpacyNlpEngine, NerModelConfiguration

# Define which model to use
model_config = [{"lang_code": "en", "model_name": "en_core_web_lg"}]

# Set the score assigned to entities found by the NER model
ner_model_configuration = NerModelConfiguration(default_score=0.6)

# Create the NLP Engine based on this configuration
spacy_nlp_engine = SpacyNlpEngine(
    models=model_config,
    ner_model_configuration=ner_model_configuration,
)

analyzer = AnalyzerEngine(nlp_engine=spacy_nlp_engine)
analyzer.analyze(...)

With the `NerModelConfiguration` class, you can also configure which entities the model returns, how they map to Presidio's entities, and more.

https://microsoft.github.io/presidio/analyzer/nlp_engines/spacy_stanza/#how-ner-results-flow-within-presidio
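To make the idea concrete, here is a library-free sketch of how a single configured default score, an ignore list, and a label mapping might be applied to raw NER output. It is an illustration only, not Presidio's implementation: `SketchNerConfig`, `score_ner_spans`, and the field names `labels_to_ignore` and `label_to_entity` are hypothetical names chosen for this example, and the sample labels are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SketchNerConfig:
    """Hypothetical stand-in for an NER configuration (not Presidio's class)."""

    default_score: float = 0.85  # the default mentioned in the question
    labels_to_ignore: Set[str] = field(default_factory=set)
    label_to_entity: Dict[str, str] = field(default_factory=dict)


def score_ner_spans(
    spans: List[Tuple[str, str]], config: SketchNerConfig
) -> List[Tuple[str, str, float]]:
    """Drop ignored labels, map the rest, and attach the configured score."""
    results = []
    for text, label in spans:
        if label in config.labels_to_ignore:
            continue  # this label is filtered out entirely
        # Fall back to the raw label when no mapping is defined
        entity = config.label_to_entity.get(label, label)
        results.append((text, entity, config.default_score))
    return results


config = SketchNerConfig(
    default_score=0.6,
    labels_to_ignore={"CARDINAL"},
    label_to_entity={"PER": "PERSON", "GPE": "LOCATION"},
)

# Assumed raw model output as (text, label) pairs
spans = [("Alice", "PER"), ("Bob", "PER"), ("two", "CARDINAL")]
scored = score_ner_spans(spans, config)
print(scored)
```

In this sketch every surviving entity gets the same score, which is why changing one configuration value is enough to move all NER-based results from 0.85 to 0.6.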


How to change the default 0.85 score for `SpacyRecognizer`? #1372