Hi,
I compared the performance of calling Presidio directly against using the Detect PII validator through Guardrails. In most cases, Presidio called directly was about a tenth of a second faster than Detect PII. Both runs used the default model (en_core_web_lg) and the same dataset. Is this difference due to the additional Guardrails wrappers, or am I missing something?
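For context, here is a minimal sketch of how a comparison like this can be timed. It assumes each path is wrapped in a function that takes a single string. The names `time_calls`, `direct`, and `wrapped` are hypothetical, and the two stand-in functions are placeholders, not the real Presidio or Guardrails calls. Warm-up calls are excluded so that one-time costs such as model loading are not counted in the difference.

```python
import statistics
import time
from typing import Callable, Iterable, List


def time_calls(fn: Callable[[str], object], texts: Iterable[str], warmup: int = 1) -> List[float]:
    """Return per-call wall-clock durations in seconds for fn over texts."""
    texts = list(texts)
    # Warm-up calls absorb one-time costs (model load, caches) and are not recorded.
    for text in texts[:warmup]:
        fn(text)
    durations = []
    for text in texts:
        start = time.perf_counter()
        fn(text)
        durations.append(time.perf_counter() - start)
    return durations


# Placeholders (hypothetical): swap in the direct Presidio call and the
# Guardrails Detect PII call, each wrapped as a function of one string.
def direct(text: str) -> int:
    return len(text.split())


def wrapped(text: str) -> int:
    return direct(text)


samples = ["My name is Jane Doe", "Call me at 555-0100"]
direct_times = time_calls(direct, samples)
wrapped_times = time_calls(wrapped, samples)

# Compare medians so a single slow call does not dominate the result.
overhead = statistics.median(wrapped_times) - statistics.median(direct_times)
print(f"median overhead per call: {overhead:.6f} s")
```

Comparing medians over the whole dataset, rather than a single run, should make it easier to tell a constant wrapper overhead from noise.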
Example dataset: https://github.com/microsoft/presidio-research/blob/master/data/synth_dataset_v2.json
PII entities: