Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.
MIT License

[QUERY] Canadian Personal Health Numbers not being detected by Text Analytics RecognizePiiEntitiesAsync method #42254

Open YEGCSharpDev opened 6 months ago

YEGCSharpDev commented 6 months ago

Library name and version

Azure.AI.TextAnalytics

Query/Question

When we send Canadian Personal Health Numbers (PHNs), both in province-specific formats and without a defined format, the RecognizePiiEntitiesAsync method does not detect them.

A Social Insurance Number (SIN) is detected with a 96% confidence score. A PHN is sometimes picked up as a SIN, but in no case is a PHN detected as a PHN.

PHN formats from all provinces were sent through the client, and none of them were detected.

The following categories were added to CategoriesFilter when calling the RecognizePiiEntitiesAsync method.

We are wondering whether we are using the service correctly. Here is the code snippet that calls the service and prints the results.

RecognizePiiEntitiesOptions options = new()
{
    CategoriesFilter =
    {
        PiiEntityCategory.CAHealthServiceNumber,
        PiiEntityCategory.CASocialInsuranceNumber,
        PiiEntityCategory.CAPersonalHealthIdentification
    }
};
PiiEntityCollection entities = await client.RecognizePiiEntitiesAsync(document: documentData, options: options);
if (entities.Count > 0)
{
    foreach (PiiEntity entity in entities)
    {
        textsToRedact.Add(entity.Text);
        Console.WriteLine($"Text: {entity.Text}, Category: {entity.Category}, SubCategory: {entity.SubCategory}, Confidence score: {entity.ConfidenceScore}");
    }
}
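
One way to check whether the category filter is hiding results would be to run the same call without CategoriesFilter and print every category the service returns for the same text. The following is a minimal sketch for diagnosis only, not a confirmed fix. The environment variable names LANGUAGE_ENDPOINT and LANGUAGE_KEY, the placeholder document text, and the "en" language hint are assumptions and are not part of the original snippet.

```csharp
using System;
using Azure;
using Azure.AI.TextAnalytics;

// Assumption: the endpoint and key are stored in these hypothetical environment variables.
string endpoint = Environment.GetEnvironmentVariable("LANGUAGE_ENDPOINT")
    ?? throw new InvalidOperationException("LANGUAGE_ENDPOINT is not set.");
string apiKey = Environment.GetEnvironmentVariable("LANGUAGE_KEY")
    ?? throw new InvalidOperationException("LANGUAGE_KEY is not set.");

var client = new TextAnalyticsClient(new Uri(endpoint), new AzureKeyCredential(apiKey));

// Placeholder: replace with the text that contains a sample (non-real) PHN.
string documentData = "<text containing a sample PHN>";

// No CategoriesFilter here, so the service reports every PII category it recognizes.
// The "en" language hint is an assumption; adjust it to match the document.
PiiEntityCollection entities = await client.RecognizePiiEntitiesAsync(documentData, "en");

foreach (PiiEntity entity in entities)
{
    Console.WriteLine($"Text: {entity.Text}, Category: {entity.Category}, SubCategory: {entity.SubCategory}, Confidence score: {entity.ConfidenceScore}");
}

// RedactedText shows what the service itself would mask in the input.
Console.WriteLine($"Redacted text: {entities.RedactedText}");
```

If the PHN appears under a different category in the unfiltered output, that would indicate how the service classifies it. If it does not appear at all, the filter is not the cause.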

Sample PHNs sent to the service (none of these values are actual values)

Environment

OS and .NET runtime version

OS :

.NET SDK:
Version: 8.0.101
Commit: 6eceda187b
Workload version: 8.0.100-manifests.30fce108

Runtime Environment:
OS Name: Windows
OS Version: 10.0.19045
OS Platform: Windows
RID: win-x64
Base Path: C:\Program Files\dotnet\sdk\8.0.101\

Host:
Version: 8.0.1
Architecture: x64
Commit: bf5e279d92

.NET SDKs installed:
8.0.101 [C:\Program Files\dotnet\sdk]

IDE

Visual Studio V17.8.6

github-actions[bot] commented 6 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @assafi @quentinRobinson @wangyuantao.

YEGCSharpDev commented 2 months ago

It's been a few months since this issue was opened. Following up.