MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Cannot find examples for prebuilt entities via API #118591

Closed AdamMiltonBarker closed 9 months ago

AdamMiltonBarker commented 9 months ago

Please can you provide the link to examples of prebuilt entity components in JSON?



AdamMiltonBarker commented 9 months ago

Worked it out.

AjayBathini-MSFT commented 9 months ago

Closed by Customer

AdamMiltonBarker commented 9 months ago

Are these matches for Quantity.Number?

0: {category: 'Product', text: 'white trainers', offset: 17, length: 14, confidenceScore: 1}
1: {category: 'PriceRangeEntity', text: 'between', offset: 41, length: 7, confidenceScore: 1}
2: {category: 'PriceRangeEntity', text: '10', offset: 49, length: 2, confidenceScore: 1, …}
3: {category: 'PriceRangeEntity', text: '50', offset: 63, length: 2, confidenceScore: 1, …}

Is there a way to simply get the numbers? This is crazy: it is classifying full strings like `123 and 345 pounds?` as numbers. All of our dataset is correctly labelled.


      {
        "category": "PriceRangeEntity",
        "compositionSetting": "combineComponents",
        "prebuilts": [
          {
            "category": "Quantity.Number"
          }
        ]
      },

      {
        "text": "Do you have any white trainers between 845 and 3783 pounds?",
        "language": "en",
        "intent": "SearchProductPriceRange",
        "entities": [
          {
            "category": "Product",
            "offset": 16,
            "length": 14
          },
          {
            "category": "PriceRangeEntity",
            "offset": 39,
            "length": 2
          },
          {
            "category": "PriceRangeEntity",
            "offset": 47,
            "length": 3
          }
        ]
      }


It was working fine, then all of a sudden it started classifying whole sentences as numbers.
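A quick way to check labels like the ones in the training example above is to slice the utterance with each offset and length and look at what comes back. This is only a minimal sketch: `labelled_text` is a hypothetical helper, and it assumes offsets are counted in characters the way Python indexes strings, which may not be how the service counts them.

```python
utterance = "Do you have any white trainers between 845 and 3783 pounds?"

# Labels copied from the training example above.
labels = [
    {"category": "Product", "offset": 16, "length": 14},
    {"category": "PriceRangeEntity", "offset": 39, "length": 2},
    {"category": "PriceRangeEntity", "offset": 47, "length": 3},
]


def labelled_text(text, label):
    """Return the substring a label selects, or raise if it runs past the text."""
    start = label["offset"]
    end = start + label["length"]
    if start < 0 or end > len(text):
        raise ValueError(
            f"{label['category']} at {start}..{end} is outside the utterance"
        )
    return text[start:end]


spans = [labelled_text(utterance, label) for label in labels]
for label, span in zip(labels, spans):
    print(label["category"], repr(span))
```

Under that assumption the Product label selects `white trainers`, but the two PriceRangeEntity labels select `84` and `378`, each one character short of `845` and `3783`.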

AdamMiltonBarker commented 9 months ago

I noticed in the above that the indexes are not correct.

In utterance `I am looking for black trainers that are between £956 and £2962?`, entity `PriceRangeEntity` with start at index `3` exists outside the utterance length.

Your system has a problem with the GBP currency sign (£). When there is a pound sign in the string, it throws off the indexes, which is why I was getting errors on some of the strings. Stripping out the pound sign fixes the issue completely. No more misclassifications.
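One possible explanation, which nothing in this thread confirms, is that the offsets are being counted in different units on each side. The sketch below only illustrates how the same position in the utterance gets a different number depending on whether you count code points, UTF-16 code units, or UTF-8 bytes; `offsets_in_units` is a hypothetical helper, and which unit the service actually uses is an assumption left open here.

```python
utterance = "I am looking for black trainers that are between £956 and £2962?"


def offsets_in_units(text, needle):
    """Return the start of `needle` in code points, UTF-16 units and UTF-8 bytes."""
    start = text.index(needle)  # Python str indexes count code points
    prefix = text[:start]
    return {
        "code_points": start,
        "utf16_units": len(prefix.encode("utf-16-le")) // 2,
        "utf8_bytes": len(prefix.encode("utf-8")),
    }


first = offsets_in_units(utterance, "956")
second = offsets_in_units(utterance, "2962")
print(first)
print(second)
```

Because £ takes two bytes in UTF-8, every byte offset after a pound sign is one higher than the character offset, and the gap grows with each further pound sign. For £ specifically, code points and UTF-16 units agree, so this would only matter if bytes are involved somewhere.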

This fixes things if there is no ? at the end of the inference string. If there is a ?, then it will do this:


entities: Array(2)
0: {category: 'Product', text: 'black trainers', offset: 17, length: 14, confidenceScore: 1}
1: {category: 'PriceRangeEntity', text: '10', offset: 49, length: 2, confidenceScore: 1, …}
length: 2

It will completely miss the second number every single time you use a ? at the end of a sentence. Any punctuation at all breaks everything.

AjayBathini-MSFT commented 9 months ago

@AdamMiltonBarker

Thank you for your feedback! I'd recommend working closer with our support team via an [Azure support request](https://docs.microsoft.com/en-us/azure/azure-portal/supportability/how-to-create-azure-support-request). Or you can leverage our Q&A forum by posting your issue there so our community and MVPs can further assist you in troubleshooting this issue or finding potential workarounds. [Teams Q&A forum](https://docs.microsoft.com/en-us/answers/topics/46488/office-teams-windows-itpro.html) for technical questions about the configuration and administration of Microsoft Teams on Windows. [Microsoft Teams Community forum](https://answers.microsoft.com/en-us/msteams/forum?sort=LastReplyDate&dir=Desc&tab=All&status=all&mod=&modAge=&advFil=&postedAfter=&postedBefore=&threadType=All&isFilterExpanded=false&page=1) Thank you for your time and patience throughout this issue.

AdamMiltonBarker commented 9 months ago

@AjayBathini-MSFT I was just letting you know that your platform breaks if you have any special characters in the utterances. This is the case for £ and ', and I am guessing most other special characters. I did not see anything in the docs that warns against this, so maybe they should be updated? Special characters such as £ and ' break data import if you have learned entities, and it is very annoying because they change the offset and length values, making them incorrect. You must strip them out on your end, which results in the passed offset and length values being incorrect.
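As a workaround on the client side, the special characters can be removed before import while shifting each label so it still covers the same text. This is a minimal sketch, not anything the service provides: `strip_and_remap` is a hypothetical helper, the labels below are made up for illustration, and offsets are assumed to be counted in characters as Python counts them.

```python
def strip_and_remap(text, labels, drop="£'"):
    """Remove characters in `drop` and shift labels to match the cleaned text."""
    # removed_before[i] = number of dropped characters in text[:i]
    removed_before = [0]
    for ch in text:
        removed_before.append(removed_before[-1] + (ch in drop))

    cleaned = "".join(ch for ch in text if ch not in drop)
    remapped = []
    for label in labels:
        start = label["offset"]
        end = start + label["length"]
        new_start = start - removed_before[start]
        new_end = end - removed_before[end]
        remapped.append(
            {**label, "offset": new_start, "length": new_end - new_start}
        )
    return cleaned, remapped


utterance = "I am looking for black trainers that are between £956 and £2962?"
# Hypothetical labels, counted in characters against the original utterance.
labels = [
    {"category": "Product", "offset": 17, "length": 14},
    {"category": "PriceRangeEntity", "offset": 50, "length": 3},
    {"category": "PriceRangeEntity", "offset": 59, "length": 4},
]

cleaned, remapped = strip_and_remap(utterance, labels)
selected = [cleaned[l["offset"]:l["offset"] + l["length"]] for l in remapped]
print(cleaned)
print(selected)
```

Slicing the cleaned utterance with the remapped labels should still give `black trainers`, `956` and `2962`, so the labels stay aligned after the pound signs are gone.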

AjayBathini-MSFT commented 9 months ago

@AdamMiltonBarker

I have raised a request with the content owner to update the document. He will investigate and resolve the issue as required. https://github.com/MicrosoftDocs/azure-docs/issues/119037

Thank you for your input and your contribution to the document corrections.