[Text analytics] Warning about document length inconsistent with API service documentation

justinqquall commented 1 year ago

Package Name: azure-ai-textanalytics
Package Version: 5.2.1
Operating System:
Python Version: 3.10
Describe the bug

The text analytics endpoint is documented as accepting up to 30,720 characters per document. However, when I submit a document considerably smaller than that, I still receive a warning about document length. See the traceback below, which prints the document value and the warning for the same AnalyzeHealthcareEntitiesResult object. Data limits documentation: https://learn.microsoft.com/en-us/azure/cognitive-services/language-service/concepts/data-limits#maximum-characters-per-document

To Reproduce Steps to reproduce the behavior:

Submit a document longer than 8,000 characters to the text analytics API.
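
The reproduction step can be sketched roughly as follows. This is a minimal sketch, not the exact script used for this report: `make_document`, `analyze`, `main`, the filler sentence, and the `LANGUAGE_ENDPOINT` / `LANGUAGE_KEY` environment variable names are all hypothetical and chosen for illustration. Only `TextAnalyticsClient`, `AzureKeyCredential`, and `begin_analyze_healthcare_entities` come from the SDK.

```python
import os


def make_document(n_chars, sentence="Patient was prescribed 100mg ibuprofen twice daily. "):
    """Build a test document of exactly n_chars characters by repeating a sentence."""
    repeats = n_chars // len(sentence) + 1
    return (sentence * repeats)[:n_chars]


def analyze(endpoint, key, documents, **kwargs):
    """Submit documents for healthcare entity analysis and return the per-document results."""
    # Imported lazily so make_document can be used without the SDK installed.
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics import TextAnalyticsClient

    client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    poller = client.begin_analyze_healthcare_entities(documents, **kwargs)
    return list(poller.result())


def main():
    # Assumed environment variable names; substitute your own configuration.
    endpoint = os.environ["LANGUAGE_ENDPOINT"]
    key = os.environ["LANGUAGE_KEY"]

    # Above 8,000 characters but well below the documented 30,720 limit.
    doc = make_document(9000)
    for result in analyze(endpoint, key, [doc]):
        if result.is_error:
            print("error:", result.error.code, result.error.message)
            continue
        for warning in result.warnings:
            print(f"doc {result.id} ({len(doc)} chars): {warning.code}: {warning.message}")


# Only runs against the service when an endpoint is configured.
if __name__ == "__main__" and os.environ.get("LANGUAGE_ENDPOINT"):
    main()
```

Passing `model_version="2022-08-15-preview"` through `analyze(..., model_version=...)` should exercise the preview model mentioned under "Additional context".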

Expected behavior No warning is received.

Screenshots If applicable, add screenshots to help explain your problem.

Additional context This happens both for the default text analytics model and when using model_version 2022-08-15-preview.

azure-sdk commented 1 year ago

Label prediction was below confidence level 0.6 for Model:ServiceLabels: 'Cognitive - Text Analytics:0.54613066,Docs:0.21392056,Cognitive Services:0.044528954'

kristapratico commented 1 year ago

Hey @justinqquall, my understanding is that the 30k char limit is the max you can send in a request, any more than that and the request will fail. From your screenshot, the request succeeds, but there is a warning. Warnings are usually returned to indicate that the quality of the model prediction may be affected due to some reason. @peytonfraser from the Language service team to confirm.
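
To illustrate the distinction between a failed document and a successful one that carries warnings, here is a minimal sketch of how a caller might separate the two. `triage` is a hypothetical helper, not part of the SDK. It relies only on the `is_error`, `error`, `id`, and `warnings` attributes of the per-document results. The stand-in objects and their code strings ("ExampleError", "ExampleWarning") are placeholders for illustration, not real service output.

```python
from types import SimpleNamespace


def triage(results):
    """Split per-document results into failed, warned, and clean groups."""
    failed, warned, clean = [], [], []
    for r in results:
        if r.is_error:
            # The document was rejected; no predictions are available.
            failed.append((r.id, r.error.code))
        elif r.warnings:
            # The document succeeded, but prediction quality may be affected.
            warned.append((r.id, [w.code for w in r.warnings]))
        else:
            clean.append(r.id)
    return failed, warned, clean


# Stand-in objects that mimic the shape of SDK results (placeholder codes).
fake_results = [
    SimpleNamespace(id="0", is_error=False, warnings=[]),
    SimpleNamespace(id="1", is_error=False,
                    warnings=[SimpleNamespace(code="ExampleWarning", message="placeholder")]),
    SimpleNamespace(id="2", is_error=True,
                    error=SimpleNamespace(code="ExampleError", message="placeholder")),
]

failed, warned, clean = triage(fake_results)
print("failed:", failed)
print("warned:", warned)
print("clean:", clean)
```

With real results from `begin_analyze_healthcare_entities`, the case in this issue would land in the `warned` group rather than `failed`.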

kristapratico commented 1 year ago

Adding @aurghob to confirm.

aurghob commented 1 year ago

Hi, apologies for the delayed reply. We have this task in our backlog and will prioritize it accordingly.

Azure / azure-sdk-for-python

[Text analytics] Warning about document length inconsistent with API service documentation #27407