API: Batch Speech-to-Text - Sentiment is null

BadFame commented 4 years ago

Describe the bug

Today (04/17/2020 09:33:41), we received a notification from our automated job that Sentiment is not being returned even though it is explicitly requested in the batch speech-to-text service configuration.

The Sentiment property is null when we retrieve the transcription results (SegmentResults > NBest > Sentiment) from the service. The unexpected null caused our job to return an error to the webhook, which deactivated the webhook (the deactivation itself is expected).

To Reproduce

1. This is what we have always sent to the “speech-to-text” service since day 1:

    "recordingsUrls": [recording_download_urls],
    "models": [],
    "locale": "en-US",
    "name": CALL_ID_NAME,
    "description": CALL_ID_DESCRIPTION,
    "properties": {
        "ProfanityFilterMode": "Masked",
        "PunctuationMode": "DictatedAndAutomatic",
        "AddDiarization": "True",
        "AddWordLevelTimestamps": "True",
        "AddSentiment": "True"
    }

2. The webhook calls the service once a transcription is completed, with the following payload, including the transcription results:

    "reportFileUrl": "OMITTED FOR THIS ISSUE",
    "statusMessage": "None.",
    "lastActionDateTime": "2020-04-17T11:09:02Z",
    "status": "Succeeded",
    "id": "OMITTED FOR THIS ISSUE",
    "createdDateTime": "2020-04-17T11:07:17Z",
    "locale": "en-US",
    "name": "OMITTED FOR THIS ISSUE",
    "description": "OMITTED FOR THIS ISSUE",
    "properties": {
      "ProfanityFilterMode": "Masked",
      "PunctuationMode": "DictatedAndAutomatic",
      "AddDiarization": "True",
      "AddWordLevelTimestamps": "True",
      "AddSentiment": "True",
      "CustomPronunciation": "False",
      "Duration": "00:03:24"

However, when we retrieved the transcription results, the sentiment was null.

"SegmentResults": [
    {
      "RecognitionStatus": "Success",
      "ChannelNumber": "0",
      "SpeakerId": null,
      "Offset": 560600000,
      "Duration": 3900000,
      "OffsetInSeconds": 56.06,
      "DurationInSeconds": 0.39,
      "NBest": [
        {
          "Confidence": 0.3091757,
          "Lexical": "yeah",
          "ITN": "yeah",
          "MaskedITN": "yeah",
          "Display": "Yeah.",
          "Sentiment": null,
          "Words": [
            {
              "Word": "yeah",
              "Offset": 561100000,
              "Duration": 3200000,
              "OffsetInSeconds": 56.11,
              "DurationInSeconds": 0.32,
              "Confidence": 0.781588
            }
          ]
        }
      ]
    },… More items omitted for simplicity.
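A job that consumes these results can avoid failing outright on a null Sentiment by reading the field defensively. The following is a minimal sketch, not the service's own client code: it assumes the `SegmentResults` array has already been parsed from JSON into Python lists and dicts, and the helper names `iter_sentiments` and `count_missing_sentiment` are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple

Segment = Dict[str, Any]


def iter_sentiments(
    segment_results: List[Segment],
) -> Iterator[Tuple[str, Optional[Dict[str, float]]]]:
    """Yield (Display text, Sentiment or None) for every NBest entry."""
    for segment in segment_results:
        # "NBest" may be missing or null, so fall back to an empty list.
        for candidate in segment.get("NBest") or []:
            yield candidate.get("Display", ""), candidate.get("Sentiment")


def count_missing_sentiment(segment_results: List[Segment]) -> Tuple[int, int]:
    """Return (entries with a null Sentiment, total NBest entries)."""
    pairs = list(iter_sentiments(segment_results))
    missing = sum(1 for _, sentiment in pairs if sentiment is None)
    return missing, len(pairs)


# Trimmed copy of the segment shown above (hypothetical variable name).
sample = [
    {
        "RecognitionStatus": "Success",
        "NBest": [{"Display": "Yeah.", "Sentiment": None}],
    }
]

missing, total = count_missing_sentiment(sample)
print(f"{missing} of {total} NBest entries have no sentiment")
```

With a count like this, the job could log a warning and acknowledge the webhook call instead of returning an error, although whether that fits a given pipeline is a design decision.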

Expected behavior

Sentiment should be provided as part of the batch speech-to-text transcription results when it is requested, according to the following doc: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription#sentiment-analysis

Version of the Cognitive Services Speech SDK

Unfortunately, we are using version 2.1 because there is no documentation for a newer version. In fact, we were using the version shown in the document above until we submitted a GitHub issue and were told that we needed to use 2.1 instead of the documented 2.0...

Platform, Operating System, and Programming Language

Using an Azure Function to call the batch speech-to-text service.

Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

ID: 73fb7e76-fb80-c420-2efe-e3ca3a27b749
Version Independent ID: 07a42557-deb3-e4b3-3f14-f26a27be49c2
Content: What is batch transcription - Speech service - Azure Cognitive Services
Content Source: articles/cognitive-services/Speech-Service/batch-transcription.md
Service: cognitive-services
Sub-service: speech-service
GitHub Login: @wolfma61
Microsoft Alias: wolfma

GiftA-MSFT commented 4 years ago

@BadFame we will review your feedback and get back to you shortly. Thanks.

wolfma61 commented 4 years ago

@BadFame - which region are you using?

BadFame commented 4 years ago

I have a little over 300 transcriptions, and all of them are returning null. I am using eastus2.

"status": "Succeeded",
    "id": "be679a95-f4d4-4037-895d-47ded1ebd2f7",
    "createdDateTime": "2020-04-17T11:29:47Z",

"properties": {
    "ProfanityFilterMode": "Masked",
    "PunctuationMode": "DictatedAndAutomatic",
    "AddDiarization": "True",
    "AddWordLevelTimestamps": "True",
    "AddSentiment": "True",
    "CustomPronunciation": "False",
    "Duration": "00:00:26"
  }

"NBest": [
            {
              "Confidence": 0.638227,
              "Lexical": "i hate the computer at this time he tasted worse",
              "ITN": "i hate the computer at this time he tasted worse",
              "MaskedITN": "i hate the computer at this time he tasted worse",
              "Display": "I hate the computer at this time, he tasted worse.",
              "Sentiment": null,
              "Words": [
                {
                  "Word": "i",
                  "Offset": 46900000,
                  "Duration": 1400000,
                  "OffsetInSeconds": 4.69,
                  "DurationInSeconds": 0.14,
                  "Confidence": 0.884177
                },
                {
                  "Word": "hate",
                  "Offset": 48300000,
                  "Duration": 5400000,
                  "OffsetInSeconds": 4.83,
                  "DurationInSeconds": 0.54,
                  "Confidence": 0.946418
                }

BadFame commented 4 years ago

How strange... I submitted a new transcription request today:

"status": "Succeeded",
  "id": "28951f69-c867-4128-a496-33c5c08bc077",
  "createdDateTime": "2020-04-20T18:56:44Z"

And I got sentiment back:

"NBest": [
            {
              "Confidence": 0.8827704,
              "Lexical": "and uh",
              "ITN": "and uh",
              "MaskedITN": "and uh",
              "Display": "And, uh",
              "Sentiment": {
                "Negative": 0.316903,
                "Neutral": 0.683078,
                "Positive": 0.0
              },
              "Words": [
                {
                  "Word": "and",
                  "Offset": 594500000,
                  "Duration": 4200000,
                  "OffsetInSeconds": 59.45,
                  "DurationInSeconds": 0.42,
                  "Confidence": 0.980738
                },
                {
                  "Word": "uh",
                  "Offset": 598700000,
                  "Duration": 2800000,
                  "OffsetInSeconds": 59.87,
                  "DurationInSeconds": 0.28,
                  "Confidence": 0.949764
                }
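When the Sentiment object is present, it carries three scores (Negative, Neutral, Positive), as in the result above. As a rough illustration of consuming it, the sketch below picks the highest-scoring label and tolerates a null value; `dominant_sentiment` is a hypothetical helper, not part of any SDK.

```python
from typing import Dict, Optional


def dominant_sentiment(sentiment: Optional[Dict[str, float]]) -> Optional[str]:
    """Return the label with the highest score, or None if sentiment is absent."""
    if not sentiment:
        return None
    return max(sentiment, key=sentiment.get)


# Scores copied from the NBest entry above.
scores = {"Negative": 0.316903, "Neutral": 0.683078, "Positive": 0.0}
print(dominant_sentiment(scores))  # Neutral
print(dominant_sentiment(None))    # None
```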

Does this mean it was probably a fluke? Do I have to re-request the 300 transcriptions from that day to obtain their sentiment?

This is the first time something like this has happened in 5 months of running a production product against the batch speech-to-text API.

Unless I hear otherwise, I will re-enable the webhook and my automated product to continue transcribing. We will continue to monitor the behavior of the API and will report any anomaly.
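Whether a re-request is actually required is not answered in this thread. If it turns out to be, one way to narrow down the affected transcriptions is to flag those whose results contain NBest entries but no sentiment at all. The sketch below assumes the result files have already been downloaded and parsed into a mapping from transcription id to its `SegmentResults` list; the function name and the mapping are hypothetical.

```python
from typing import Any, Dict, List


def ids_without_sentiment(
    results_by_id: Dict[str, List[Dict[str, Any]]],
) -> List[str]:
    """Return ids whose results have NBest entries but every Sentiment is null."""
    flagged = []
    for transcription_id, segments in results_by_id.items():
        candidates = [c for s in segments for c in (s.get("NBest") or [])]
        # Skip transcriptions with no recognized speech at all.
        if candidates and all(c.get("Sentiment") is None for c in candidates):
            flagged.append(transcription_id)
    return flagged


# Trimmed examples using the two transcription ids quoted in this thread.
results_by_id = {
    "be679a95-f4d4-4037-895d-47ded1ebd2f7": [
        {"NBest": [{"Display": "I hate the computer at this time, he tasted worse.",
                    "Sentiment": None}]}
    ],
    "28951f69-c867-4128-a496-33c5c08bc077": [
        {"NBest": [{"Display": "And, uh",
                    "Sentiment": {"Negative": 0.316903,
                                  "Neutral": 0.683078,
                                  "Positive": 0.0}}]}
    ],
}

print(ids_without_sentiment(results_by_id))
```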

All the best

MicrosoftDocs / azure-docs

API: Batch Speech-to-Text - Sentiment is null #52689
