Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Sentiment is null #590

Closed BadFame closed 4 years ago

BadFame commented 4 years ago

Describe the bug

Today (04/17/2020 09:33:41), our automated job notified us that Sentiment is not being returned, even though it is explicitly requested in the batch speech-to-text service configuration.

The Sentiment property is null in the transcription results retrieved from the service (SegmentResults > NBest > Sentiment). This sudden failure caused our job to return an error to the webhook, which in turn deactivated the webhook, as expected.
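One way to keep a consumer like this from erroring when Sentiment comes back null is to treat the field as optional. The following is only a minimal sketch, assuming each NBest entry has already been parsed into a Python dict; the helper name `read_sentiment` is hypothetical and is not part of the Speech SDK or of our actual job.

```python
from typing import Optional

# Score labels as they appear in the Sentiment objects quoted in this issue.
SENTIMENT_LABELS = ("Negative", "Neutral", "Positive")


def read_sentiment(nbest_entry: dict) -> Optional[dict]:
    """Return the sentiment scores of one NBest entry, or None if absent.

    Hypothetical helper: it never raises on a null or missing Sentiment,
    so the caller can decide how to handle the gap.
    """
    sentiment = nbest_entry.get("Sentiment")
    if not isinstance(sentiment, dict):
        # Covers both an explicit JSON null and a missing key.
        return None
    # Default each score to 0.0 so a partial object does not raise KeyError.
    return {label: float(sentiment.get(label, 0.0)) for label in SENTIMENT_LABELS}


if __name__ == "__main__":
    print(read_sentiment({"Display": "Yeah.", "Sentiment": None}))
    print(read_sentiment({"Sentiment": {"Negative": 0.0, "Neutral": 0.921228, "Positive": 0.0}}))
```

With something along these lines, a null Sentiment could be logged and skipped instead of failing the whole callback.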

To Reproduce

  1. This is what we have always sent to the “speech-to-text” service since day 1:

      "recordingsUrls": [recording_download_urls],
      "models": [],
      "locale": "en-US",
      "name": CALL_ID_NAME,
      "description": CALL_ID_DESCRIPTION,
      "properties": {
          "ProfanityFilterMode": "Masked",
          "PunctuationMode": "DictatedAndAutomatic",
          "AddDiarization": "True",
          "AddWordLevelTimestamps": "True",
          "AddSentiment": "True"
      }
  2. The webhook calls our service once a transcription is completed, with the following payload, including the transcription results:

    "reportFileUrl": "OMITTED FOR THIS ISSUE",
    "statusMessage": "None.",
    "lastActionDateTime": "2020-04-17T11:09:02Z",
    "status": "Succeeded",
    "id": "OMITTED FOR THIS ISSUE",
    "createdDateTime": "2020-04-17T11:07:17Z",
    "locale": "en-US",
    "name": "OMITTED FOR THIS ISSUE",
    "description": "OMITTED FOR THIS ISSUE",
    "properties": {
      "ProfanityFilterMode": "Masked",
      "PunctuationMode": "DictatedAndAutomatic",
      "AddDiarization": "True",
      "AddWordLevelTimestamps": "True",
      "AddSentiment": "True",
      "CustomPronunciation": "False",
      "Duration": "00:03:24"
  3. However, when we retrieved the transcription results, the sentiment was null:
    "SegmentResults": [
        {
          "RecognitionStatus": "Success",
          "ChannelNumber": "0",
          "SpeakerId": null,
          "Offset": 560600000,
          "Duration": 3900000,
          "OffsetInSeconds": 56.06,
          "DurationInSeconds": 0.39,
          "NBest": [
            {
              "Confidence": 0.3091757,
              "Lexical": "yeah",
              "ITN": "yeah",
              "MaskedITN": "yeah",
              "Display": "Yeah.",
              "Sentiment": null,
              "Words": [
                {
                  "Word": "yeah",
                  "Offset": 561100000,
                  "Duration": 3200000,
                  "OffsetInSeconds": 56.11,
                  "DurationInSeconds": 0.32,
                  "Confidence": 0.781588
                }
              ]
            }
          ]
        },… More items omitted for simplicity.
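To show how a check for this could look, here is a rough sketch that walks SegmentResults > NBest > Sentiment and reports every entry where the value is null or missing. It assumes the result has already been parsed into a dict that directly contains "SegmentResults", as in the excerpt above; the function name `find_missing_sentiment` and the trimmed sample are hypothetical.

```python
def find_missing_sentiment(result: dict) -> list:
    """Return (segment_index, nbest_index) pairs whose Sentiment is null or absent."""
    missing = []
    # "or []" tolerates a key that is present but set to null.
    for seg_index, segment in enumerate(result.get("SegmentResults") or []):
        for nbest_index, candidate in enumerate(segment.get("NBest") or []):
            if candidate.get("Sentiment") is None:
                missing.append((seg_index, nbest_index))
    return missing


if __name__ == "__main__":
    # Trimmed copy of the segment shown in this issue.
    sample = {
        "SegmentResults": [
            {
                "RecognitionStatus": "Success",
                "NBest": [{"Display": "Yeah.", "Sentiment": None}],
            }
        ]
    }
    print(find_missing_sentiment(sample))  # prints [(0, 0)]
```

An empty list means every NBest entry carried a Sentiment object.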

Expected behavior

Sentiment should be returned as part of the batch speech-to-text transcription results when it is requested, according to the following doc: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription#sentiment-analysis

Version of the Cognitive Services Speech SDK

Unfortunately, we are using version 2.1 because there is no documentation for a newer version. We were originally using the version named in the document above (2.0), until we submitted a GitHub issue and were told that we needed to use 2.1 instead.

Platform, Operating System, and Programming Language

Using an Azure Function to call the batch speech-to-text service.

Additional context

N/A

zhouwangzw commented 4 years ago

Could you please let us know which region you are using? I tried WestUS and EastUS2, and the sentiment results are available in the transcription results, e.g.

               "Sentiment": {
                "Negative": 0.0,
                "Neutral": 0.921228,
                "Positive": 0.0
              },

For investigation, please also send us the "id" in the GET response message, like

  "status": "Succeeded",
  "id": "fea3f2d0-7fdf-4e20-b500-6b7d45c170a9",
  "createdDateTime": "2020-04-17T19:45:54Z",

Thanks, Zhou

BadFame commented 4 years ago

That is interesting. I have a little over 300 transcriptions, and all of them are returning null. I am using eastus2.

 "status": "Succeeded",
    "id": "be679a95-f4d4-4037-895d-47ded1ebd2f7",
    "createdDateTime": "2020-04-17T11:29:47Z",
"properties": {
    "ProfanityFilterMode": "Masked",
    "PunctuationMode": "DictatedAndAutomatic",
    "AddDiarization": "True",
    "AddWordLevelTimestamps": "True",
    "AddSentiment": "True",
    "CustomPronunciation": "False",
    "Duration": "00:00:26"
  }
"NBest": [
            {
              "Confidence": 0.638227,
              "Lexical": "i hate the computer at this time he tasted worse",
              "ITN": "i hate the computer at this time he tasted worse",
              "MaskedITN": "i hate the computer at this time he tasted worse",
              "Display": "I hate the computer at this time, he tasted worse.",
              "Sentiment": null,
              "Words": [
                {
                  "Word": "i",
                  "Offset": 46900000,
                  "Duration": 1400000,
                  "OffsetInSeconds": 4.69,
                  "DurationInSeconds": 0.14,
                  "Confidence": 0.884177
                },
                {
                  "Word": "hate",
                  "Offset": 48300000,
                  "Duration": 5400000,
                  "OffsetInSeconds": 4.83,
                  "DurationInSeconds": 0.54,
                  "Confidence": 0.946418
                }

BadFame commented 4 years ago

How strange... I submitted a new transcription request today:

"status": "Succeeded",
  "id": "28951f69-c867-4128-a496-33c5c08bc077",
  "createdDateTime": "2020-04-20T18:56:44Z"

And I got sentiment back:

"NBest": [
            {
              "Confidence": 0.8827704,
              "Lexical": "and uh",
              "ITN": "and uh",
              "MaskedITN": "and uh",
              "Display": "And, uh",
              "Sentiment": {
                "Negative": 0.316903,
                "Neutral": 0.683078,
                "Positive": 0.0
              },
              "Words": [
                {
                  "Word": "and",
                  "Offset": 594500000,
                  "Duration": 4200000,
                  "OffsetInSeconds": 59.45,
                  "DurationInSeconds": 0.42,
                  "Confidence": 0.980738
                },
                {
                  "Word": "uh",
                  "Offset": 598700000,
                  "Duration": 2800000,
                  "OffsetInSeconds": 59.87,
                  "DurationInSeconds": 0.28,
                  "Confidence": 0.949764
                }

Does this mean it was probably a fluke? Should I re-request the 300 transcriptions for that day to obtain their sentiment?
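If re-requesting turns out to be necessary, the request body could be rebuilt with the same properties as in the original report. This is only a sketch: the function name `build_transcription_request` and its parameters are hypothetical, it assumes the recordings are passed as a list of URLs, and it deliberately stops at building the JSON body because the endpoint and authentication are not shown in this thread.

```python
import json


def build_transcription_request(recording_urls, name, description, locale="en-US"):
    """Build a request body with the properties quoted in this issue.

    Hypothetical helper: it only assembles the payload and sends nothing.
    """
    return {
        "recordingsUrls": list(recording_urls),
        "models": [],
        "locale": locale,
        "name": name,
        "description": description,
        "properties": {
            "ProfanityFilterMode": "Masked",
            "PunctuationMode": "DictatedAndAutomatic",
            "AddDiarization": "True",
            "AddWordLevelTimestamps": "True",
            # The property this issue is about; values are strings, as in the report.
            "AddSentiment": "True",
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    body = build_transcription_request(
        ["https://example.invalid/recording.wav"], "call-name", "call-description"
    )
    print(json.dumps(body, indent=2))
```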

This is the first time something like this has happened in 5 months of running a production product against the speech-to-text batch API.

Unless I hear otherwise, I will re-enable the webhook and my automated product to continue transcribing. We will continue to monitor the behavior of the API and will report any anomaly.

All the best

zhouwangzw commented 4 years ago

Great to know that it works. We have been investigating but have not found a cause so far, since we cannot reproduce the issue. Please let us know if the problem occurs again. Thanks!