Azure-Samples / Cognitive-Speech-STT-Windows

Windows SDK for the Microsoft Speech-to-Text API, part of Cognitive Services
https://www.microsoft.com/cognitive-services/en-us/speech-api

C# example does not work #25

Closed bryanjhogan closed 7 years ago

bryanjhogan commented 7 years ago

I'm getting errors while running the WPF application.


--- Start speech recognition using long wav file with LongDictation mode in en-US language ----

--- Error received by OnConversationErrorHandler() ---
Error code: LoginFailed
Error text: Transport error

--- Error received by OnConversationErrorHandler() ---
Error code: LoginFailed
Error text: Transport error

repeated....

It looks like the app is able to upload the wav file; the SendAudioHelper method completes its work.

Here is my config.

[screenshot: redacted deployment info]

And my redacted app.config

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5"/>
  </startup>
  <appSettings>
    <add key="luisAppID" value="75..." />
    <add key="luisSubscriptionID" value="50..." />
    <add key="ShortWaveFile" value="whatstheweatherlike.wav" />
    <add key="LongWaveFile" value="batman.wav" />
    <!--
     Enter an optional authentication Uri such as:
     https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken
    -->
    <add key="AuthenticationUri" value="" />
  </appSettings>
</configuration>
```

And in the app there are repeated errors -

[screenshot]

priyaravi20 commented 7 years ago

Can you try running the sample as is? I just tested it and was able to get a reco result for both whatstheweatherlike.wav and batman.wav.

bryanjhogan commented 7 years ago

Set app.config back to:

```xml
  <appSettings>
    <add key="luisAppID" value="yourLuisAppID" />
    <add key="luisSubscriptionID" value="yourLuisSubscriptionID" />
    <add key="ShortWaveFile" value="whatstheweatherlike.wav" />
    <add key="LongWaveFile" value="batman.wav" />
```

and removed the subscription key from the text box.

Same error -

[screenshot]

bryanjhogan commented 7 years ago

@priyaravi20 Any suggestions on this?

priyaravi20 commented 7 years ago

Hi Bryan - if you add your Bing Speech API key in the primary key field in app.config, you should see it in the window above. I verified that the endpoint works again.

bryanjhogan commented 7 years ago

Hi @priyaravi20.

From the images I pasted above, is the key you are referring to my subscription key, the one that starts with 50...?

You also suggest using the "primary key field in app.config", but there is none in the provided app.config. In the source code, the only reference to primaryKey is commented out: `/// string subscriptionKey = ConfigurationManager.AppSettings["primaryKey"];`

priyaravi20 commented 7 years ago

You can do one of two things:

1. Add the key directly in the UI and use Save Key so it will be remembered, or
2. Add a primaryKey entry to app.config as I mentioned and uncomment the line above, so initialization happens from there and not from isolated storage as in the Initialize method.

Add a breakpoint and make sure the subscription key is right when CreateDataRecoClient is called. The subscription key is the key you got when you signed up for the Bing Speech API in Azure, following the steps under "Subscribe to Speech API and Get a Free Trial Subscription Key" in https://docs.microsoft.com/en-us/azure/cognitive-services/speech/getstarted/getstartedcsharpdesktop.
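
For illustration, a minimal sketch of the second option, based on the commented-out primaryKey line quoted above (the key value is a placeholder):

```xml
<!-- app.config: add the Bing Speech subscription key to appSettings -->
<add key="primaryKey" value="yourBingSpeechSubscriptionKey" />
```

```csharp
// MainWindow.xaml.cs: uncomment this line so the key is read from app.config
// rather than from isolated storage:
string subscriptionKey = ConfigurationManager.AppSettings["primaryKey"];
```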

bryanjhogan commented 7 years ago

Thanks for the suggestions @priyaravi20.

I created a new Bing Speech API

[screenshot]

I put the key into the app

[screenshot]

And still get the same error.

priyaravi20 commented 7 years ago

Hi Bryan - All your steps seem right. Short of you sharing the key (which you can recreate later), I am not sure how else I can help you. I just got a new key and was able to do a successful reco with the sample as published.

bryanjhogan commented 7 years ago

Thanks for all your help @priyaravi20.

If that's the right key, then I have no idea what is going wrong.

Can I private message you the key? You can get my email address here - http://nodogmablog.bryanhogan.net/contact/

aramka commented 7 years ago

How was this resolved? I tried the application without making source changes and it didn't work. Should it work without any changes?

priyaravi20 commented 7 years ago

Yes. Please get the latest SDK, as we made changes to the auth endpoint last year.

aramka commented 7 years ago

I have the latest and I still get the issue. Should the application work from the start?

priyaravi20 commented 7 years ago

Yes. You need the latest sample and SDK, and an Azure subscription key obtained via the Getting Started page. You should then see reco results for whatstheweatherlike if you use short-form file-based reco in the sample provided.

bryanjhogan commented 7 years ago

@aramka I had made a mistake in the app.config, pointing it at an old endpoint.

Once I corrected that, I put in the access key from my Bing Speech API (it has to be a Bing Speech API key), and then it worked for me.

[screenshot]
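
For reference, a minimal sketch of the corrected app.config entry, assuming the westus authentication endpoint quoted elsewhere in this thread:

```xml
<!-- app.config: make sure AuthenticationUri points at the current STS endpoint -->
<add key="AuthenticationUri" value="https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken" />
```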

aramka commented 7 years ago

@bryanjhogan thanks, I will check that out.

I'm trying to call my Custom Speech Service endpoint using the speech recognition client from the Microsoft.ProjectOxford.SpeechRecognition package. Yes, I have an Azure subscription key and a deployed Custom Speech Service endpoint. Further, I have tested the endpoint via the https://cris.ai/Deployments portal and it works.

According to the article below, I should be able to use the SpeechRecognitionServiceFactory and any of the clients (DataRecognitionClient or MicrophoneRecognitionClient) with a Custom Speech Service endpoint.

Here is the article: https://docs.microsoft.com/en-us/azure/cognitive-services/custom-speech-service/customspeech-how-to-topics/cognitive-services-custom-speech-use-endpoint

In particular it says:

> The Client Speech SDK provides a factory class SpeechRecognitionServiceFactory, which offers the following methods:
>
> - CreateDataClient(...): A data recognition client.
> - CreateDataClientWithIntent(...): A data recognition client with intent.
> - CreateMicrophoneClient(...): A microphone recognition client.
> - CreateMicrophoneClientWithIntent(...): A microphone recognition client with intent.
>
> For detailed documentation, see the Bing Speech API. The Custom Speech Service endpoints support the same SDK.
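
For concreteness, a minimal sketch of such a factory call against a Custom Speech endpoint (the key and deployment URL are placeholders; the overload mirrors the one used later in this thread):

```csharp
// Sketch only: substitute your own key and the WebSocket endpoint shown
// on the cris.ai Deployments page for your deployment.
var client = SpeechRecognitionServiceFactory.CreateDataClient(
    SpeechRecognitionMode.ShortPhrase,
    "en-US",
    "<your subscription key>",   // primary key
    "<your subscription key>",   // secondary key
    "<your Custom Speech WebSocket endpoint>");
```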

Is this true? Can I use the client library Microsoft.ProjectOxford.SpeechRecognition with a Custom Speech Service endpoint?

Is the code available for the Microsoft.ProjectOxford.SpeechRecognition package? Github repo?

melvinma commented 7 years ago

@aramka I struggled with this error for a day and finally made it work. I am not sure why that is :). But since your setup is very similar to mine, I thought I would just let you know.

1. I went to the Custom Speech Recognition Service portal and tested the endpoint there. I am not sure whether that made the difference.

2. I also changed the implementation for long dictation to use my own custom service:

```csharp
this.dataClient = SpeechRecognitionServiceFactory.CreateDataClient(
    SpeechRecognitionMode.LongDictation,
    "en-us",
    this.SubscriptionKey,
    this.SubscriptionKey,
    "https://<my custom ARS id>.api.cris.ai/ws/cris/speech/recognize/continuous");
```

I got this code from the URL you mentioned (https://docs.microsoft.com/en-us/azure/cognitive-services/custom-speech-service/customspeech-how-to-topics/cognitive-services-custom-speech-use-endpoint).

3. app.config:

```xml
<add key="AuthenticationUri" value="https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken" />
```

aramka commented 7 years ago

Thanks everyone for the help.

I got this working. Here is my code (the subscription ID and endpoint URL come from the Deployments section of your Custom Speech portal on cris.ai):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Microsoft.CognitiveServices.SpeechRecognition; // client library from the Microsoft.ProjectOxford.SpeechRecognition package

class Program
{
    static void Main(string[] args)
    {
        DataRecognitionClient dataClient;
        // Both values come from the Deployments section of your Custom Speech portal.
        string yourSubscriptionId = "<your subscription ID from the Deployments section>";
        string endPointUrlShort = "<your WebSocket URL for ShortPhrase mode from the Deployments section>";
        string shortWavFileName = @"whatstheweatherlike.wav";

        dataClient = SpeechRecognitionServiceFactory.CreateDataClient(
            SpeechRecognitionMode.ShortPhrase,
            "en-US",
            yourSubscriptionId,
            yourSubscriptionId,
            endPointUrlShort);

        // Set the authorization Uri.
        dataClient.AuthenticationUri = "https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken";

        dataClient.OnResponseReceived += OnDataShortPhraseResponseReceivedHandler;
        dataClient.OnResponseReceived += OnDataDictationResponseReceivedHandler;

        dataClient.OnPartialResponseReceived += OnPartialResponseReceivedHandler;
        dataClient.OnConversationError += OnConversationErrorHandler;

        Program.SendAudioHelper(shortWavFileName, dataClient);

        // Recognition events arrive asynchronously; keep the console app alive until they do.
        Console.ReadLine();
    }

    static void SendAudioHelper(string wavFileName, DataRecognitionClient dataClient)
    {
        using (FileStream fileStream = new FileStream(wavFileName, FileMode.Open, FileAccess.Read))
        {
            // Note: for wave files, we can just send data from the file right to the server.
            // If your audio is not in wave format and instead you have just raw data (for
            // example audio coming over Bluetooth), then before sending up any audio data
            // you must first send a SpeechAudioFormat descriptor to describe the layout and
            // format of your raw audio data, via DataRecognitionClient's SendAudioFormat() method.
            int bytesRead = 0;
            byte[] buffer = new byte[1024];

            try
            {
                do
                {
                    // Get more audio data to send into the byte buffer.
                    bytesRead = fileStream.Read(buffer, 0, buffer.Length);

                    // Send audio data to the service.
                    dataClient.SendAudio(buffer, bytesRead);
                }
                while (bytesRead > 0);
            }
            finally
            {
                // We are done sending audio. Final recognition results will arrive in the OnResponseReceived event.
                dataClient.EndAudio();
            }
        }
    }

    private static void OnDataShortPhraseResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
    {
        WriteLine("OnDataShortPhraseResponseReceivedHandler");
        WriteSpeechResponseEvent(e);
    }

    private static void WriteSpeechResponseEvent(SpeechResponseEventArgs e)
    {
        WriteLine("e.RecognitionStatus:" + e.PhraseResponse.RecognitionStatus + ", e.PhraseResponse.Results.Length: " + e.PhraseResponse.Results.Length);
        for (int i = 0; i < e.PhraseResponse.Results.Length; i++)
        {
            WriteLine("e.PhraseResponse.Results[{0}].DisplayText: {1}", i, e.PhraseResponse.Results[i].DisplayText);
            WriteLine("e.PhraseResponse.Results[{0}].InverseTextNormalizationResult: {1}", i, e.PhraseResponse.Results[i].InverseTextNormalizationResult);
            WriteLine("e.PhraseResponse.Results[{0}].LexicalForm: {1}", i, e.PhraseResponse.Results[i].LexicalForm);
            WriteLine("e.PhraseResponse.Results[{0}].MaskedInverseTextNormalizationResult: {1}", i, e.PhraseResponse.Results[i].MaskedInverseTextNormalizationResult);
        }
    }

    private static void OnDataDictationResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
    {
        WriteLine("OnDataDictationResponseReceivedHandler");
        WriteSpeechResponseEvent(e);
    }

    private static void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)
    {
        WriteLine("OnPartialResponseReceivedHandler: e.PartialResult:" + e.PartialResult);
    }

    private static void OnConversationErrorHandler(object sender, SpeechErrorEventArgs e)
    {
        WriteLine("OnConversationErrorHandler: e.SpeechErrorCode:" + e.SpeechErrorCode + ", e.SpeechErrorText:" + e.SpeechErrorText);
    }

    private static void WriteLine(string format, params object[] args)
    {
        var formattedStr = string.Format(format, args);
        Trace.WriteLine(formattedStr);
        Console.WriteLine(formattedStr);
    }
}
```

cidrugHug8 commented 6 years ago

So, what's KEY 2? Where do I use KEY 2?

zhouwangzw commented 6 years ago

You can use either KEY 1 or KEY 2 as the subscription key, as described here:

> To use Speech Service, you must first subscribe to the Speech API that's part of Cognitive Services (previously Project Oxford). You can get free trial subscription keys from the Cognitive Services subscription page. After you select the Speech API, select Get API Key to get the key. It returns a primary and secondary key. Both keys are tied to the same quota, so you can use either key.
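
To sanity-check whichever key you use, here is a minimal sketch (not part of the sample) that exchanges the subscription key for an auth token at the STS endpoint quoted earlier in this thread; a success response means the key is valid:

```csharp
using System;
using System.Net.Http;

class TokenCheck
{
    static void Main()
    {
        using (var http = new HttpClient())
        {
            // Standard Cognitive Services header for the subscription key (KEY 1 or KEY 2 both work).
            http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<your key>");

            // Same AuthenticationUri as in app.config.
            var response = http.PostAsync(
                "https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken", null)
                .GetAwaiter().GetResult();

            Console.WriteLine(response.IsSuccessStatusCode
                ? "Key is valid; token issued."
                : "Token request failed: " + response.StatusCode);
        }
    }
}
```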

inahmartins commented 6 years ago

I did it with Bing Speech and it worked just fine! The others don't work!

bugproof commented 5 years ago

Waste of time:

--- Start speech recognition using short wav file with ShortPhrase mode in en-US language ----

--- Error received by OnConversationErrorHandler() ---
Error code: ConnectionFailed
Error text: Transport error

I pasted in the API key for the Bing Speech API.

zhouwangzw commented 5 years ago

We have released a new Speech SDK that connects to the new Speech Service, with more features and customization support. Please check here for details. The Bing Speech API will be deprecated in the future.
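
For anyone landing here, a minimal sketch of the new SDK (the Microsoft.CognitiveServices.Speech NuGet package; the key, region, and file name are placeholders):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class NewSdkExample
{
    static async Task Main()
    {
        // Subscription key and region from your Speech resource in the Azure portal.
        var config = SpeechConfig.FromSubscription("<your key>", "westus");

        // Recognize from a wav file instead of the default microphone.
        using (var audioInput = AudioConfig.FromWavFileInput("whatstheweatherlike.wav"))
        using (var recognizer = new SpeechRecognizer(config, audioInput))
        {
            var result = await recognizer.RecognizeOnceAsync();
            Console.WriteLine($"{result.Reason}: {result.Text}");
        }
    }
}
```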

bugproof commented 5 years ago

@zhouwangzw Thanks. I will check it out.

arcbus commented 5 years ago

I'm having this same problem now. Fresh download of the sample for the dotnetcore console, using region 'westus', and I have tried my subscription ID as well as Key1 and Key2. I've also tried explicitly setting the endpoint with `config.EndpointId = "https://westus.api.cognitive.microsoft.com/luis/v2.0";`

But nothing seems to help.

CANCELED: Reason=Error
CANCELED: ErrorCode=ConnectionFailure
CANCELED: ErrorDetails=Connection failed (no connection to the remote host). Internal error: 8. Error details: 998. Please check network connection, firewall setting, and the region name used to create speech factory.
CANCELED: Did you update the subscription info?

Not sure what to do now. Any ideas?

arcbus commented 5 years ago

Huh, just stumbled on this thread:

https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/138

> @enelguo Do you see the same problem when running your application on Windows 10, instead of Win7? The Speech SDK is tested on Windows 10. Win7 is not tested nor supported.

So apparently it might work on some Linux flavors, but not on a still-supported Windows flavor. Lovely...