Minimum/Maximum values for InitialSilence and EndSilence timeouts for Speech SDK for Python

Azure-Samples / cognitive-services-speech-sdk

Minimum/Maximum values for InitialSilence and EndSilence timeouts for Speech SDK for Python #756

Closed GeisaFaustino closed 4 years ago

GeisaFaustino commented 4 years ago

Hello, I am using Speech SDK for Python version 1.13.0 and I am getting the error: No speech could be recognized: NoMatchDetails(reason=NoMatchReason.InitialSilenceTimeout).

Is there a way to set the silence timeouts via the Python SDK API?

I only found this option for the C# and Java SDKs, as mentioned in https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/131 and https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/502, respectively.

nansravn commented 4 years ago

I'm facing the same issue here. Do we have a roadmap for supporting this feature in the Python SDK?

lisaweixu commented 4 years ago

You could set the two parameters via an endpoint for SpeechRecognizer, as below: https://**region**.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?format=simple&initialSilenceTimeoutMs=60000&endSilenceTimeoutMs=60000

Replace region with your own region, and the 60000 with your own value.
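For Python, the endpoint approach above might look roughly like the sketch below. It assumes the `azure-cognitiveservices-speech` package and passes the URL through the `endpoint` argument of `SpeechConfig`. The helper names `build_stt_endpoint` and `make_recognizer` are hypothetical, and the 60000 defaults are simply the values from the URL above, not recommendations.

```python
from urllib.parse import urlencode


def build_stt_endpoint(region, initial_silence_ms, end_silence_ms):
    """Build the conversation endpoint URL with both silence timeouts (in ms)."""
    query = urlencode({
        "format": "simple",
        "initialSilenceTimeoutMs": initial_silence_ms,
        "endSilenceTimeoutMs": end_silence_ms,
    })
    return (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )


def make_recognizer(subscription_key, region,
                    initial_silence_ms=60000, end_silence_ms=60000):
    """Hypothetical helper: create a SpeechRecognizer bound to the custom endpoint."""
    # Imported here so the URL helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=subscription_key,
        endpoint=build_stt_endpoint(region, initial_silence_ms, end_silence_ms),
    )
    # With no audio_config given, the recognizer uses the default microphone.
    return speechsdk.SpeechRecognizer(speech_config=speech_config)
```

Calling `make_recognizer("<your-key>", "westus").recognize_once()` would then run a single recognition against that endpoint.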

Please note that these two parameters change how the speech recognizer behaves a lot. It is OK to play with them for a few utterances. But to get the best values for general usage in real apps, you would need to design a general dataset and test them thoroughly. In fact, the default values are the ones our speech scientists picked after balancing all needed datasets.

hercule24 commented 4 years ago

Hi @lisaweixu, what should I do in this case? My endpoint on the portal is very different from the one you mentioned above: https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken

BriceChivu commented 2 years ago

Hello, I faced the same issue and I solved it simply by converting my audio sample rate from 48kHz to 16kHz. It worked well after that.

felixcarmona commented 7 months ago

You can use:

speech_config.set_property(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "600000")
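For context, here is a slightly fuller sketch of how that call might fit into a recognizer setup. It assumes the `azure-cognitiveservices-speech` package and also sets `SpeechServiceConnection_EndSilenceTimeoutMs` alongside the initial-silence property. The helper names `silence_timeout_settings` and `make_recognizer` are hypothetical, and the timeout values in the usage note are arbitrary placeholders rather than tested recommendations.

```python
def silence_timeout_settings(initial_silence_ms, end_silence_ms):
    """Map PropertyId member names to values; set_property expects strings."""
    return {
        "SpeechServiceConnection_InitialSilenceTimeoutMs": str(initial_silence_ms),
        "SpeechServiceConnection_EndSilenceTimeoutMs": str(end_silence_ms),
    }


def make_recognizer(subscription_key, region, initial_silence_ms, end_silence_ms):
    """Hypothetical helper: build a recognizer with both silence timeouts set."""
    # Imported here so the settings helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=subscription_key, region=region)
    settings = silence_timeout_settings(initial_silence_ms, end_silence_ms)
    for name, value in settings.items():
        # Look up the enum member by name, e.g.
        # speechsdk.PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs
        speech_config.set_property(getattr(speechsdk.PropertyId, name), value)
    # With no audio_config given, the recognizer uses the default microphone.
    return speechsdk.SpeechRecognizer(speech_config=speech_config)
```

For example, `make_recognizer("<your-key>", "westus", 10000, 2000)` would ask for a 10-second initial-silence window and a 2-second end-silence window.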