Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

How to use custom speech model with real-time universal speech translation API v2 #2670

Open martinflorek opened 3 days ago

martinflorek commented 3 days ago

Is your feature request related to a problem? Please describe. The custom speech documentation says: "With custom speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for real-time speech to text, speech translation, and batch transcription." The emphasis on "speech translation" is mine.

However, I cannot find any example of how to do this, ideally with the new universal speech API v2 endpoint wss://<region>.stt.speech.microsoft.com/speech/universal/v2.

Describe the solution you'd like I would like an example of using a custom speech model for real-time speech translation with the new universal speech API v2. Ideally, Speech Studio (https://speech.microsoft.com/portal) would also offer a quick testing UI for this.

Describe alternatives you've considered I have tried setting EndpointID/CustomSpeechDeploymentId in the JavaScript SDK to my deployed custom model's Endpoint ID. This correctly set the cid query parameter on the URL, but the transcription results did not change. The downloaded model logs are also empty, even though I enabled logging when I deployed the model. So my custom model does not appear to be used at all.
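
To make the cid observation concrete, here is a rough sketch of the URL shape involved, assuming the Endpoint ID ends up as a plain cid query parameter on the universal v2 endpoint. The helper name buildUniversalV2Url, the region, and the all-zero ID are placeholders for illustration only. This is not Speech SDK code, and the URL the SDK really builds may carry additional query parameters.

```typescript
// Hypothetical helper (not part of the Speech SDK): reconstructs the general
// shape of the WebSocket URL so the `cid` parameter is easy to inspect.
function buildUniversalV2Url(region: string, endpointId: string): string {
  const base = `wss://${region}.stt.speech.microsoft.com/speech/universal/v2`;
  // The custom model's Endpoint ID is passed as the `cid` query parameter.
  return `${base}?cid=${encodeURIComponent(endpointId)}`;
}

// Placeholder region and Endpoint ID, for illustration only.
const exampleUrl: string = buildUniversalV2Url(
  "westeurope",
  "00000000-0000-0000-0000-000000000000",
);
```

Comparing a URL of this shape with the one visible in the browser's network tab is a quick way to confirm that the Endpoint ID reaches the service, even though it does not show whether the service applies the model.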

Am I doing something wrong, or is it not possible to use custom speech models with speech translation, or specifically with speech translation in the new universal speech API v2? Thank you.