Closed BryanDollery closed 3 years ago
Sorry -- this wasn't a bug -- it was my fault entirely. The problem was that, in trying to understand the SDK, I had stringified the first parameter of the `recognized()` event handler, documented simply as `s`. It turns out that this is the recognizer itself, and stringifying it causes a silent error. As this was the first line of the event handler, it was simply failing silently. I have fixed my code and I'm moving on.
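For anyone hitting the same symptom, the failure mode described above can be sketched roughly as follows. This is only an illustration under assumptions: it supposes the silent error was an exception thrown by `JSON.stringify` (here provoked with a circular reference) and swallowed before it reached the console. `fakeRecognizer` and `fireSilently` are hypothetical stand-ins, not SDK names; only the `e.result.text` shape mirrors the SDK's event argument.

```javascript
// Hypothetical stand-in for the recognizer: an object that references itself.
const fakeRecognizer = { name: "recognizer" };
fakeRecognizer.self = fakeRecognizer;

// Hypothetical dispatcher that swallows handler exceptions (an assumption
// about how the error ended up being silent).
function fireSilently(handler, sender, event) {
  try {
    handler(sender, event);
  } catch (_) {
    // swallowed: nothing reaches the console
  }
}

const seen = [];

// Mirrors the mistake: stringifying the sender on the first line.
function brokenRecognized(s, e) {
  console.log("TTT " + JSON.stringify(s)); // throws TypeError (circular structure)
  seen.push(e.result.text);                // never reached
}

// Safer: log only the result text, and surface any failure explicitly.
function safeRecognized(s, e) {
  try {
    seen.push(e.result.text);
  } catch (err) {
    console.error("recognized handler failed:", err);
  }
}

const event = { result: { text: "hello" } };
fireSilently(brokenRecognized, fakeRecognizer, event);
fireSilently(safeRecognized, fakeRecognizer, event);
console.log(seen); // holds only the text recorded by the safe handler
```

Wrapping the handler body in its own `try`/`catch` is a cheap way to make this kind of failure visible while debugging.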
I'm writing yet another speech transcription/translation app in JavaScript in a browser. I have imported "microsoft-cognitiveservices-speech-sdk" (v1.14.1) to achieve this and correctly configured the Speech API in my Azure Resource Group, along with the text translation cognitive services. So far, so good. I have translation working like a dream. Transcription (speech-to-text, or STT) has been more difficult, but I'm finally getting there.
However, I have a problem with it. I am using the continuous recognition method directly from the default microphone, initialising the process with the `startContinuousRecognitionAsync()` method. Whilst the `recognizing` method is invoked during the 'session', I have found that the `recognized` event doesn't fire for me.
The logs are filled with unique search strings to help me search for the code that generated them, e.g. 'TTT'.
I won't bore you with the code needed to invoke this object -- it sits in a React app with all sorts of complexities. It basically calls `speech.register()` once, then when the user wants to start STT it calls `speech.rec()`, followed sometime later by `speech.stop()`. I expect `recognizing()` to be invoked during the session and `recognized()` (the `done()` method in my object) to be called when the operation has completed because `speech.recognizer.stopContinuousRecognitionAsync()` has been invoked. I have also registered an error handler; that's not being invoked either, but that is not important for this conversation. I have paused my code in the Chrome debugger and confirmed that my handlers are properly registered with the recognizer.
This sort-of works. I see the results of the `recognizing` method (and they're quite impressive -- even transcribing Jabberwocky quite well). But I really need to hook into the final event when the recording is stopped and the full text becomes available. It would be a lot of hard work to use the output of the partial in-progress transcription event (`recognizing()`). I expect the `recognized()` event handler to be invoked with the full contents of the transcription from start to finish, but it isn't being invoked at all.
Any help would be great. I have a deadline for completion of the project in a fortnight and this is the final part to get working. The translation works great and the transcription seems really promising. Thanks, Bryan.
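A minimal sketch of the register/rec/stop wiring described above, for readers following along. It is not the original code: `createTranscriber` and `onFinal` are hypothetical names, and the SDK is passed in as the `sdk` parameter so the sketch stays self-contained. It also assumes, as far as I understand the SDK, that `recognized` fires once per finalised utterance rather than once with the whole session's text, so the sketch accumulates the pieces itself. Whether the last utterance always arrives before the stop callback is worth verifying in your own setup.

```javascript
// Hypothetical helper: wires up continuous recognition and collects final text.
function createTranscriber(sdk, key, region, onFinal) {
  const speechConfig = sdk.SpeechConfig.fromSubscription(key, region);
  const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
  const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);
  const finals = [];

  // Partial, in-progress hypotheses.
  recognizer.recognizing = (s, e) => {
    console.log("partial:", e.result.text); // log the text, not the sender `s`
  };

  // Finalised utterances (assumed to arrive one per phrase).
  recognizer.recognized = (s, e) => {
    if (e.result.reason === sdk.ResultReason.RecognizedSpeech) {
      finals.push(e.result.text);
    }
  };

  // Errors and cancellations.
  recognizer.canceled = (s, e) => {
    console.error("canceled:", e.errorDetails);
  };

  return {
    recognizer,
    start: () =>
      recognizer.startContinuousRecognitionAsync(
        undefined,
        (err) => console.error("start failed:", err)
      ),
    stop: () =>
      recognizer.stopContinuousRecognitionAsync(
        () => onFinal(finals.join(" ")), // hand back the accumulated transcript
        (err) => console.error("stop failed:", err)
      ),
  };
}
```

In an app, `sdk` would be the imported "microsoft-cognitiveservices-speech-sdk" module, and `onFinal` would receive the joined transcript once `stop()` completes.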