Open wxbool opened 8 months ago
@wxbool can you please enable Speech SDK logging, do a single run, and share the log here? Thanks! https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-logging . The log will confirm whether word timing is included in the JSON recognition result message that the service sends over the WebSocket connection. If it is, you will need to access the raw JSON string and parse it yourself to get the word timing, because it does not look like we expose it on the result object. I'll try to get more info on how to do that.
@wxbool this is an example of how you would do it in Java. I'm trying to see if something similar can be done in Go. https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/java/jre/console/src/com/microsoft/cognitiveservices/speech/samples/console/SpeechRecognitionSamples.java#L122
@wxbool please do this to get the JSON string from the recognition result object: result.Properties.GetProperty(common.SpeechServiceResponseJSONResult, "")
Let me know if that worked for you and you see the word-level timing there.
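For reference, here is a minimal sketch of parsing that JSON string once you have it. It assumes the detailed result contains an NBest array whose entries carry a Words list with Word, Offset, and Duration fields, and that Offset and Duration are in 100-nanosecond ticks; please check this against the JSON you actually see in your log. The names WordTiming and parseWordTimings are hypothetical helpers for illustration, not part of the SDK, and the sample payload in main is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// detailedResult mirrors only the fields needed here. The field names are an
// assumption about the service's detailed JSON; verify them against your log.
type detailedResult struct {
	NBest []struct {
		Words []struct {
			Word     string `json:"Word"`
			Offset   int64  `json:"Offset"`   // assumed: 100-ns ticks
			Duration int64  `json:"Duration"` // assumed: 100-ns ticks
		} `json:"Words"`
	} `json:"NBest"`
}

// WordTiming is a hypothetical helper type, not an SDK type.
type WordTiming struct {
	Word     string
	Offset   time.Duration
	Duration time.Duration
}

// parseWordTimings extracts word timings from the first NBest entry.
// It returns nil without an error when the JSON carries no NBest data.
func parseWordTimings(raw string) ([]WordTiming, error) {
	var r detailedResult
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return nil, fmt.Errorf("parse recognition JSON: %w", err)
	}
	if len(r.NBest) == 0 {
		return nil, nil
	}
	words := r.NBest[0].Words
	out := make([]WordTiming, 0, len(words))
	for _, w := range words {
		out = append(out, WordTiming{
			Word:     w.Word,
			Offset:   time.Duration(w.Offset) * 100, // ticks to nanoseconds
			Duration: time.Duration(w.Duration) * 100,
		})
	}
	return out, nil
}

func main() {
	// Made-up payload in the assumed shape. In the recognizer callback you
	// would pass the string returned by
	// result.Properties.GetProperty(common.SpeechServiceResponseJSONResult, "").
	sample := `{"NBest":[{"Words":[
		{"Word":"hello","Offset":5000000,"Duration":4000000},
		{"Word":"world","Offset":9500000,"Duration":3000000}]}]}`

	timings, err := parseWordTimings(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range timings {
		fmt.Printf("%s starts at %v and lasts %v\n", t.Word, t.Offset, t.Duration)
	}
}
```

If the Words list turns out to be empty for Recognizing events, that would suggest the service only attaches word timing to final Recognized results, which the log should also show.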
I'm experimenting with real-time speech recognition using the Go SDK. I have tested the basic example, and I'm wondering how to receive word timestamp information for sentences recognized in real time. I found the config.RequestWordLevelTimestamps() option in the SDK and enabled it, but I don't receive word timestamps in the Recognizing / Recognized events, only the sentence-level recognition results.
The code is as follows: