Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Speech marks duration possible issue when I use break between two sentences #2318

Open coder-kl opened 3 months ago

coder-kl commented 3 months ago

I am experiencing an interesting issue.

My text-highlighting feature, which uses speech marks built from word boundary events, works perfectly for one or more sentences. However, when I add a <break> element between two sentences to lengthen the pause, the same logic still works for the first sentence, but the audio and the text highlighting fall out of sync after that.

Has anyone encountered this issue?

This tag generates correct speech marks for text highlighting:

<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US"><voice name="hi-IN-SwaraNeural"><prosody rate='-30%'>My name is Ramesh. My name is Ramesh.</prosody><mstts:silence type="Tailing" value="0"/></voice></speak>

This tag produces correct speech marks for only the first sentence; the speech marks seem off for the second sentence:

<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US"><voice name="hi-IN-SwaraNeural"><prosody rate='-30%'>My name is Ramesh.<break time='1s'/>My name is Ramesh.<break time='1s'/></prosody><mstts:silence type="Tailing" value="0"/></voice></speak>
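For anyone trying to reproduce this, one way to see where the timings drift is to convert the raw word boundary data into millisecond speech marks and print the silence between consecutive marks. The following is only a minimal sketch, not the code from the original report: it assumes each boundary has been reduced to a (text, audio offset in 100 ns ticks, duration in ms) tuple, and the `SpeechMark` class, the helper names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: the audio offset is reported in ticks of 100 ns, so 10,000 ticks = 1 ms.
TICKS_PER_MS = 10_000


@dataclass
class SpeechMark:
    text: str
    start_ms: float
    end_ms: float


def to_speech_marks(boundaries):
    """Convert (text, audio_offset_ticks, duration_ms) records to speech marks."""
    marks = []
    for text, offset_ticks, duration_ms in boundaries:
        start = offset_ticks / TICKS_PER_MS
        marks.append(SpeechMark(text, start, start + duration_ms))
    return marks


def gaps_between(marks):
    """Return the silence in ms between the end of each mark and the start of the next."""
    return [nxt.start_ms - cur.end_ms for cur, nxt in zip(marks, marks[1:])]


# Hypothetical sample values, not real SDK output.
sample = [
    ("Ramesh", 12_000_000, 500),  # starts at 1200 ms, lasts 500 ms
    ("My", 27_500_000, 200),      # starts at 2750 ms, after the pause
]
marks = to_speech_marks(sample)
print(gaps_between(marks))
```

If the gap reported across the sentence boundary is much smaller than the requested break, the offsets after the break are probably not accounting for the inserted silence.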

Any feedback would be helpful.

yulin-li commented 3 months ago

@Kerry-LinZhang could you help to triage?

Kerry-LinZhang commented 3 months ago

Hi @coder-kl, thanks for the feedback; it has been received. I will track it and keep you updated on the progress.

Kerry-LinZhang commented 3 months ago

This is under investigation.

Kerry-LinZhang commented 2 months ago

Investigation ongoing

github-actions[bot] commented 1 month ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.

coder-kl commented 1 month ago

@Kerry-LinZhang - Another behavior I have noticed recently with the latest SDK is that the speech marks are sometimes not generated at all, and sometimes only a partial list is returned. However, if we run the same function multiple times, it eventually returns all speech marks correctly. Are you aware of such an issue? The behavior is random, so I cannot pinpoint the exact cause. I am just sharing it in case anyone has noticed a similar issue.
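A partial list can be detected before the marks are used for highlighting by comparing the words in the input text with the words that actually arrived. This is a rough sketch under stated assumptions: `missing_marks` is a hypothetical helper, the mark texts are assumed to match the spoken words, and punctuation-only marks are ignored.

```python
import re
from collections import Counter


def missing_marks(plain_text, mark_texts):
    """Return a Counter of words in plain_text that have no matching speech mark."""
    expected = Counter(re.findall(r"\w+", plain_text.lower()))
    received = Counter()
    for text in mark_texts:
        # Punctuation-only marks such as "." contribute no words.
        received.update(re.findall(r"\w+", text.lower()))
    # Counter subtraction keeps only words that are still missing.
    return expected - received


text = "My name is Ramesh. My name is Ramesh."
partial = ["My", "name", "is", "Ramesh", ".", "My"]  # hypothetical truncated result
print(dict(missing_marks(text, partial)))
```

An empty result suggests the list is complete; a non-empty one could be used to decide whether to retry synthesis, which matches the workaround of rerunning the function until all marks come back.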

Kerry-LinZhang commented 1 month ago

Hi @coder-kl, thanks for your feedback. We are investigating it, and I will continue tracking this feedback.

coder-kl commented 1 month ago

Thank you @Kerry-LinZhang

coder-kl commented 1 month ago

I accidentally clicked on the wrong button.

github-actions[bot] commented 4 weeks ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.

Kerry-LinZhang commented 3 weeks ago

Assigning @yanchang-gyc to continue following up on this feedback.