Closed pierat closed 2 years ago
This seems to be a bug, because the code is very simple and follows the samples in the documentation exactly. Is this feature actually implemented in Java?
Hi @pierat, I can repro this bug and have forwarded it to the service team. I will let you know when I get any updates.
I am also experiencing this bug from .NET.
I just synced with the service team; the ETA for fixing this issue is the end of November. Thanks!
I am experiencing the same bug with the Node.js SDK. Will the fix also apply to the Node.js SDK, or do I need to open an issue in https://github.com/microsoft/cognitive-services-speech-sdk-js ?
I am experiencing the same issue with the C++ SDK, using Speech SDK 1.19.0 (2021-Nov release), which is the most recent as of this message. If it would be preferable for me to open a new issue, please let me know. But this appears to be service-wide rather than specific to a language SDK.
An update on my findings: If you enable viseme generation for a voice (I'm testing with US English) then the bookmark timings are correct. Without generated viseme timings, the bookmark timings are incorrect.
@yulin-li Is the service issue possibly still in effect? Can we assign this to someone in the service team?
@pankopon As far as I know, this is in the backlog; I will check the status. I know who the owner is, but I don't know his GitHub handle.
Assigning to the owner, @newhillchan.
I'm getting a similar error using JavaScript: the events are raised at the very beginning of the synthesis. The audio offsets are different but still wrong. Is there any update?
@yulin-li Hi, I'm back on this issue: is there a plan for fixing this bug? It would be really useful for me to know whether we need to continue with the manual workaround or whether we can plan development around this feature.
Thanks!
Hi @pierat, thanks for your patience. I just confirmed with the service engineers that the ETA for this issue is 5/31.
Hello @yulin-li, are there any updates on this feature? Thanks!
I tested this today with the exact code from https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/1245#issue-988451704 and Speech SDK 1.22.0, using the northeurope region. The resulting audio offsets match the generated audio (if it's written to a file and viewed, e.g., in Audacity):
Bookmark flower_1 reached. Audio offset: 737ms.
Bookmark flower_2 reached. Audio offset: 1250ms.
So the issue seems to be fixed by now. It will be closed soon if there are no other pending items.
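For anyone repeating the check against a saved audio file, here is a minimal sketch of the arithmetic for finding a bookmark position in the file. It is not taken from the SDK: the class and method names are hypothetical, and it assumes that offsets arrive in 100-nanosecond ticks, that the audio is 16 kHz, 16-bit mono PCM, and that the WAV file has a canonical 44-byte header. Adjust these values if your output format differs.

```java
// Hypothetical helper, not part of the Speech SDK.
class BookmarkOffsets {
    // Assumption: raw event offsets are expressed in 100-nanosecond ticks.
    static final long TICKS_PER_MS = 10_000L;

    // Convert ticks to milliseconds, rounding to the nearest millisecond.
    static long ticksToMillis(long ticks) {
        return (ticks + TICKS_PER_MS / 2) / TICKS_PER_MS;
    }

    // Convert a millisecond offset to a sample index at the given sample rate.
    static long millisToSampleIndex(long millis, int sampleRateHz) {
        return millis * sampleRateHz / 1000;
    }

    // Convert a sample index to a byte position in a PCM WAV file.
    static long sampleIndexToByteOffset(long sampleIndex, int bytesPerSample,
                                        int channels, int headerBytes) {
        return headerBytes + sampleIndex * bytesPerSample * channels;
    }

    public static void main(String[] args) {
        // Offsets reported in this thread, in milliseconds.
        long[] offsetsMs = {737L, 1250L};
        int sampleRateHz = 16_000; // assumption: 16 kHz output
        int bytesPerSample = 2;    // assumption: 16-bit samples
        int channels = 1;          // assumption: mono
        int headerBytes = 44;      // assumption: canonical WAV header

        for (long ms : offsetsMs) {
            long sample = millisToSampleIndex(ms, sampleRateHz);
            long bytePos = sampleIndexToByteOffset(sample, bytesPerSample, channels, headerBytes);
            System.out.printf("%d ms -> sample %d, byte offset %d%n", ms, sample, bytePos);
        }
    }
}
```

The sample index can be compared with the position shown in an audio editor such as Audacity when its time display is switched to samples.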
Closed as resolved, please open a new issue if further support is needed.
Hi,
In a Java application, I am trying to use bookmarks to evaluate audio offsets in a text-to-speech conversion, and even the sample code from the TTS documentation gives incorrect results.
Any idea whether the problem is in my code or whether some limitation applies?
Here is the code:
And here is the result:
Which is not the expected result.
My configuration for this test: Windows 10 with Java JDK 1.8.0_301.
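To rule out a markup problem on the client side before suspecting the service, it can help to confirm that the SSML is well-formed and that the bookmarks appear in the expected order. The following is only a sketch that uses the JDK's XML parser and does not call the Speech SDK. The bookmark names flower_1 and flower_2 are the ones mentioned in this thread; the class name, the sentence, and the voice name are placeholders chosen for illustration, not the code from the original post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the Speech SDK.
class BookmarkSsml {
    // Build an SSML document containing two bookmarks.
    // Assumption: the sentence and voice name are illustrative placeholders.
    static String buildSsml(String voiceName) {
        return "<speak version='1.0' xml:lang='en-US' "
                + "xmlns='http://www.w3.org/2001/10/synthesis'>"
                + "<voice name='" + voiceName + "'>"
                + "We are selling <bookmark mark='flower_1'/>roses "
                + "and <bookmark mark='flower_2'/>daisies."
                + "</voice></speak>";
    }

    // Parse the SSML and return the bookmark names in document order.
    // Throws if the markup is not well-formed XML.
    static List<String> bookmarkNames(String ssml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(ssml)));
        NodeList nodes = doc.getElementsByTagNameNS("*", "bookmark");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(((Element) nodes.item(i)).getAttribute("mark"));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String ssml = buildSsml("en-US-JennyNeural"); // placeholder voice name
        System.out.println(bookmarkNames(ssml));
    }
}
```

If the names come back in the expected order, the markup itself is unlikely to be the cause of wrong offsets.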