-
Hi, first of all, thank you very much for your work on such a great library. One thing I need is the ability to pause detection. Is that possible? Based on this scenario, I change …
-
The logs of the Multimodal agent with OpenAI Realtime API show correct English text on the console but on the UI, it sometimes shows the audio transcription in other languages like Hindi, Chinese, Rus…
-
Code:
```py
def align(self, video_path: Path, transcript_path: Path):
    subtitle_path = video_path.with_suffix(".srt")
    command = [
        "align",
        f"\"{video_path}\"",
        f"\"{transcript_path}…
```
-
It appears that Stephanie isn't able to reliably detect audio after an extended period of time, and intermittently wakes up again, beeps for a voice prompt, and then doesn't detect audio properly. `al…
-
### Feature Description
I'd love to see how the AI SDK can handle text-to-speech from OpenAI. According to the documentation, TTS can be streamed.
https://platform.openai.com/docs/guides/text-to-speech/strea…
-
Could wire up speech recognition on the audio chunks to:
- auto-name the clips when importing
- show the text on the audio block in the timeline
Bonus: add keyword / topic extraction.
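A minimal sketch of the auto-naming and keyword ideas above, assuming some speech recognizer has already produced a transcript string for each clip. The names `suggest_clip_name`, `extract_keywords`, and `STOPWORDS` are hypothetical and not part of any existing API, and the keyword step is a naive word-frequency count rather than real topic extraction.

```python
import re
from collections import Counter
from typing import List

# Hypothetical, deliberately tiny stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "on", "is", "it"}


def _tokens(transcript: str) -> List[str]:
    """Lowercase the transcript and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", transcript.lower())


def suggest_clip_name(transcript: str, max_words: int = 5,
                      fallback: str = "untitled-clip") -> str:
    """Build a clip name from the first few recognized words."""
    words = _tokens(transcript)
    if not words:
        # Nothing recognized (silence, music, etc.): keep a neutral name.
        return fallback
    return "-".join(words[:max_words])


def extract_keywords(transcript: str, top_n: int = 3) -> List[str]:
    """Return the most frequent non-stopword tokens."""
    counts = Counter(w for w in _tokens(transcript) if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]
```

For example, `suggest_clip_name(text)` could run once per clip at import time, with the full transcript kept alongside the clip so the timeline block can display it.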
-
Dear author, thank you for your amazing work. I have managed to run the code, but I don't see any examples or audio demos. Could you update the README to show processing of real audio files?
-
Very nice library, thank you so much for all the efforts!
For my project, I would need a confidence score not only on the final result of, e.g., a complete sentence, but also on a per-word basis. Currentl…
-
**IN ORDER TO ASSIST YOU, PLEASE PROVIDE THE FOLLOWING:**
- Speech SDK log taken from a run that exhibits the reported issue.
[speech_log.log](https://github.com/Azure-Samples/cognitive-service…
-
Is there a way to bypass the wake word and start STT on request? I have a face recognition system built with Frigate/Double-Take/CompreFace/Home Assistant. I have connected with Node-RED event lis…