Closed Snowdar closed 6 months ago
NoPause synthesiser: https://nopause.io/. Python SDK: https://github.com/NoPause-io/nopause-python. We have already submitted a PR: https://github.com/vocodedev/vocode-python/pull/361.
In previous Text-To-Speech (TTS) modes, such as when feeding inputs to the synthesiser from ChatGPT, the system typically has to wait for a full sentence to ensure accurate synthesis. In dual-stream mode, however, ChatGPT's token output can be fed into the TTS system immediately (both character-level and sentence-level inputs are supported), while streaming synthesis output is returned simultaneously for real-time playback. This approach maintains quality while cutting the latency incurred by waiting for complete sentences.
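To make the idea concrete, here is a minimal, self-contained sketch of the dual-stream flow: tokens are pushed into the synthesiser as they arrive and audio chunks stream back per token, with no sentence buffering. All names here (`ToyDualStreamSynthesizer`, `dual_stream`, etc.) are hypothetical illustrations, not the actual NoPause SDK or vocode API.

```python
from typing import Iterator, List


class ToyDualStreamSynthesizer:
    """Hypothetical stand-in for a dual-stream TTS engine:
    it emits one 'audio chunk' per input token."""

    def send(self, token: str) -> List[str]:
        # A real engine would return PCM/audio bytes; we return
        # labelled strings so the streaming behaviour is visible.
        return [f"audio({token.strip()})"] if token.strip() else []

    def flush(self) -> List[str]:
        # Signal end of input; a real engine may emit trailing audio here.
        return ["audio(<eos>)"]


def llm_tokens() -> Iterator[str]:
    # Simulates a ChatGPT-style streaming token output.
    yield from ["Hel", "lo ", "wor", "ld", "."]


def dual_stream(tokens: Iterator[str],
                synth: ToyDualStreamSynthesizer) -> Iterator[str]:
    # Key point: audio is yielded as each token arrives,
    # instead of waiting for the complete sentence.
    for tok in tokens:
        yield from synth.send(tok)
    yield from synth.flush()


chunks = list(dual_stream(llm_tokens(), ToyDualStreamSynthesizer()))
print(chunks)
# → ['audio(Hel)', 'audio(lo)', 'audio(wor)', 'audio(ld)', 'audio(.)', 'audio(<eos>)']
```

In a real deployment the per-token audio chunks would be written to the playback device as they arrive, which is where the latency reduction comes from.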
On vocode, we have not introduced a new DualStreamConversation concept; instead, we implemented dual-stream support on top of the existing StreamConversation architecture and provided a usage example in the quickstart. We have not yet added any non-dual-stream code, but it is straightforward to implement, and we plan to keep updating it, including turn-based code.
Given the differences between the dual-stream and previous modes, it may cause compatibility issues within the framework, so we hope the vocode maintainers can review our contribution and provide feedback. Thanks.