Closed pankajkumar closed 4 months ago
I have tried to debug this a bit further. I think it's because of this code: https://github.com/vocodedev/vocode-core/blob/main/vocode/streaming/streaming_conversation.py#L494
maybe_synthesis_result is None if it's not the first token, hence only the first token gets added to the transcript.
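To make the suspected failure mode concrete, here is a minimal sketch of the pattern described above. It is not vocode's code: the `Transcript` class and both `record_tokens_*` functions are hypothetical, and the only name taken from the linked source is `maybe_synthesis_result`. The "patched" variant only shows what complete recording would look like and is not necessarily how the library fixed it.

```python
from typing import List, Optional


class Transcript:
    """Hypothetical stand-in for a conversation transcript (not vocode's class)."""

    def __init__(self) -> None:
        self.bot_messages: List[str] = []

    def add_bot_message(self, text: str) -> None:
        self.bot_messages.append(text)

    def extend_last_bot_message(self, text: str) -> None:
        self.bot_messages[-1] += text


def record_tokens_suspected(tokens: List[str]) -> Transcript:
    """Record a message only when a synthesis result is present."""
    transcript = Transcript()
    for i, token in enumerate(tokens):
        # Assumption: a synthesis result exists only for the first token.
        maybe_synthesis_result: Optional[str] = token if i == 0 else None
        if maybe_synthesis_result is not None:
            transcript.add_bot_message(token)
        # Later tokens fall through here and are never recorded.
    return transcript


def record_tokens_patched(tokens: List[str]) -> Transcript:
    """Illustrative alternative: append every later token to the open message."""
    transcript = Transcript()
    for i, token in enumerate(tokens):
        if i == 0:
            transcript.add_bot_message(token)
        else:
            transcript.extend_last_bot_message(token)
    return transcript


tokens = ["yes", ", tell me xxx", ", what you want to talk about"]
print(record_tokens_suspected(tokens).bot_messages)
print(record_tokens_patched(tokens).bot_messages)
```

With the tokens from the example conversation, the first variant keeps only the opening token, which matches the truncated "BOT: yes" turn in the report.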
thanks for the report! is this on latest master or on vocode 0.1.113?
I am using vocode 0.1.113
Found the root cause. This issue seems to be fixed in the following commit: https://github.com/vocodedev/vocode-core/commit/60d21879d76e21a85bd25f9e607f3322eb3d3c45#diff-b086df98bf635ee6a71718985b64311e895fb6a750e1b83f633b6c39baed34c9L958
Awesome! Yes - the ElevenLabs websocket synthesizer had some problems with interruptions / transcripts which recently got fixed in the output device refactor. We're looking to push this to 0.1.114 soon, but in the interim we have it in prerelease: https://pypi.org/project/vocode/0.1.114a0/
Brief Description
Hi Team,
I recently upgraded to the latest version of vocode, but I am facing an issue where bot transcripts are incomplete: only the first word of each sentence appears in the transcript. Human transcripts, on the other hand, seem to be fine.
I am using Deepgram as the transcriber, gpt-4o for the agent, and the ElevenLabs websocket synthesizer for TTS.
For example:
BOT: Hello
HUMAN: Hello sir I am xxx person from yyy company
BOT: Hello xxx, you are talking from yyy, Can I know -
HUMAN: Sir for the whole day hello hello
BOT: yes
HUMAN: Hello. Sir is there any requirements for cargo and courier services ?
BOT: Right now
HUMAN: Sorry?
BOT: Any
HUMAN: No sir, thank you.
BOT: okay
The actual conversation, obtained by transcribing the recording with Deepgram using the same model, is given below. Differences are highlighted in bold. I have replaced the actual names with xxx, yyy and zzz.
BOT: Hello
HUMAN: Hello sir I am xxx person from (yyy company)
BOT: Hello (xxx person), you are talking from (yyy company), Can I know -
HUMAN: Sir for the whole day hello hello
BOT: yes, tell me xxx, what you want to talk about
HUMAN: Hello. Sir is there any requirements for cargo and courier services ?
BOT: Right now (zzz person) is busy, do you want to leave a message
HUMAN: Sorry?
BOT: Any message you want to give to (zzz person)
HUMAN: No sir, thank you.
BOT: okay, thank you
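The comparison above can also be done programmatically. The following is a rough sketch, assuming both transcripts are available as aligned lists of (speaker, text) pairs. The `find_truncated_bot_turns` helper is hypothetical and not part of vocode or Deepgram. It flags bot turns where the logged text is a strict prefix of the reference text.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker, text)


def find_truncated_bot_turns(logged: List[Turn], reference: List[Turn]) -> List[int]:
    """Return indices of BOT turns whose logged text is a strict prefix of the reference."""
    truncated = []
    for index, ((speaker, logged_text), (_, full_text)) in enumerate(zip(logged, reference)):
        if speaker != "BOT":
            continue
        # A truncated turn starts the same way as the reference but ends early.
        if logged_text != full_text and full_text.startswith(logged_text):
            truncated.append(index)
    return truncated


# A few turns taken from the two transcripts in this report.
logged = [
    ("BOT", "Hello"),
    ("BOT", "yes"),
    ("HUMAN", "Sorry?"),
    ("BOT", "Any"),
    ("BOT", "okay"),
]
reference = [
    ("BOT", "Hello"),
    ("BOT", "yes, tell me xxx, what you want to talk about"),
    ("HUMAN", "Sorry?"),
    ("BOT", "Any message you want to give to (zzz person)"),
    ("BOT", "okay, thank you"),
]

print(find_truncated_bot_turns(logged, reference))  # [1, 3, 4]
```

A strict prefix check is deliberately simple. It would miss turns where the two transcripts differ in punctuation or in how names were anonymized, so treat it as a quick triage aid only.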
LLM
ChatGPT
Transcription Services
Deepgram
I used the hi-IN language setting with Deepgram's nova-2 general model.
Synthesis Services
Eleven Labs
Telephony Services
None
Conversation Type and Platform
No response
Steps to Reproduce
The issue occurs whenever the caller interrupts the conversation: the bot transcript for that turn is incomplete.
Expected Behavior
The bot transcript should be complete.
Screenshots
c4922829-9461-4ebd-bc77-a9a80d92622b_r.zip
Please find the attached conversation recording for your reference.