Open nisalr opened 4 weeks ago
Sometimes Silero detects that the user is speaking when Deepgram doesn't (and vice versa). Is there a way to just use Deepgram endpointing so that these inconsistencies don't occur?
Looks like we need to push `UserStartedSpeaking` and `UserStoppedSpeakingFrame` from the `stt` layer and disable the `vad` in the transport layer.
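A rough sketch of that idea, in case it helps. It does not use the real framework or Deepgram client: the two frame classes are local stand-ins, and `SttSpeakingBridge`, the `push_frame` callback, and the event dictionaries (`"speech_started"`, `"transcript"`, `"speech_final"`) are all hypothetical names made up for illustration. The point is only that one component (the STT side) owns the speaking state, so start and stop frames always come in matched pairs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UserStartedSpeakingFrame:
    """Local stand-in for the framework's 'user started speaking' frame."""


@dataclass
class UserStoppedSpeakingFrame:
    """Local stand-in for the framework's 'user stopped speaking' frame."""


class SttSpeakingBridge:
    """Hypothetical helper: turns STT endpointing events into speaking frames.

    With VAD disabled in the transport, this would be the only source of
    start/stop frames, so the two detectors can no longer disagree.
    """

    def __init__(self, push_frame: Callable[[object], None]) -> None:
        self._push_frame = push_frame  # assumed callback into the pipeline
        self._speaking = False

    def on_stt_event(self, event: dict) -> None:
        kind = event.get("type")
        if kind == "speech_started":
            self._start()
        elif kind == "transcript":
            # A transcript with text implies speech, even if no explicit
            # start event was seen first.
            if event.get("text"):
                self._start()
            # The STT endpointer marks the end of the utterance.
            if event.get("speech_final"):
                self._stop()

    def _start(self) -> None:
        if not self._speaking:
            self._speaking = True
            self._push_frame(UserStartedSpeakingFrame())

    def _stop(self) -> None:
        if self._speaking:
            self._speaking = False
            self._push_frame(UserStoppedSpeakingFrame())


if __name__ == "__main__":
    frames = []
    bridge = SttSpeakingBridge(frames.append)
    bridge.on_stt_event({"type": "speech_started"})
    bridge.on_stt_event({"type": "transcript", "text": "hello", "speech_final": True})
    print([type(f).__name__ for f in frames])
```

In a real pipeline the mapping from the STT provider's events to these two calls would depend on what the provider actually emits, so treat the event shapes here as placeholders.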