Closed sgodin closed 3 weeks ago
Hi @sgodin
I will take a look at this. I noticed this was missing in the last release.
In the other SDK, we send the CloseStream message on behalf of the user for convenience, and we introduce a small delay for any final messages to arrive on the client side. This usually works for 99% of customers.
For the 1% of customers this does not work for (latency, etc.), we recommend sending the message yourself using the send function and then waiting for an amount of time appropriate to your needs. In the meantime, this is how you can do it today, by calling this function: https://github.com/deepgram/deepgram-dotnet-sdk/blob/main/Deepgram/Clients/Listen/v1/WebSocket/Client.cs#L328
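As a rough sketch of that workaround (the helper names here are illustrative, not SDK API; `send` stands in for the client's send function linked above):

```csharp
using System;
using System.Text;
using System.Threading.Tasks;

// Sketch only: build the CloseStream control message yourself, hand it to
// the client's send function, then wait a grace period for final results.
static class ManualClose
{
    // The CloseStream control message as an ASCII JSON payload.
    public static byte[] BuildCloseStream() =>
        Encoding.ASCII.GetBytes("{\"type\": \"CloseStream\"}");

    // Send CloseStream, then wait a grace period sized to your latency needs.
    public static async Task CloseAndWait(Action<byte[]> send, TimeSpan grace)
    {
        send(BuildCloseStream());
        await Task.Delay(grace);
    }
}
```

A caller would invoke this as, e.g., `await ManualClose.CloseAndWait(client.SendMessage, TimeSpan.FromSeconds(2));` — the grace period being the "appropriate amount of time" mentioned above.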
Thanks for the quick response David!
I don't see the point in waiting some amount of time and only catching 99% of cases, when all that is needed is to queue the request behind the audio waiting to go out. :) Note: I have already implemented CloseStream myself, so I'm in no rush - it was just a suggestion. Here's what I did, for reference:
```csharp
// This message will send a shutdown command to the server instructing it to finish processing any cached data,
// send the response to the client, send a summary metadata object, and then terminate the WebSocket connection.
// activeRecognition.SpeechClient.SendClose(); // Sends immediately - we want it queued, so implementing manually below
byte[] data = Encoding.ASCII.GetBytes("{\"type\": \"CloseStream\"}");
speechClient.SendMessage(data); // queued to ensure all audio gets there first
```
On a related note (I'm hoping you can shed some light on this): what is the purpose of the optional `nullByte` bool argument on the current SendClose method?
```csharp
/// <summary>
/// Sends a Close message to Deepgram
/// </summary>
public void SendClose(bool nullByte = false);
```
Thanks for your time, Scott
FYI - I see you send the CloseStream in the Stop API; however, since you are shutting down the websocket (cancelling it), the application cannot receive any final transcription responses. Remember, the whole problem I was trying to solve was knowing when all the results were in and it was OK to close the websocket... without some sort of odd delay logic. :) Cheers :)
> I see you send the CloseStream in the Stop API - however, since you are shutting down the websocket (cancelling it), then the application cannot receive any final transcription responses.
This is the part that needs to be fixed. There needs to be a slight delay before cancelling/exiting.
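Rather than a fixed delay, one option is to await the close notification itself with a safety timeout. This is only a sketch; how the close event is wired up is an assumption about the SDK, not its actual API:

```csharp
using System;
using System.Threading.Tasks;

// Sketch: await the websocket-closed notification instead of sleeping.
static class CloseAwaiter
{
    // `closed` should be completed from the client's websocket-closed event.
    // Returns true if the close arrived before the timeout elapsed.
    public static async Task<bool> WaitForClose(Task closed, TimeSpan timeout)
    {
        return await Task.WhenAny(closed, Task.Delay(timeout)) == closed;
    }
}
```

A caller would create `var closed = new TaskCompletionSource<bool>();`, call `closed.TrySetResult(true)` in whatever close handler the client exposes, send CloseStream, and then `await CloseAwaiter.WaitForClose(closed.Task, TimeSpan.FromSeconds(10));` before cancelling the socket.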
So the options are:
> What is the purpose of the null byte optional bool argument on the current SendClose method?
The null byte is just another method for stopping or cancelling the websocket. It is a Deepgram-specific signal that says "I'm done" at the API level.
Merged. Will have a release after addressing another issue.
Proposed changes
Add an option to the current SendClose() API so the request can be queued behind any audio packets still waiting to be sent.
Context
We are doing call transcriptions using the ListenWebSocketClient. We need a reliable way to know when all transcriptions are complete so that we can perform post-transcription analysis (i.e., Deepgram TextAnalysis) on the transcribed text. Initially I was using the Finalize request and waiting for results where FromFinalize=True; however, the documentation states there are cases where FromFinalize=True may not be returned (https://developers.deepgram.com/docs/finalize). After speaking with a Deepgram integration engineer, they recommended instead using the CloseStream request, where all pending audio is processed and final results are sent, and then we wait for the websocket to be closed to know transcription is complete.
I have done this and it seems to be working well. However, since the SendClose API uses SendMessageImmediately, I have seen cases where I don't get transcription text for the last few seconds of audio. On the theory that there was still unsent audio queued up in the client, I implemented a new version of SendClose myself using SendMessage instead, to queue it behind all of the audio, and this appeared to fix the issue.
Possible Implementation
Since one might want the SendMessageImmediately behaviour, while in my case I want the queued SendMessage behaviour, it might be best to add an option to the SendClose API to send immediately or send queued. It probably also makes sense to add this option to the SendFinalize API.
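To make the ordering effect concrete, here is a toy model of what such an option might look like. This is not the SDK's real internals: SendMessage and SendMessageImmediately are modelled as a simple list, and the `immediate` parameter name is illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Toy model: the outbound "queue" is a list, so the effect of sending
// CloseStream queued vs. immediately is visible in the final ordering.
class SketchClient
{
    private readonly List<string> _outbound = new List<string>();

    public void SendMessage(byte[] data) =>
        _outbound.Add(Encoding.ASCII.GetString(data));       // queued behind pending audio

    public void SendMessageImmediately(byte[] data) =>
        _outbound.Insert(0, Encoding.ASCII.GetString(data)); // jumps ahead of the queue

    // Proposed shape: keep today's default, add an opt-in queued mode.
    public void SendClose(bool nullByte = false, bool immediate = true)
    {
        byte[] data = nullByte
            ? new byte[] { 0 }
            : Encoding.ASCII.GetBytes("{\"type\": \"CloseStream\"}");

        if (immediate)
            SendMessageImmediately(data);
        else
            SendMessage(data);
    }

    public IReadOnlyList<string> Outbound => _outbound;
}
```

With `SendClose(immediate: false)`, CloseStream lands after any audio already queued, which is the behaviour the workaround above reproduces by hand.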
Somewhat related question
What is the purpose of the optional `nullByte` bool argument on the current SendClose method?