Closed. vizakgh closed this issue 2 weeks ago.
Hi @vizakgh
Thanks for the report. Will take a look.
I suspect that it's because you are setting the parameters like this (which is totally valid):
    new DeepgramWsClientOptions
    {
        ApiKey = _dgApiKey,
        KeepAlive = true
    }
If you were to do this, I suspect it would probably work (https://github.com/deepgram/deepgram-dotnet-sdk/blob/main/examples/speech-to-text/websocket/microphone/Program.cs#L27-L30):
DeepgramWsClientOptions options = new DeepgramWsClientOptions(null, null, true);
var liveClient = new ListenWebSocketClient(_dgApiKey, options);
I will need to add some code to handle the setting via the properties.
Hi.
Thanks. I already use the options constructor, and it works.
P.S. I see a consistent error: DG closes the socket without calling the ErrorResponse handler. It mostly happens in the Azure dev environment (a 2-CPU web app), especially if I send audio chunks every 200 ms instead of every 500 ms. On my local machine (16 CPUs) it works correctly. I'm still investigating the issue. The lack of an error reason in the log is the real problem :(
I send native browser "webm" audio.
Regards, Victor
You can try to enable more debugging (https://github.com/deepgram/deepgram-dotnet-sdk/blob/main/examples/speech-to-text/websocket/microphone/Program.cs#L20):
Deepgram.Library.Initialize(LogLevel.Debug); // LogLevel.Default, LogLevel.Debug, LogLevel.Verbose
I would recommend setting it to Verbose for max logging. If you copy and paste the output here, I can help debug. If the logging is sensitive, I would be glad to chat in Discord.
When the server closes the connection even though keep alive is enabled, it's usually because the encoding or sample rate values do not match what the audio stream actually is.
Hi.
I also tried WAV/PCM using an old JS encoder (I used it with Gladia). The result is the same. Today it doesn't work at all (yesterday it worked sometimes).
2024-10-23 17:57:36.507 [Debug] State: WebSocket State: Open
2024-10-23 17:57:36.507 [Verbose] ProcessSendQueue: Sending message...
2024-10-23 17:57:36.613 [Verbose] ProcessReceiveQueue: Received message: System.Net.WebSockets.WebSocketReceiveResult / System.IO.MemoryStream
2024-10-23 17:57:36.615 [Verbose] ListenWSClient.ProcessDataReceived: ENTER
2024-10-23 17:57:36.615 [Verbose] ProcessDataReceived: raw response: {"type":"Metadata","transaction_key":"deprecated","request_id":"deda472a-6bf0-4ea4-88c5-0d235032c6f4","sha256":"fc4ef741527af4eb3e4d0d7c81b8685111ac40c6fa806491a86c2b5ac137463f","created":"2024-10-23T14:57:23.420Z","duration":0.0,"channels":0}
2024-10-23 17:57:36.616 [Verbose] ProcessDataReceived: Type: Metadata
2024-10-23 17:57:36.635 [Debug] ProcessDataReceived: Invoking MetadataResponse. event: { "channels": 0, "created": "2024-10-23T14:57:23.42Z", "duration": 0, "request_id": "deda472a-6bf0-4ea4-88c5-0d235032c6f4", "sha256": "fc4ef741527af4eb3e4d0d7c81b8685111ac40c6fa806491a86c2b5ac137463f", "transaction_key": "deprecated", "type": "Metadata" }
2024-10-23 17:57:36.649 [Debug] ProcessDataReceived: Succeeded
2024-10-23 17:57:36.650 [Verbose] ListenWSClient.ProcessDataReceived: LEAVE
2024-10-23 17:57:36.651 [Information] ProcessReceiveQueue: Received WebSocket Close. Trigger cancel...
2024-10-23 17:57:36.652 [Verbose] ListenWSClient.Stop: ENTER
2024-10-23 17:57:36.652 [Information] Stop: Using default disconnect cancellation token
Initialization:
    var liveClient = new ListenWebSocketClient(_dgApiKey, new DeepgramWsClientOptions(_dgApiKey, null, true));
    var liveSchema = new LiveSchema
    {
        Model = "nova-2",
        Language = "en",
        Diarize = false,
        Dictation = false,
        Punctuate = false,
        SmartFormat = false
    };
P.S. I dumped both files (webm and wav) on the backend. The files are correct (I can open the webm file with Windows Media Player).
Regards, Victor
You need to set the encoding and sample_rate values in LiveSchema. Depending on the form of chunking that either the audio source or you might be doing, those parameters are required. Please see my previous message.
The connection is exiting within a second which is a clear indication of this failure.
It doesn't help. According to the docs, DG supports only Ogg Opus (not Opus in a webm container), right?
Ffmpeg report: [attachment]
Schema:
    var liveSchema = new LiveSchema
    {
        Model = "nova-2",
        Language = "en",
        Channels = 1,
        Encoding = "opus",
        SampleRate = 48000,
        Diarize = false,
        Dictation = false,
        Punctuate = false,
        SmartFormat = false
    };
That is correct, ogg opus. The source audio streaming needs to be one of the supported formats. Depending on the platform, you can change the audio output from the source to one of the supported formats.
But DG does support webm/opus. I have unit tests for this flow: Microsoft Speech SDK -> webm/opus -> Deepgram. This flow works. Speech SDK format: SpeechSynthesisOutputFormat.Webm24Khz16BitMonoOpus.
So, this works (encoded by the MS Speech SDK): [attachment]
This doesn't work, or works only from time to time (encoded by Edge/Chrome): [attachment]
I assume the LiveOptions are the same, correct? This would point to an issue with the audio encoding on Edge/Chrome or the options not matching.
We do have users in both those camps, so this should be a configuration problem matching the stream.
Since I am fairly certain this is a configuration issue, troubleshooting via Discord might go a lot faster than going back and forth on this issue here, if you want to do that.
Schema is the same:
    var liveSchema = new LiveSchema
    {
        Model = "nova-2",
        Language = "en",
        Channels = 1,
        Diarize = false,
        Dictation = false,
        Punctuate = false,
        SmartFormat = false
    };
As for sample code, I will contact you tomorrow in Discord.
It could be an audio chunking issue. In the unit test, the MS Speech SDK sends the whole audio stream very fast. In the real app, the browser sends audio in chunks (once per 500 ms) at human speaking speed. OK, let's continue in Discord tomorrow. Thanks.
Regards, Victor
For the original issue with:
    var liveClient = new ListenWebSocketClient(_dgApiKey, new DeepgramWsClientOptions
    {
        ApiKey = _dgApiKey,
        KeepAlive = true
    });
If _dgApiKey is set to "" or a real API key, initializing the ListenWebSocketClient works for me. What value are you using for _dgApiKey?
Hi. This code throws an exception. DG version:
This code throws the exception:
    // Constructor
    public DeepgramWsClientOptions(string? apiKey = null, string? baseAddress = null, bool? keepAlive = null, bool? onPrem = null, Dictionary<string, string>? addons = null, Dictionary<string, string>? headers = null)
    {
        ...
        ApiKey = apiKey ?? "";
        ...
        // user provided takes precedence
        if (string.IsNullOrWhiteSpace(ApiKey))
        {
            // then try the environment variable
            Log.Debug("DeepgramWsClientOptions", "API KEY is not set");
            ApiKey = Environment.GetEnvironmentVariable(variable: Defaults.DEEPGRAM_API_KEY) ?? "";
            if (!string.IsNullOrEmpty(ApiKey))
            {
                Log.Information("DeepgramWsClientOptions", "API KEY set from environment variable");
            } else {
                Log.Warning("DeepgramWsClientOptions", "API KEY environment variable not set");
            }
        }
        if (!OnPrem && string.IsNullOrEmpty(ApiKey))
        {
            var exStr = "Deepgram API Key is invalid";
            throw new ArgumentException(exStr); // <-- the exception is thrown here
        }
        ...
    }
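For anyone reading later, here is a minimal sketch of why the object-initializer form can hit that check. In C#, the constructor body runs to completion before the initializer assigns ApiKey, so the validation sees an empty key. SketchOptions and SketchDemo are hypothetical names, not SDK types, and the sketch leaves out the environment-variable fallback shown in the constructor above. With the real class, that fallback may be why the same code works on a machine where the variable is set.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for an options class that validates inside its constructor.
// It is NOT the SDK's DeepgramWsClientOptions; the env-var fallback is omitted.
public class SketchOptions
{
    public string ApiKey { get; set; }
    public bool KeepAlive { get; set; }

    public SketchOptions(string? apiKey = null, bool? keepAlive = null)
    {
        ApiKey = apiKey ?? "";
        KeepAlive = keepAlive ?? false;

        // This check runs before any object-initializer assignment happens.
        if (string.IsNullOrEmpty(ApiKey))
        {
            throw new ArgumentException("API key is invalid");
        }
    }
}

public static class SketchDemo
{
    // Object initializer: the constructor is called with no arguments first,
    // so it throws before "ApiKey = ..." is ever applied.
    public static bool InitializerThrows()
    {
        try
        {
            _ = new SketchOptions { ApiKey = "my-key", KeepAlive = true };
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // Constructor arguments: the key is present when validation runs.
    public static bool ConstructorArgsThrow()
    {
        try
        {
            _ = new SketchOptions("my-key", true);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Calling SketchDemo.InitializerThrows() returns true and SketchDemo.ConstructorArgsThrow() returns false, which matches the behavior reported in this issue: the constructor-argument form works and the property form throws.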
Will take a look at this...
This should be available in the latest release.
What is the current behavior?
Exception "Deepgram API Key is invalid"
Steps to reproduce
    var liveClient = new ListenWebSocketClient(_dgApiKey, new DeepgramWsClientOptions
    {
        ApiKey = _dgApiKey,
        KeepAlive = true
    });
Expected behavior
no exceptions
Please tell us about your environment
Windows 11 (latest), Visual Studio 2022 (latest)
Other information
This works correctly:
    var liveClient = new ListenWebSocketClient(_dgApiKey, new DeepgramWsClientOptions(_dgApiKey, null, true));