StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/
Other
5.89k stars 1.51k forks source link

IDatabase.StringGetAsync different behaviour in .net 4.6.1 and .net 6.0 #2516

Closed abhisheksingh-ms closed 1 year ago

abhisheksingh-ms commented 1 year ago

My Web API (C#) running on Azure App service is using Azure Redis Cache with NuGet version 2.6.66. and connection string as {account}.redis.cache.windows.net,ssl=True,password={pwd},abortConnect=false

I have also added ConnectionFailed event but it is not getting called (nothing written to logs)

` class WrappedConnectionMultiplexer { public WrappedConnectionMultiplexer(string connectionString, bool preserveAsyncOrder, IMyAppLogger logger) { this.connectionMultiplexer = ConnectionMultiplexer.Connect(connectionString); this.connectionMultiplexer.ConnectionFailed += OnConnectionFailure; this.logger = logger; }

//this is not logging anything private void OnConnectionFailure(object sender, ConnectionFailedEventArgs eventArgs) { // this is ApplicationInsight logger this.IMyAppLogger.TrackTrace(eventArgs.Exception?.ToString()); }

//client code to read from cache

WrappedConnectionMultiplexer wrappedConnectionMultiplexer; if (!myDictionary.TryGetValue(connectionString, out wrappedConnectionMultiplexer)) { wrappedConnectionMultiplexer = new WrappedConnectionMultiplexer(connectionString, preserveAsyncOrder); myDictionary.AddOrUpdate(connectionString, (key) => wrappedConnectionMultiplexer, (key, x`) => wrappedConnectionMultiplexer); }

var cacheDatabase = await Task.FromResult(wrappedConnectionMultiplexer.GetDatabase()); await cacheDatabase.StringGetAsync(string key);

`

But if the key doesn't exist or account name is incorrect. I see different behavior in .NET framework 4.6.1 and .NET 6.0

.Net framework 4.6.1 StringGetAsync() will immediately return Or connection exception if account is incorrect in connection string.

.net 6.0 StringGetAsync() doesn't return (until the app monitoring code cancel the request after application timeout of 10 sec).

in both .NET versions, StringSetAsync is executed with CommandFlags.FireandForgot and call immediately return.

Question:

  1. Is this difference expected or there is different way to use redis in 6.0.
  2. why ConnectionFailed event is not called (nothing writtern to AI logs)
mgravell commented 1 year ago

2.6.66 is a year old; a number of bugs etc have been fixed in that time - possibly I wonder if this is related to the GC fix that was in one of them (I'm not going to click into each to find which exactly).

Try with NuGet latest; if it happens with latest: this might be interesting. But until you've tested with latest, I'm going to assume it is already fixed.

On Sun, 30 Jul 2023, 22:16 Abhishek, @.***> wrote:

My Web API (C#) running on Azure App service is using Azure Redis Cache with NuGet version 2.6.66. and connection string as {account}. redis.cache.windows.net,ssl=True,password={pwd},abortConnect=false

I have also added ConnectionFailed event but it is not getting called (nothing written to logs)

` class WrappedConnectionMultiplexer { public WrappedConnectionMultiplexer(string connectionString, bool preserveAsyncOrder, IMyAppLogger logger) { this.connectionMultiplexer = ConnectionMultiplexer.Connect(connectionString); this.connectionMultiplexer.ConnectionFailed += OnConnectionFailure; this.logger = logger; }

//this is not logging anything private void OnConnectionFailure(object sender, ConnectionFailedEventArgs eventArgs) { // this is ApplicationInsight logger this.IMyAppLogger.TrackTrace(eventArgs.Exception?.ToString()); }

//client code to read from cache

WrappedConnectionMultiplexer wrappedConnectionMultiplexer; if (!myDictionary.TryGetValue(connectionString, out wrappedConnectionMultiplexer)) { wrappedConnectionMultiplexer = new WrappedConnectionMultiplexer(connectionString, preserveAsyncOrder); myDictionary.AddOrUpdate(connectionString, (key) => wrappedConnectionMultiplexer, (key, x`) => wrappedConnectionMultiplexer); }

var cacheDatabase = await Task.FromResult(wrappedConnectionMultiplexer.GetDatabase()); await cacheDatabase.StringGetAsync(string key);

`

But if the key doesn't exist or account name is incorrect. I see different behavior in .NET framework 4.6.1 and .NET 6.0

.Net framework 4.6.1 StringGetAsync() will immediately return Or connection exception if account is incorrect in connection string.

.net 6.0 StringGetAsync() doesn't return (until the app monitoring code cancel the request after application timeout of 10 sec).

in both .NET versions, StringSetAsync is executed with CommandFlags.FireandForgot and call immediately return.

Question:

  1. Is this difference expected or there is different way to use redis in 6.0.
  2. why ConnectionFailed event is not called (nothing writtern to AI logs)

— Reply to this email directly, view it on GitHub https://github.com/StackExchange/StackExchange.Redis/issues/2516 or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAEHMCWKKRATHMPYYZJKF3XS3FJNBFKMF2HI4TJMJ2XIZLTSOBKK5TBNR2WLJDUOJ2WLJDOMFWWLO3UNBZGKYLEL5YGC4TUNFRWS4DBNZ2F6YLDORUXM2LUPGBKK5TBNR2WLJLJONZXKZNENZQW2ZNLORUHEZLBMRPXI6LQMWBKK5TBNR2WLJDUOJ2WLJDOMFWWLLTXMF2GG2C7MFRXI2LWNF2HTLDTOVRGUZLDORPXI6LQMWSUS43TOVS2M5DPOBUWG44SQKSHI6LQMWVHEZLQN5ZWS5DPOJ42K5TBNR2WLKBRG43TIOBRHAZYFJDUPFYGLJLJONZXKZNFOZQWY5LFVIYTQMRYGA4DSOBVHGTXI4TJM5TWK4VGMNZGKYLUMU . You are receiving this email because you are subscribed to this thread.

Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub .

abhisheksingh-ms commented 1 year ago

@mgravell , i tried with latest NuGet version 2.6.122, but still same application hosting environment (App service) and same code but only .net 6.0 takes too much time each call to StringGetAsync takes ~5 sec. where .net 4.6.1 code with Nuget 2.6.66 takes < 1 sec. I am running code in Preprod environment with single user single request testing. there is no load on the test environment. can you suggest any doc/link for troubleshooting it. most of the links i got talked about exception, in my case i am not seeing any exception.

mgravell commented 1 year ago

Either your server located on the moon, or something is very weird here. Are you creating the multiplexer per call, or is this a pre-existing and long-lived multiplexer instance?

abhisheksingh-ms commented 1 year ago

application code and hosting environment(app service) and redis connection string is same for .net 4.6.1 and .NET 6.0. This is test environment and first request after application deployment has same latency issue.

.net 6.0 on latest Nuget (2.6.122) exception is (says took 5 sec)

StackExchange.Redis.RedisConnectionException The message timed out in the backlog attempting to send because no connection became available (5000ms) - Last Connection Exception: UnableToConnect on xxxxx.redis.cache.windows.net:6380/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.6.122.38350, command=GET, timeout: 5000, inst: 0, qu: 0, qs: 0, aw: False, bw: CheckingForTimeout, last-in: 0, cur-in: 0, sync-ops: 0, async-ops: 224, serverEndpoint: xxxxx.redis.cache.windows.net:6380, conn-sec: n/a, aoc: 0, mc: 1/1/0, mgr: 10 of 10 available, clientName: STGFE000000(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=500,Max=1000), WORKER: (Busy=3,Free=32764,Min=16383,Max=32767), POOL: (Threads=7,QueuedItems=0,CompletedItems=119991,Timers=24), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts) UnableToConnect on xxxxxx.redis.cache.windows.net:6380/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.6.122.38350

.net 4.6.1 exception (it fails immediately)

No connection is active/available to service this operation: GET {CachKey}; UnableToConnect on xxxxx.redis.cache.windows.net:6380/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.1.28.64774, mc: 1/1/0, mgr: 10 of 10 available, clientName: StgFE000001, IOCP: (Busy=1,Free=999,Min=500,Max=1000), WORKER: (Busy=0,Free=32767,Min=16383,Max=32767), v: 2.1.28.64774 UnableToConnect on xxxxxx.redis.cache.windows.net:6380/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.1.28.64774

NickCraver commented 1 year ago

@abhisheksingh-ms I'm confused on what we're trying to solve here. Is the complaint that it doesn't fail instantly, but should still fail? The .NET 4.6.1 is on 2.1.28 not 2.6.x like earlier comments, so I'm not sure which things we're comparing here.

In more recent versions of the library, when you've set abortConnect=false (I can see that's your case by the aoc: 0), we'll buffer the command and try for the duration of a timeout while that connection attempts to connect. In both of these cases it looks like you're not connecting and failing which is the real upstream issue, the behavior difference is backlog functionality added in 2.5.x (this doesn't differ between .NET 4.6 and .NET 6.0). If you want to turn this off, you can change the BacklogPolicy on your config to fail immediately, but none of that will solve the "it didn't even connect" issue at hand.

abhisheksingh-ms commented 1 year ago

@NickCraver , thanks for adding clarity and confirming delay in failure is due to new backlog feature. For now this was my major concern and created ticket to understand if this is expected. Not able to connect was not intent for ticket.

If i understand, is it correct for prod scenarios (for faster response to client) : abortconnect=false and backlogpolicy=false. So that individual request in case of connection failure will bypass cache (without waiting 5 sec) and get data from actual source. and abortConnect=false will make sure for future request it will keep trying to connect.

my service API makes multiple calls to cache(for different data it needs to serve single client request) for single API request, total time by redis N 5 sec (for each backlog call to timeout) was causing significant big number. Since failures can happen in Prod also we don't want Prod to take N5 sec to fail for user. Currently due to business need our APIs don't have explicit request execution timeouts. Upgrading to .NET 6 and Redis package caused this time difference, so i created this issue.

I will prefer to keep the issue open for day/two, to test and confirm for success (no connection failure) i don't see any latency difference.

abhisheksingh-ms commented 1 year ago

thanks for confirming behavior change for upgraded NuGet package which by default adds 5sec for retry/backlog in case of failure, which was causing high response time for my API.