StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/
Other
5.85k stars 1.5k forks source link

Randomly got error Sentinel: The ConnectionMultiplexer is not a Sentinel in Kubernetes pod #2649

Closed KendoSai closed 4 months ago

KendoSai commented 4 months ago

Hi @mgravell, I'm using StackExchange.Redis 2.7.17 to connect to Redis Sentinel and facing this strange issue

There is a service with 8 pods running on Kubernetes, sometimes when there is a new deployment that spinned up a new service's pods, there will be 1 pod got this exception "Sentinel: The ConnectionMultiplexer is not a Sentinel connection. Detected as: Standalone". Currently we have to fix it manually by deleting this 'fault' pod, when new pod is intialized, the error is gone

Our connection strings are: kala-c1redis-01.abc.lc:26379;kala-c1redis-02.abc.lc:26379;kala-c1redis-03.abc.lc:26379 I have put the logger into the ConnectionMultiplexer.Connect(options, writer); and here is the log

Do you have any ideas on this?

[
  "Redis connection info: Total connect time: 7,096 ms",
  "Redis connection info: 08:39:51.0586:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0582:",
  "Redis connection info: Starting heartbeat...",
  "Redis connection info: 08:39:51.0572:",
  "Redis connection info: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago",
  "Redis connection info: 08:39:51.0569:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:51.0568:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2",
  "Redis connection info: 08:39:51.0567:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Disconnected; sub: Connecting; not in use: DidNotRespond",
  "Redis connection info: 08:39:51.0566:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:51.0566:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2",
  "Redis connection info: 08:39:51.0565:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Disconnected; sub: Disconnected; not in use: DidNotRespond",
  "Redis connection info: 08:39:51.0564:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:51.0563:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2",
  "Redis connection info: 08:39:51.0562:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Disconnected; sub: Disconnected; not in use: DidNotRespond",
  "Redis connection info: 08:39:51.0561:",
  "Redis connection info: Endpoint Summary:",
  "Redis connection info: 08:39:51.0560:",
  "Redis connection info:   Election: No primaries detected",
  "Redis connection info: 08:39:51.0559:",
  "Redis connection info: Election summary:",
  "Redis connection info: 08:39:51.0558:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Returned, but incorrectly",
  "Redis connection info: 08:39:51.0557:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Returned, but incorrectly",
  "Redis connection info: 08:39:51.0556:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Returned, but incorrectly",
  "Redis connection info: 08:39:51.0555:",
  "Redis connection info: Task summary:",
  "Redis connection info: 08:39:51.0553:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Endpoint is (Interactive: Disconnected, Subscription: Connecting)",
  "Redis connection info: 08:39:51.0552:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Endpoint is (Interactive: Disconnected, Subscription: Disconnected)",
  "Redis connection info: 08:39:51.0551:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Endpoint is (Interactive: Disconnected, Subscription: Disconnected)",
  "Redis connection info: 08:39:51.0550:",
  "Redis connection info: Endpoint summary:",
  "Redis connection info: 08:39:51.0549:",
  "Redis connection info: All 3 available tasks completed cleanly, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=12,Free=32755,Min=1,Max=32767), POOL: (Threads=12,QueuedItems=6,CompletedItems=57,Timers=30)",
  "Redis connection info: 08:39:51.0547:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0542:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0536:",
  "Redis connection info: Connection failed: kala-c1redis-01.abc.lc:26379 (Interactive, UnableToConnect): UnableToConnect on kala-c1redis-01.abc.lc:26379/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 6s ago, last-write: 6s ago, keep-alive: 180s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 7s ago, v: 2.7.17.27058",
  "Redis connection info: 08:39:51.0528:",
  "Redis connection info: Connection failed: kala-c1redis-01.abc.lc:26379 (Subscription, UnableToConnect): UnableToConnect on kala-c1redis-01.abc.lc:26379/Subscription, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 6s ago, last-write: 6s ago, keep-alive: 180s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 7s ago, v: 2.7.17.27058",
  "Redis connection info: 08:39:51.0527:",
  "Redis connection info: Connection failed: kala-c1redis-02.abc.lc:26379 (Interactive, UnableToConnect): UnableToConnect on kala-c1redis-02.abc.lc:26379/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 6s ago, last-write: 6s ago, keep-alive: 180s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 7s ago, v: 2.7.17.27058",
  "Redis connection info: 08:39:51.0525:",
  "Redis connection info: Connection failed: kala-c1redis-02.abc.lc:26379 (Subscription, UnableToConnect): UnableToConnect on kala-c1redis-02.abc.lc:26379/Subscription, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 6s ago, last-write: 6s ago, keep-alive: 180s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 7s ago, v: 2.7.17.27058",
  "Redis connection info: 08:39:51.0524:",
  "Redis connection info: Connection failed: kala-c1redis-03.abc.lc:26379 (Interactive, UnableToConnect): UnableToConnect on kala-c1redis-03.abc.lc:26379/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 6s ago, last-write: 6s ago, keep-alive: 180s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 7s ago, v: 2.7.17.27058",
  "Redis connection info: 08:39:51.0522:",
  "Redis connection info: Connection failed: kala-c1redis-03.abc.lc:26379 (Subscription, UnableToConnect): UnableToConnect on kala-c1redis-03.abc.lc:26379/Subscription, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 6s ago, last-write: 6s ago, keep-alive: 180s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 7s ago, v: 2.7.17.27058",
  "Redis connection info: 08:39:51.0506:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0394:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0392:",
  "Redis connection info: Awaiting 3 available task completion(s) for 5000ms, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=12,Free=32755,Min=1,Max=32767), POOL: (Threads=12,QueuedItems=19,CompletedItems=37,Timers=30)",
  "Redis connection info: 08:39:51.0391:",
  "Redis connection info: Allowing 3 endpoint(s) 00:00:05 to respond...",
  "Redis connection info: 08:39:51.0388:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:51.0387:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:51.0386:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:51.0384:",
  "Redis connection info: 3 unique nodes specified (without tiebreaker)",
  "Redis connection info: 08:39:51.0383:",
  "Redis connection info:   Retrying - attempts left: 1...",
  "Redis connection info: 08:39:51.0380:",
  "Redis connection info: Resetting failing connections to retry...",
  "Redis connection info: 08:39:51.0348:",
  "Redis connection info: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago",
  "Redis connection info: 08:39:51.0347:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:51.0345:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2",
  "Redis connection info: 08:39:51.0344:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Connecting; sub: Connecting; not in use: DidNotRespond",
  "Redis connection info: 08:39:51.0343:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:51.0342:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2",
  "Redis connection info: 08:39:51.0340:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Connecting; sub: Connecting; not in use: DidNotRespond",
  "Redis connection info: 08:39:51.0339:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:51.0338:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=2",
  "Redis connection info: 08:39:51.0337:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Connecting; sub: Connecting; not in use: DidNotRespond",
  "Redis connection info: 08:39:51.0335:",
  "Redis connection info: Endpoint Summary:",
  "Redis connection info: 08:39:51.0333:",
  "Redis connection info:   Election: No primaries detected",
  "Redis connection info: 08:39:51.0332:",
  "Redis connection info: Election summary:",
  "Redis connection info: 08:39:51.0331:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Returned, but incorrectly",
  "Redis connection info: 08:39:51.0330:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Returned, but incorrectly",
  "Redis connection info: 08:39:51.0328:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Returned, but incorrectly",
  "Redis connection info: 08:39:51.0327:",
  "Redis connection info: Task summary:",
  "Redis connection info: 08:39:51.0326:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Endpoint is (Interactive: Connecting, Subscription: Connecting)",
  "Redis connection info: 08:39:51.0325:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Endpoint is (Interactive: Connecting, Subscription: Connecting)",
  "Redis connection info: 08:39:51.0323:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Endpoint is (Interactive: Connecting, Subscription: Connecting)",
  "Redis connection info: 08:39:51.0321:",
  "Redis connection info: Endpoint summary:",
  "Redis connection info: 08:39:51.0320:",
  "Redis connection info: All 3 available tasks completed cleanly, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=12,Free=32755,Min=1,Max=32767), POOL: (Threads=12,QueuedItems=13,CompletedItems=37,Timers=30)",
  "Redis connection info: 08:39:51.0317:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0141:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0138:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0030:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379: OnConnectedAsync completed (Disconnected)",
  "Redis connection info: 08:39:51.0026:",
  "Redis connection info: Awaiting 3 available task completion(s) for 5000ms, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=11,Free=32756,Min=1,Max=32767), POOL: (Threads=11,QueuedItems=12,CompletedItems=8,Timers=24)",
  "Redis connection info: 08:39:50.0827:",
  "Redis connection info: Allowing 3 endpoint(s) 00:00:05 to respond...",
  "Redis connection info: 08:39:50.0824:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:50.0823:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:50.0821:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:50.0819:",
  "Redis connection info: 3 unique nodes specified (without tiebreaker)",
  "Redis connection info: 08:39:50.0817:",
  "Redis connection info:   Retrying - attempts left: 2...",
  "Redis connection info: 08:39:50.0814:",
  "Redis connection info: Resetting failing connections to retry...",
  "Redis connection info: 08:39:50.0448:",
  "Redis connection info: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago",
  "Redis connection info: 08:39:50.0445:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:50.0440:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=1",
  "Redis connection info: 08:39:50.0439:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Connecting; sub: Connecting; not in use: DidNotRespond",
  "Redis connection info: 08:39:50.0438:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:50.0436:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=1",
  "Redis connection info: 08:39:50.0435:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Connecting; sub: Connecting; not in use: DidNotRespond",
  "Redis connection info: 08:39:50.0432:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Circular op-count snapshot; int: 0 (0.00 ops/s; spans 10s); sub: 0 (0.00 ops/s; spans 10s)",
  "Redis connection info: 08:39:50.0105:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: int ops=0, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0, socks=1",
  "Redis connection info: 08:39:49.9992:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Standalone v3.0.0, primary; keep-alive: 00:03:00; int: Connecting; sub: Connecting; not in use: DidNotRespond",
  "Redis connection info: 08:39:49.9953:",
  "Redis connection info: Endpoint Summary:",
  "Redis connection info: 08:39:49.9939:",
  "Redis connection info:   Election: No primaries detected",
  "Redis connection info: 08:39:49.9881:",
  "Redis connection info: Election summary:",
  "Redis connection info: 08:39:49.9876:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Did not respond (Task.Status: WaitingForActivation)",
  "Redis connection info: 08:39:49.9756:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Did not respond (Task.Status: WaitingForActivation)",
  "Redis connection info: 08:39:49.9755:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Did not respond (Task.Status: WaitingForActivation)",
  "Redis connection info: 08:39:49.9753:",
  "Redis connection info: Task summary:",
  "Redis connection info: 08:39:49.9751:",
  "Redis connection info:   kala-c1redis-03.abc.lc:26379: Endpoint is (Interactive: Connecting, Subscription: Connecting)",
  "Redis connection info: 08:39:49.9749:",
  "Redis connection info:   kala-c1redis-02.abc.lc:26379: Endpoint is (Interactive: Connecting, Subscription: Connecting)",
  "Redis connection info: 08:39:49.9748:",
  "Redis connection info:   kala-c1redis-01.abc.lc:26379: Endpoint is (Interactive: Connecting, Subscription: Connecting)",
  "Redis connection info: 08:39:49.9745:",
  "Redis connection info: Endpoint summary:",
  "Redis connection info: 08:39:49.9728:",
  "Redis connection info:   Server[2] (kala-c1redis-03.abc.lc:26379) Status: WaitingForActivation (inst: 0, qs: 0, in: 0, qu: 0, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: NotStarted. ws: Initializing)",
  "Redis connection info: 08:39:49.9727:",
  "Redis connection info:   Server[1] (kala-c1redis-02.abc.lc:26379) Status: WaitingForActivation (inst: 0, qs: 0, in: 0, qu: 0, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: NotStarted. ws: Initializing)",
  "Redis connection info: 08:39:49.9725:",
  "Redis connection info:   Server[0] (kala-c1redis-01.abc.lc:26379) Status: WaitingForActivation (inst: 0, qs: 0, in: 0, qu: 0, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: NotStarted. ws: Initializing)",
  "Redis connection info: 08:39:49.9721:",
  "Redis connection info: Not all available tasks completed cleanly (from ReconfigureAsync#1499, timeout 5000ms), IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=11,Free=32756,Min=1,Max=32767), POOL: (Threads=11,QueuedItems=12,CompletedItems=8,Timers=24)",
  "Redis connection info: 08:39:49.9628:",
  "Redis connection info: Awaiting 3 available task completion(s) for 5000ms, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=5,Free=32762,Min=1,Max=32767), POOL: (Threads=5,QueuedItems=13,CompletedItems=6,Timers=30)",
  "Redis connection info: 08:39:44.0890:",
  "Redis connection info: Allowing 3 endpoint(s) 00:00:05 to respond...",
  "Redis connection info: 08:39:44.0690:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:44.0671:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:44.0668:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379: OnConnectedAsync init (State=Connecting)",
  "Redis connection info: 08:39:44.0663:",
  "Redis connection info: 3 unique nodes specified (without tiebreaker)",
  "Redis connection info: 08:39:44.0631:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379: BeginConnectAsync",
  "Redis connection info: 08:39:44.0619:",
  "Redis connection info: kala-c1redis-03.abc.lc:26379/Interactive: Connecting...",
  "Redis connection info: 08:39:44.0613:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379: BeginConnectAsync",
  "Redis connection info: 08:39:44.0605:",
  "Redis connection info: kala-c1redis-02.abc.lc:26379/Interactive: Connecting...",
  "Redis connection info: 08:39:44.0597:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379: BeginConnectAsync",
  "Redis connection info: 08:39:44.0503:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379/Interactive: Connecting...",
  "Redis connection info: 08:39:44.0309:",
  "Redis connection info: kala-c1redis-01.abc.lc:26379,kala-c1redis-02.abc.lc:26379,kala-c1redis-03.abc.lc:26379,serviceName=mymaster,keepAlive=180,ssl=False,abortConnect=False,connectRetry=3",
  "Redis connection info: 08:39:44.0209:",
  "Redis connection info: Connecting (sync) on .NET 6.0.27 (StackExchange.Redis: v2.7.17.27058)",
  "Redis connection info: 08:39:43.9631:"
]

Here is the RedisCacheClient code

public class RedisCacheClientFactory : IRedisCacheClientFactory, IDisposable
    {
        private readonly ConcurrentDictionary<Tuple<string, string>, Lazy<ConnectionMultiplexer>> _redisConnections;
        private readonly IConfiguration _configuration;
        private readonly ILogger<RedisCacheClientFactory> _logger;
        private readonly SemaphoreSlim _connectionLock = new SemaphoreSlim(1, 1);
        private bool _disposed;

        private DateTimeOffset _lastReconnectTime = DateTimeOffset.MinValue;
        private readonly TimeSpan _minReconnectInterval = TimeSpan.FromMinutes(1);
        private readonly object _reconnectLock = new object();

        private const string DefaultServiceName = "mymaster";
        private readonly bool _enableSsl;
        private readonly string _sslCertPath;
        private readonly bool _enableDetailLogger;

        public RedisCacheClientFactory(ILogger<RedisCacheClientFactory> logger, IConfiguration configuration)
        {
            _redisConnections = new ConcurrentDictionary<Tuple<string, string>, Lazy<ConnectionMultiplexer>>();
            _configuration = configuration;
            _logger = logger;

            bool.TryParse(_configuration["Redis:EnableSsl"], out _enableSsl);
            // SERedis_IssuerCertPath should be set in GitOps ConfigMaps of service/application
            _sslCertPath = _configuration["SERedis_IssuerCertPath"];
            bool.TryParse(_configuration["Redis:EnableDetailLogger"], out _enableDetailLogger);
        }

        public ConnectionMultiplexer GetInstance(string connectionString, string instanceName)
        {
            var key = new Tuple<string, string>(connectionString, instanceName);
            var lazyConnection = _redisConnections.GetOrAdd(key, _ => new Lazy<ConnectionMultiplexer>(() => ConnectWithRetry(connectionString, instanceName)));

            return lazyConnection.Value;
        }

        private ConnectionMultiplexer ConnectWithRetry(string connectionString, string instanceName)
        {
            var connection = CreateConnection(connectionString, instanceName);
            connection.ConnectionFailed += OnConnectionFailed;
            connection.ConnectionRestored += OnConnectionRestored;
            connection.InternalError += OnInternalError;
            connection.ErrorMessage += OnErrorMessage;
            return connection;
        }

        private void OnConnectionFailed(object sender, ConnectionFailedEventArgs e)
        {
            _logger.LogError("Redis error - Connection Failed: {message}", e.Exception?.Message);
            if (DateTimeOffset.UtcNow - _lastReconnectTime < _minReconnectInterval)
            {
                return;
            }

            lock (_reconnectLock)
            {
                if (DateTimeOffset.UtcNow - _lastReconnectTime < _minReconnectInterval)
                {
                    return;
                }

                _lastReconnectTime = DateTimeOffset.UtcNow;
                _logger.LogWarning("Attempting to reconnect to Redis...");

                foreach (var keyValuePair in _redisConnections)
                {
                    var key = keyValuePair.Key;
                    _connectionLock.Wait();
                    try
                    {
                        _redisConnections[key] = new Lazy<ConnectionMultiplexer>(() => CreateConnection(key.Item1, key.Item2));
                    }
                    finally
                    {
                        _connectionLock.Release();
                    }
                }
            }
        }

        private void OnConnectionRestored(object? sender, ConnectionFailedEventArgs e)
        {
            _logger.LogWarning("Redis error - connection restored");
        }

        private void OnInternalError(object? sender, InternalErrorEventArgs e)
        {
            _logger.LogError(e.Exception, "Redis error - internal: {Origin}", e.Origin);
        }

        private void OnErrorMessage(object? sender, RedisErrorEventArgs e)
        {
            _logger.LogError("Redis error: {Message}", e.Message);
        }

        private ConnectionMultiplexer CreateConnection(string connectionString, string instanceName)
        {
            ConfigurationOptions options = new ConfigurationOptions();
            options.TieBreaker = "";
            options.AbortOnConnectFail = false;
            options.Ssl = false;
            options.ConnectRetry = 3;
            options.ReconnectRetryPolicy = new ExponentialRetry(5000);
            options.KeepAlive = 180;
            options.ServiceName = instanceName ?? DefaultServiceName;
            options.SocketManager = SocketManager.Shared;
            options.IncludeDetailInExceptions = true;
            options.IncludePerformanceCountersInExceptions = true;

            if (_enableSsl)
            {
                if (string.IsNullOrEmpty(_sslCertPath))
                {
                    throw new Exception("Redis error: CA cert path not set! Check service GitOps ConfigMap variable");
                }

                options.Ssl = true;
                options.SslProtocols = SslProtocols.Tls12;
            }

            foreach (var item in connectionString.Split(';'))
            {
                options.EndPoints.Add(item);
            }

            if (_enableDetailLogger)
            {
                var writer = new RedisLoggerTextWriter(_logger, LogLevel.Warning);
                return ConnectionMultiplexer.Connect(options, writer);
            }

            return ConnectionMultiplexer.Connect(options);
        }

        public void Dispose()
        {
            if (_disposed) return;

            foreach (var connection in _redisConnections.Values)
            {
                if (connection.IsValueCreated)
                {
                    connection.Value.ConnectionFailed -= OnConnectionFailed;
                    connection.Value.ConnectionRestored -= OnConnectionRestored;
                    connection.Value.InternalError -= OnInternalError;
                    connection.Value.ErrorMessage -= OnErrorMessage;
                    connection.Value.Dispose();
                }
            }

            _disposed = true;
        }
    }
KendoSai commented 4 months ago

Updates: looks like this issue is relevant to our network and Istio usage in our Kubernetes cluster.

I used this annotation to exclude the 26379 and 6379 port and this issue stopped happen since this afternoon traffic.sidecar.istio.io/excludeOutboundPorts: "26379,6379"

https://istio.io/latest/docs/reference/config/annotations/#SidecarTrafficExcludeOutboundPorts

=> The issue still happened this morning, so it's not a correct fix.

KendoSai commented 4 months ago

Updated: we have tried disable Istio for this service and the issue has gone.

mgravell commented 4 months ago

I'll be honest: I have no idea what that means, but: I'm glad you found the problem. I wouldn't have stood a chance.

KendoSai commented 4 months ago

Thanks @mgravell, I'll close this issue then