StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/

[BUG] Batches are executed out of order when using DNS connections #2752

Open yangbodong22011 opened 5 days ago

yangbodong22011 commented 5 days ago

Background

One of our users reported that when they use a batch against our cloud service (Alibaba Cloud), their commands arrive out of order. The code is below:

namespace RedisTest;

using StackExchange.Redis;
using System;
using System.Collections.Generic;
using System.Linq;

public class RedisClient
{
    class Program
    {
        static void Main(string[] args)
        {
            ConfigurationOptions configurationOptions = ConfigurationOptions.Parse("r-bpxxxxpd.redis.rds.aliyuncs.com:6379,password=xxx,connectTimeout=2000");
            ConnectionMultiplexer redisConn = ConnectionMultiplexer.Connect(configurationOptions);
            var db = redisConn.GetDatabase();
            var batch = db.CreateBatch();
            batch.KeyDeleteAsync("testhash"); // UNLINK
            var keyval = new List<HashEntry>(10);
            for (int i = 0; i < 10; i++) {
                keyval.Add(new HashEntry(i, i));
            }
            batch.HashSetAsync("testhash", keyval.ToArray()); // HMSET
            batch.KeyExpireAsync("testhash", TimeSpan.FromSeconds(100)); // EXPIRE
            batch.Execute();
        }
    }
}

Ideally, the commands would reach the server in the order they were queued: UNLINK, HMSET, EXPIRE.

But sometimes they arrive in a different order.

Going a step further, we found that UNLINK and EXPIRE are sent over one TCP connection, but HMSET uses another TCP connection.

Why are there two connections? We found that when connecting through a domain name, ConnectionMultiplexer#ServerSnapshot contains two entries: one for the domain name and one for the IP address the domain name resolves to (obtained through cluster nodes). AnyServer can therefore return either entry at random, so commands from the same batch are assigned to different connections.
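
To illustrate the effect, here is a toy model of the behaviour described above. It is not StackExchange.Redis code: `DuplicateEndpointDemo`, `Dispatch`, and the endpoint strings are hypothetical, and the only assumption taken from this report is that each command is routed to whichever entry gets picked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: none of these types exist in StackExchange.Redis.
public static class DuplicateEndpointDemo
{
    // Routes each command to the endpoint chosen by pickIndex and returns,
    // per endpoint, the commands it received in send order.
    public static Dictionary<string, List<string>> Dispatch(
        IReadOnlyList<string> endpoints,
        IEnumerable<string> commands,
        Func<int, int> pickIndex)
    {
        var sent = endpoints.ToDictionary(e => e, _ => new List<string>());
        foreach (var command in commands)
        {
            var endpoint = endpoints[pickIndex(endpoints.Count)];
            sent[endpoint].Add(command);
        }
        return sent;
    }

    public static void Main()
    {
        var commands = new[] { "UNLINK", "HMSET", "EXPIRE" };
        // Hypothetical snapshot: the same server listed once under its
        // DNS name and once under its resolved IP address.
        var snapshot = new[] { "redis.example.com:6379", "10.0.0.1:6379" };
        var random = new Random();
        var result = Dispatch(snapshot, commands, count => random.Next(count));
        foreach (var pair in result)
        {
            Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
        }
    }
}
```

Each connection preserves the order of its own commands, but nothing orders the two connections relative to each other, so HMSET can reach the server before UNLINK or after EXPIRE.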


How to reproduce

  1. Find a server that is reached through a DNS name (I can provide an Alibaba Cloud test environment for free)
  2. Run the code above

How to fix

For reference, during initialization Lettuce generates a URI from the resolved domain name and records the domain name itself as an alias. Please refer to https://github.com/redis/lettuce/commit/16f9e7525068a3887f5cc746c6b976720835600a

Version

shaofing commented 1 day ago

@mgravell Can you fix it?

mgravell commented 1 day ago

This is ... odd; short term, switching up to transactions will enforce single connection, but... this is odd; I'd have to investigate, which I can add to the list, but I can't guarantee a "today" thing
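
A minimal sketch of the short-term workaround mentioned above, assuming the same three commands as in the report. The class and method names (`TransactionWorkaround`, `ResetHashAsync`) are hypothetical; `CreateTransaction`, `ExecuteAsync`, and the queued commands are the regular StackExchange.Redis calls. That a transaction keeps everything on a single connection is taken from the comment above, not verified here.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class TransactionWorkaround
{
    // Queues the delete, hash write and expiry in one transaction
    // instead of a batch, then reports whether it was committed.
    public static async Task<bool> ResetHashAsync(IDatabase db, RedisKey key, TimeSpan ttl)
    {
        var entries = Enumerable.Range(0, 10)
            .Select(i => new HashEntry(i, i))
            .ToArray();

        ITransaction tran = db.CreateTransaction();

        // Commands queued on a transaction only complete after Execute,
        // so keep the tasks and await them afterwards.
        Task<bool> deleted = tran.KeyDeleteAsync(key);
        Task set = tran.HashSetAsync(key, entries);
        Task<bool> expired = tran.KeyExpireAsync(key, ttl);

        bool committed = await tran.ExecuteAsync();
        if (committed)
        {
            await Task.WhenAll(deleted, set, expired);
        }
        return committed;
    }
}
```

With the code from the report, this would be called as `await TransactionWorkaround.ResetHashAsync(db, "testhash", TimeSpan.FromSeconds(100));`.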

worming004 commented 1 day ago

As both of these lines are not awaited

 batch.HashSetAsync("testhash", keyval.ToArray()); // HMSET
 batch.KeyExpireAsync("testhash", TimeSpan.FromSeconds(100)); // EXPIRE

There is no guarantee of order. It is only probable that line 1 would be executed earlier, not guaranteed. Right?

Edit: didn't see it is about batch. My bad, my remark is irrelevant

mgravell commented 1 day ago

It is reasonable to expect this to occur in order; what is described is definitely not "working as intended", but it doesn't sound trivial - it will need some investigation

yangbodong22011 commented 16 hours ago

@mgravell I sent you the account and password for the test instance environment via Gmail.