redis / go-redis

Redis Go client
https://redis.uptrace.dev
BSD 2-Clause "Simplified" License
19.6k stars 2.31k forks source link

redis client in ipv6 environment is not working #2978

Open panagiotspappas opened 2 months ago

panagiotspappas commented 2 months ago

Issue tracker is used for reporting bugs and discussing new features. Please use stackoverflow for supporting issues.

When using redis-client 9.5.1 we have issue at ipv6 environment.

We have a kubernetes environment and we are using redis cluster.

In our application we first create a new redis client

client = redis.NewClusterClient(opt)

with address redisdb service name: xxx-crdb.test.svc.cluster.local.:6379

Then we are checking connection with resp, err := c.redisClient.Ping(context.Background()).Result()

Ping is successful.

After that are are creating a consumer group: c.redisClient.XGroupCreateMkStream(context.Background(), streamName, groupName, start).Err()

The later command fails with redis-client 9.5.1 with error: dial tcp: address fd01:abcd::7d03:6379: too many colons in address.

We have many retries and it constantly fails.

With redis-client 9.3.0 it fails once and then it succeeds.

monkey92t commented 2 months ago

Thank you for your feedback. go-redis has not been adapted for IPv6. Are you interested in submitting a PR?

panagiotspappas commented 2 months ago

Hi, I have a question. How come ping succeeds and the creation of group fails? I mean, don't these two commands use/establish the same connection? Is it possible to point to the differences between Ping and XGroupCreateMkStream, regarding to the connection?

monkey92t commented 2 months ago

I'm not sure where the problem lies. In cluster mode, go-redis perceives all nodes within the cluster through the cluster slots command, using their IP addresses and ports to establish connections. However, I've checked all operations involving IP addresses, and they all use net.JoinHostPort(), which requires an IPv6 Redis cluster for testing.

monkey92t commented 2 months ago

Can you print the result of cluster slots?

See: https://redis.io/docs/latest/commands/cluster-slots/

daviddzxy commented 2 months ago

I also encountered this error. Seems like moved, ask, addr = isMovedError(lastErr) in func (c *ClusterClient) process(ctx context.Context, cmd Cmder) error function returns IPv6 in bad format. I will try to make PR with a fix.

daviddzxy commented 2 months ago

Here is the PR @monkey92t

monkey92t commented 2 months ago

I currently do not have more devices to build an IPv6 cluster for testing. I am not sure about the data format of MOVE ADDR. If it looks like MOVE fd01:abcd::7d03:6379, then it is not a valid IPv6 format. However, after Redis fixes it, it may look like MOVE [fd01:abcd::7d03]:6379, which could cause compatibility issues.

monkey92t commented 2 months ago

redis ipv6 issue https://github.com/redis/redis/issues/6238

It looks like we can make some compatibility adjustments to handle both fd01:abcd::7d03:6379 and [fd01:abcd::7d03]:6379 types of IPv6 addresses properly. @daviddzxy, can you continue?

monkey92t commented 2 months ago

Moreover, under normal circumstances, go-redis should not encounter MOVE. Before sending a command, go-redis calculates the node for the key and sends the Redis command to that node. This type of issue only occurs when Redis is reassigning slots, and it seems we triggered another bug that caused slot calculation errors.

daviddzxy commented 2 months ago

redis ipv6 issue redis/redis#6238

It looks like we can make some compatibility adjustments to handle both fd01:abcd::7d03:6379 and [fd01:abcd::7d03]:6379 types of IPv6 addresses properly. @daviddzxy, can you continue?

Sure, I ll update the PR.

daviddzxy commented 2 months ago

I updated the PR. @monkey92t

daviddzxy commented 2 months ago

I have done some more debugging. It turns out that when XGroupCreateMkStream function is called the func cmdFirstKeyPos(cmd Cmder) int function returns wrong pos number (It returns 1 instead of 2). As a result of this, the function func Slot(key string) int will receive "create" key argument. This causes incorrectly calculated slot number and therefore we get MOVED error. @monkey92t

daviddzxy commented 2 months ago

Should I create another issue for this bug? I plan to make a PR with a fix later today.

monkey92t commented 2 months ago

Should I create another issue for this bug? I plan to make a PR with a fix later today.

OK!

daviddzxy commented 2 months ago

Here is the issue