Snapchat / KeyDB

A Multithreaded Fork of Redis
https://keydb.dev
BSD 3-Clause "New" or "Revised" License

[BUG] New nodes in cluster not recognized by all others #677

Open geevarghesest opened 1 year ago

geevarghesest commented 1 year ago

Describe the bug

I'm running KeyDB 6.3.3 with flash enabled. In cluster mode, when a master node is added after the cluster's initial creation, the new node is not acknowledged by all of the other nodes.

To reproduce

Create three KeyDB master instances in cluster mode, then add another master node, e.g. with keydb-cli -p 7000 --cluster add-node 127.0.0.1:7003 127.0.0.1:7000. Only the node running on port 7000 acknowledges the new 7003 node; the cluster configuration files of the nodes on ports 7001 and 7002 do not have 7003 added. This causes errors during rebalance operations:

Node 127.0.0.1:7001 replied with error: IOERR error or timeout reading to target instance
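One way to confirm which nodes are missing the new node, without inspecting each cluster configuration file by hand, is to compare the CLUSTER NODES reply of every node. The following is only a rough sketch: the helper names (known_addresses, nodes_missing, fetch_view) are made up for illustration, and it assumes the usual CLUSTER NODES line layout in which the second field is ip:port@cport.

```python
import subprocess


def known_addresses(cluster_nodes_output):
    """Return the ip:port addresses listed in one node's CLUSTER NODES reply."""
    addresses = set()
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        # Second field is "ip:port@cport" (newer servers may append ",hostname").
        addresses.add(fields[1].split("@")[0].split(",")[0])
    return addresses


def nodes_missing(views, new_addr):
    """Given {node_address: CLUSTER NODES reply}, list nodes that do not know new_addr."""
    return sorted(
        node for node, output in views.items()
        if new_addr not in known_addresses(output)
    )


def fetch_view(port, host="127.0.0.1", password=None):
    """Hypothetical helper: run `keydb-cli ... cluster nodes` and return its raw output."""
    cmd = ["keydb-cli", "-h", host, "-p", str(port)]
    if password:
        cmd += ["-a", password]
    cmd += ["cluster", "nodes"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

With the setup described above, something like nodes_missing({f"127.0.0.1:{p}": fetch_view(p) for p in (7000, 7001, 7002)}, "127.0.0.1:7003") would be expected to return the 7001 and 7002 nodes if they never learned about 7003.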

Expected behavior

All nodes should acknowledge the new master node automatically. In Redis 7 this works as expected.

Additional information

All nodes are running in Docker containers with host network mode. Platform: arm64.

geevarghesest commented 1 year ago
root@keydb-dev:/data# keydb-cli -a okyes -p 7000 --cluster add-node 127.0.0.1:7003 127.0.0.1:7000
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:7003 to cluster 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 5140f5c16c90b2a9f04bee97ba03178855390305 127.0.0.1:7000
   slots:[5461-10922] (5462 slots) master
M: 69a283502922046b576f7d69ee0107b76fcabba6 127.0.0.1:7001
   slots:[0-5460] (5461 slots) master
M: 335cce9c01df9189bde3bb38d0985beef9c25475 127.0.0.1:7002
   slots:[10923-16383] (5461 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:7003 to make it join the cluster.
[OK] New node added correctly.
root@keydb-dev:/data# keydb-cli -a okyes -p 7000 --cluster info 127.0.0.1:7000
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:7000 (5140f5c1...) -> 32513 keys | 5462 slots | 0 slaves.
127.0.0.1:7003 (74124202...) -> 0 keys | 0 slots | 0 slaves.
127.0.0.1:7001 (69a28350...) -> 32244 keys | 5461 slots | 0 slaves.
127.0.0.1:7002 (335cce9c...) -> 32459 keys | 5461 slots | 0 slaves.
[OK] 97216 keys in 4 masters.
5.93 keys per slot on average.
root@keydb-dev:/data# keydb-cli -a okyes -p 7000 --cluster rebalance 127.0.0.1:7000 --cluster-use-empty-masters
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing Cluster Check (using node 127.0.0.1:7000)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 4 nodes. Total weight = 4.00
Moving 1366 slots from 127.0.0.1:7000 to 127.0.0.1:7003

Node 127.0.0.1:7000 replied with error:
IOERR error or timeout writing to target instance
root@keydb-dev:/data#