Open mleklund opened 3 years ago
There is talk of partial resynchronization, but it does not seem to be happening for active-active; it may only apply to active-passive.
Hi Mleklund
Thank you for contacting EQAlpha. We appreciate you reaching out to us. To help us address your specific issue, some background and a question follow.
In the worst case, assuming you're using a mesh network topology of N master nodes, there will be up to N dump.rdb file transfers for each master node. This can add up to roughly N*N file transfers for the entire network, which would explain your transfer storms.
1) What is your current topology (ring versus mesh), and how many master nodes do you have?
Using a unidirectional ring network topology for the initial synchronization can at least smooth out the storm a bit; the additional connections can be re-added later on. A sketch of such a ring follows.
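A minimal sketch of what a 3-node unidirectional ring could look like (the hostnames and port are placeholders, not a verified configuration):

```
# node-a keydb.conf -- each node replicates from exactly one upstream peer
active-replica yes
replicaof node-c 6379

# node-b keydb.conf
active-replica yes
replicaof node-a 6379

# node-c keydb.conf
active-replica yes
replicaof node-b 6379
```

The idea, per the suggestion above, is that writes still propagate around the ring because every node is an active replica, but each node only has a single replication link to fill during the initial sync instead of one per peer.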
We are also having the same issue, and the full reload can also cause the nodes to run out of memory. We have 4GB of data in KeyDB, and each of our three multi-master nodes has 10GB of RAM; that is not enough if we do a rolling restart while the KeyDB RDB file is kept on each node. Currently the only way to sort this out is to delete the RDB file on some of the nodes before KeyDB is started.
I think this is related to https://github.com/EQ-Alpha/KeyDB/issues/353, i.e. allowing a static master ID for each node. That would prevent the nodes from reloading the full 4GB and running out of memory.
We've encountered this also. We have about the simplest configuration possible -- 3 nodes which are all replicas of each other, with active-active and multi-master enabled. Each node has 16GB RAM, and our development system currently holds about 500MB of data. If I have to take a node down to adjust the configuration, we end up bouncing data around endlessly after restarting it. I've increased client-output-buffer-limit replica to 2gb 1gb 60, but that only seems to let the nodes get close to running out of RAM before giving up.
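A minimal sketch of that kind of full-mesh setup (hostnames are placeholders, not the actual ones in use here):

```
# keydb.conf on one of the 3 nodes; the other two mirror this with the
# remaining hostnames swapped in
active-replica yes
multi-master yes
replicaof node-b 6379
replicaof node-c 6379

# the raised replica output buffer limits mentioned above
client-output-buffer-limit replica 2gb 1gb 60
```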
@BobEQAlpha when you say N*N transfers, do you mean that this process will actually stop after 9 transfers? I'm not sure I've ever waited that long...
That is my experience @mikeries. It just was not acceptable performance for our use cases. I hope they get it figured out, because outside of this issue, it seems to be an outstanding product.
Maybe I'll try the ring configuration, and if that still doesn't work I'll start dropping the features that made me pick this over redis... Multi-master and/or Active-Active. I really don't want to deal with sentinel, but this needs to work.
@mikeries I believe it is actually 6 transfers rather than 9.
Let's say you have 3 nodes: A, B, and C.
A will transfer data to B, C. B will transfer data to A, C. C will transfer data to A, B.
Thanks for the clarification, but I'm not having any luck. I tried a unidirectional ring configuration and turned off multi-master, but if I shut down a node and restart it, the full database still gets passed around and seems to grow larger and larger until one of the nodes runs out of RAM. If I'm reading the log output correctly, a 1GB rdb file somehow grew to more than 11GB in a few minutes.
When a node sends its data to another node, it needs a replication stream, which uses memory; that is likely responsible for your out-of-memory issue.
Do your master nodes have any attached read-only replicas? Both maintaining a master-replica connection and propagating writes to a read-only replica require some memory.
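As a sketch of the knobs that bound that memory (the values here are purely illustrative and would need tuning against your dataset size and available RAM):

```
# per-replica output buffer: <hard limit> <soft limit> <soft seconds>
client-output-buffer-limit replica 1gb 512mb 60
# in-memory replication backlog retained for resynchronization attempts
repl-backlog-size 256mb
repl-backlog-ttl 3600
```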
I also had a two-node setup where I hit this issue; the only way to get everything healthy again was to remove the rdb on one host and set repl-timeout 300 to prevent the master from timing out while the other host was recovering.
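For anyone else landing here, a rough sketch of that workaround, assuming a systemd-managed keydb-server service and the default RDB path (both are assumptions; adjust to your install):

```
# on the recovering host: drop the stale RDB so it is not reloaded and
# re-broadcast, then let the node take a fresh full sync
sudo systemctl stop keydb-server
sudo rm /var/lib/keydb/dump.rdb
sudo systemctl start keydb-server

# on the other host(s): give the recovering node time to load before the
# replication link is considered dead
keydb-cli config set repl-timeout 300
```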
Is anyone still experiencing this issue? It would be helpful to know, so we can understand how to prioritize this.
I gave up and am using the sentinels now.
I gave up as well.
Describe the bug
Certain common circumstances can cause a storm of state transfers.
To reproduce
In an active-active cluster, if you do a rolling restart of the nodes for something like configuration changes or version updates, you can cause a storm of complete state transfers and "Active Replica - LOADING Redis is loading the dataset in memory on full sync" errors.
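A rough sketch of the rolling restart that triggers it, assuming hypothetical hostnames and a systemd-managed keydb-server service:

```
# restart the active-active nodes one at a time, as for a config change or upgrade
for host in node-a node-b node-c; do
  ssh "$host" sudo systemctl restart keydb-server
  sleep 30   # each restart triggers full syncs to and from every peer
done
# during the loop, writes intermittently fail with:
#   Active Replica - LOADING Redis is loading the dataset in memory on full sync
```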
Expected behavior
The purpose of running something like active-active is to give yourself write high availability, especially across geographical regions. Because of the way the code is currently implemented, you get read but not write high availability.
Additional information
From the active replication wiki:
So, when a node disconnects to restart, it comes back with a new UUID and any repl-backlog for that server is dropped on the floor. This requires a complete state transfer from every running node to the restarted node, and then a reciprocal state transfer from the restarted node to every running node. Any attached replicas then need more state transfers, increasing read latency and write unavailability. It's possible that
allow-write-during-load
fixes this, but it comes with ominous warnings in the keydb.conf file.
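For anyone who wants to experiment with it anyway, it is a one-line change; this is only a sketch, and the ominous warnings in keydb.conf mentioned above still apply:

```
# allow clients to keep writing while this node is loading a full sync;
# keydb.conf warns against this, so treat it as a last resort rather than
# a fix for the transfer storms themselves
allow-write-during-load yes
```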