Closed DenizPiri closed 2 years ago
@shangxiaoxiong is it related with your findings?
@ShooterIT Kvrocks didn't check the batch content of the same RocksDB sequence, so when SERVER1 and SERVER2 have the same sequence but the content was different cannot trigger the full sync.
The simple workaround can compare the latest batch content when the psync was ok, but it still has flaw.
Oh, i remembered. Yes, current dbname
check is not enough. I add this job into 3.0 project https://github.com/KvrocksLabs/kvrocks/projects/2#card-73962066
Oh, i remembered. Yes, current dbname check is not enough. I add this job into 3.0 project
Is this the problem we have communicated before? @ShooterIT
Are we considering adding a new DB to store unique run id or other unique information that doesn't need to be synchronized? @git-hulk
Oh, i remembered. Yes, current dbname check is not enough. I add this job into 3.0 project
Is this the problem we have communicated before? @ShooterIT
Are we considering adding a new DB to store unique run id or other unique information that doesn't need to be synchronized? @git-hulk
I think we can write it to the config file instead of a new DB, persist to propagate column was fine to me.
@ShooterIT unique id cannot fully resolve this issue, since the same sequence may have different batch content.
Use unique run id as replication identifier instead of db name
this means the unique id exists with one special replication, so unique id will be changed when changing replication relationship. The unique id identifies the replication history .
the same sequence may have different batch content
In this case, if using unique run id as replication identifier, it won't think partially resynchronization is possible.
We can see reproduce steps:
SERVER2: SLAVEOF SERVER1
SERVER2: SLAVEOF NO ONE
SERVER2: SET test 222
SERVER1: SET test 111
SERVER2: SLAVEOF SERVER1
SERVER2: GET test
the first step would sync the unique id, then we set the SERVER1 and SERVER2 without replication, so the SERVER1 and SERVER2 would have the same sequence(said 1 here), finally we re-slaveof again, the SERVER1 unique id should be same with the SERVER2? and psync is ok. Collect me if I miss something.
SERVER2: SLAVEOF SERVER1 --- use server1 replication id
SERVER2: SLAVEOF NO ONE --- generate new replication id
SERVER2: SET test 222
SERVER1: SET test 111
SERVER2: SLAVEOF SERVER1 --- different replication id, psync is not possible
SERVER2: GET test
/* Is the replication ID of this master the same advertised by the wannabe
* slave via PSYNC? If the replication ID changed this master has a
* different replication history, and there is no way to continue.
*
* Note that there are two potentially valid replication IDs: the ID1
* and the ID2. The ID2 however is only valid up to a specific offset. */
if (strcasecmp(master_replid, server.replid) &&
(strcasecmp(master_replid, server.replid2) ||
psync_offset > server.second_replid_offset))
Redis code, FYI
cool, so slaveof no one
would generate new unique id, right? it makes sense since it would NOT cause full replication too frequently.
Hi @DenizPiri I submit a commit to solve this problem, please have a look.
Describe the bug After SLAVEOF, 2 instances can contain different values for a given key.
To Reproduce Start 2 kvrocks instances and execute the following commands in the respective instances.
If a new SET command is used on SERVER1, SERVER2 immediately picks up the latest value. This issue doesn't occur if, on the "test" key, SERVER1 executed multiple SET commands while SERVER2 was not a slave.
Expected behavior After SLAVEOF command, SERVER2 should always contain identical data to SERVER1.