[BUG] Incorrect data after SLAVEOF

apache / kvrocks

Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.

https://kvrocks.apache.org/

Apache License 2.0

3.54k stars 465 forks source link

[BUG] Incorrect data after SLAVEOF #462

Closed DenizPiri closed 2 years ago

DenizPiri commented 2 years ago

Describe the bug After SLAVEOF, 2 instances can contain different values for a given key.

To Reproduce Start 2 kvrocks instances and execute the following commands in the respective instances.

SERVER2: SLAVEOF SERVER1
SERVER2: SLAVEOF NO ONE
SERVER2: SET test 222
SERVER1: SET test 111
SERVER2: SLAVEOF SERVER1
SERVER2: GET test
- Returns "222". However, it should return "111".

If a new SET command is used on SERVER1, SERVER2 immediately picks up the latest value. This issue doesn't occur if, on the "test" key, SERVER1 executed multiple SET commands while SERVER2 was not a slave.

Expected behavior After SLAVEOF command, SERVER2 should always contain identical data to SERVER1.

ShooterIT commented 2 years ago

@shangxiaoxiong is it related with your findings?

git-hulk commented 2 years ago

@ShooterIT Kvrocks didn't check the batch content of the same RocksDB sequence, so when SERVER1 and SERVER2 have the same sequence but the content was different cannot trigger the full sync.

The simple workaround can compare the latest batch content when the psync was ok, but it still has flaw.

ShooterIT commented 2 years ago

Oh, i remembered. Yes, current dbname check is not enough. I add this job into 3.0 project https://github.com/KvrocksLabs/kvrocks/projects/2#card-73962066

caipengbo commented 2 years ago

Oh, i remembered. Yes, current dbname check is not enough. I add this job into 3.0 project

Is this the problem we have communicated before? @ShooterIT

Are we considering adding a new DB to store unique run id or other unique information that doesn't need to be synchronized? @git-hulk

git-hulk commented 2 years ago

Oh, i remembered. Yes, current dbname check is not enough. I add this job into 3.0 project

Is this the problem we have communicated before? @ShooterIT

Are we considering adding a new DB to store unique run id or other unique information that doesn't need to be synchronized? @git-hulk

I think we can write it to the config file instead of a new DB, persist to propagate column was fine to me.

@ShooterIT unique id cannot fully resolve this issue, since the same sequence may have different batch content.

ShooterIT commented 2 years ago

Use unique run id as replication identifier instead of db name

this means the unique id exists with one special replication, so unique id will be changed when changing replication relationship. The unique id identifies the replication history .

the same sequence may have different batch content

In this case, if using unique run id as replication identifier, it won't think partially resynchronization is possible.

git-hulk commented 2 years ago

We can see reproduce steps:

SERVER2: SLAVEOF SERVER1
SERVER2: SLAVEOF NO ONE
SERVER2: SET test 222
SERVER1: SET test 111
SERVER2: SLAVEOF SERVER1
SERVER2: GET test

the first step would sync the unique id, then we set the SERVER1 and SERVER2 without replication, so the SERVER1 and SERVER2 would have the same sequence(said 1 here), finally we re-slaveof again, the SERVER1 unique id should be same with the SERVER2? and psync is ok. Collect me if I miss something.

ShooterIT commented 2 years ago

SERVER2: SLAVEOF SERVER1 --- use server1 replication id
SERVER2: SLAVEOF NO ONE --- generate new replication id
SERVER2: SET test 222
SERVER1: SET test 111
SERVER2: SLAVEOF SERVER1 --- different replication id, psync is not possible
SERVER2: GET test

ShooterIT commented 2 years ago

    /* Is the replication ID of this master the same advertised by the wannabe
     * slave via PSYNC? If the replication ID changed this master has a
     * different replication history, and there is no way to continue.
     *
     * Note that there are two potentially valid replication IDs: the ID1
     * and the ID2. The ID2 however is only valid up to a specific offset. */
    if (strcasecmp(master_replid, server.replid) &&
        (strcasecmp(master_replid, server.replid2) ||
         psync_offset > server.second_replid_offset))

Redis code, FYI

git-hulk commented 2 years ago

cool, so slaveof no one would generate new unique id, right? it makes sense since it would NOT cause full replication too frequently.

ShooterIT commented 2 years ago

Hi @DenizPiri I submit a commit to solve this problem, please have a look.