storage 'erase_from' does not work

ultrabug commented 7 years ago

Hi

I have been testing multiple times to "corrupt" a log file by removing a random line from it to test the log trimming functionality and it fails to catch up to the leader last log.

https://github.com/zhebrak/raftos/blob/master/raftos/state.py#L413

On my debugging, it seems related to the last_log_index calculation or a sort of mismatch between the leader and the follower indexes but I couldn't find out the fix yet.

Would you be so kind to have a look and tell me if you can reproduce this or if you have an idea on the fix?

Thanks!

zhebrak commented 7 years ago

Hey, @ultrabug! Raft algorithm does not provide this kind of corruption fixing mechanism.

In your example:

We commit several log entries for the follower.
Followers' last log entry index and term matches correct one from the leader.
We try to send a new update and follower only checks previous entry, not the entire log file.
Everything commits and we have correct current state with mismatching logs in the past.

However, if you remove some log entries from the top or add couple wrong ones, it should fix it.

You can check out Raft paper, page 4 — https://raft.github.io/raft.pdf

Let me know if I'm wrong :)

ultrabug commented 7 years ago

Thanks for your quick reply @zhebrak.

I know that the corruption fixing is up for the implementation. However I think we have a race condition in raftos erase_from logic.

If the log is small, the bad log from the follower is correctly reset but when the log is large it is not.

zhebrak commented 7 years ago

Are you sure it's not the last entry or entries that you remove when it's fixing itself correctly? Follower handles updates synchronously, so I can't see any race conditions here.

ultrabug commented 7 years ago

Are you sure it's not the last entry or entries that you remove when it's fixing itself correctly?

I'm sorry, I don't understand what you mean ?

zhebrak commented 7 years ago

Follower only fixes log entries when the last one does not match leaders' index & term. So it's not possible to erase something when current entry checks out. If the log is small the probability of removing exactly the last log entry is higher (with random choice), that's why I asked you if that's the case.

ultrabug commented 7 years ago

Follower only fixes log entries when the last one does not match leaders' index & term

This is not what I understand form the code actually

        # If an existing entry conflicts with a new one (same index but different terms),
        # delete the existing entry and all that follow it
        new_index = data['prev_log_index'] + 1
        try:
            if self.log[new_index]['term'] != data['term'] or (
                self.log.last_log_index != prev_log_index
            ):
                self.log.erase_from(new_index)
        except IndexError:
            pass

Looks to me that it can be triggered at any line of the log when the follower comes back up and is fast-forwarding from the leader. If for a reason line 500 of a 2000 lines log matches this condition, we would remove the last 1501 lines from the log and we should start back from 499.

I guess I'll have to provide a gist to replicate the issue so you can see it happening by yourself.

zhebrak commented 7 years ago

Yes, I as said before, it fixes all inconsistency until the last followers' index and term match the one coming from the leader. It removes everything after the matching index-term pair.

ultrabug commented 7 years ago

Ok I have created a reproducible scenario here:

http://ultrabug.fr/github/raftos_gist.tar.gz

To use it after unpacking the archive:

use two terminals (two nodes to run)
cd raftos_gist
on the first terminal run python3 worker.py -i 2
on the second terminal run python3 worker.py -i 3

On another terminal you can check the logs of the worker id 3 : logs/127.0.0.1_8003.log

Expected result: trimmed log file getting fast forwarded up to 1463 lines (just like logs/127.0.0.1_8002.log)
Experienced result: infinite loop on rewriting of the log file

You can also reproduce this by:

stopping the workers
copying logs/127.0.0.1_8002.log to logs/127.0.0.1_8003.log
opening logs/127.0.0.1_8003.log
removing a random line from logs/127.0.0.1_8003.log
starting both workers again

zhebrak / raftos

storage 'erase_from' does not work #16