schreter closed this issue 1 month ago
Looking at the code, the second solution proposed below is actually pretty simple to realize by updating `LogHandler::calc_purge_upto()`, since it already has provisions to keep an absolute number of logs. The only question is how to hook a custom call in there.
Further investigation has shown that it's possible to basically turn off automatic purging and purge on demand from the application. Thus, I'm closing this for now and will try that approach first.
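For reference, a minimal sketch of that approach, assuming a recent `openraft` where `Config::max_in_snapshot_log_to_keep` and `Raft::trigger().purge_log()` exist (treat the exact names as assumptions if your version differs):

```rust
use openraft::Config;

// Sketch only: effectively disable automatic purging by allowing openraft to
// keep a huge number of logs that are already covered by the snapshot.
fn config_without_auto_purge() -> Config {
    Config {
        // Logs already included in a snapshot are kept up to this count, so a
        // very large value means openraft practically never purges on its own.
        max_in_snapshot_log_to_keep: u64::MAX,
        ..Default::default()
    }
}

// The application can then purge on demand once it decides enough log has
// accumulated, e.g.:
//
//     raft.trigger().purge_log(upto_log_index).await?;
```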
Currently, `openraft` instructs the log storage to purge the log after a snapshot. If a lagging replica reconnects afterwards, this snapshot is sent instead of the log.

However, it's not always desirable to do so. Say the snapshot is big, but frequent. Then we can keep more log to be able to recover the lagging replica significantly more efficiently. The state machine can decide how much log to actually keep, balancing log replay time (and less network traffic) against snapshot replication (potentially less work on the follower, but much more network traffic). In our project, we'd like to keep some percentage of the log (say, 20% of the snapshot size) to be able to recover lagging replicas from the log.
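To make the trade-off concrete, here is a hypothetical application-side helper (not an `openraft` API) that picks a purge point so that roughly a chosen fraction of the snapshot size remains available as log:

```rust
/// Hypothetical helper: choose the log index up to which to purge, keeping
/// roughly `keep_ratio` of the snapshot size worth of log entries behind the
/// snapshot. All inputs are application-provided estimates.
fn purge_upto(
    snapshot_last_log_index: u64, // last log index covered by the snapshot
    snapshot_size_bytes: u64,     // size of the latest snapshot
    avg_entry_size_bytes: u64,    // estimated average size of one log entry
    keep_ratio: f64,              // e.g. 0.2 to keep ~20% of the snapshot size
) -> u64 {
    let keep_bytes = (snapshot_size_bytes as f64 * keep_ratio) as u64;
    let keep_entries = keep_bytes / avg_entry_size_bytes.max(1);
    snapshot_last_log_index.saturating_sub(keep_entries)
}
```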
There are two approaches I can think of for using the still-existing log (rough sketches follow below):

- the `purge()` call can return the log ID to which the log was actually purged
- before calling `purge()`, ask the storage whether there is a log ID which should be kept, and adjust the purge position accordingly

Both listed approaches are relatively straightforward for the implementors of the storage, since there is a well-defined point at which to purge logs and to decide how much log to keep. OTOH, there are some assertions in `openraft` which may become invalid when implementing the first solution. The remaining alternative, keeping the log in the storage without any `openraft` change, requires the storage implementation to keep the log being streamed even in the presence of later purge calls, which is problematic and thus much more complex to implement correctly in the storage.
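A rough sketch of the first approach, using a deliberately simplified trait (plain `u64` indexes and a placeholder error type) rather than the actual `openraft` storage trait:

```rust
// Simplified illustration only; this is not the openraft log storage trait.
trait LogStoreWithPartialPurge {
    type Error;

    /// Purge the log up to (and including) `upto` if possible, and return the
    /// index up to which the log was actually purged. The returned value may
    /// be smaller than `upto` if the storage decides to keep more log.
    async fn purge(&mut self, upto: u64) -> Result<u64, Self::Error>;
}
```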
My preferred solution would be the second one.
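A sketch of the second approach; the trait and helper below do not exist in `openraft` and only illustrate the intended control flow, i.e. asking the storage how much log to keep and lowering the purge position accordingly:

```rust
/// Hypothetical advice interface the log storage could expose.
trait PurgeAdvice {
    /// Lowest log index the storage wants to keep (e.g. to be able to recover
    /// lagging replicas from the log instead of a snapshot), or `None` if the
    /// log may be purged as usual.
    fn keep_log_from(&self) -> Option<u64>;
}

/// How the purge position could be adjusted before calling `purge()`:
/// never purge into the range the storage wants to keep.
fn adjusted_purge_upto(calculated_upto: u64, advice: &impl PurgeAdvice) -> u64 {
    match advice.keep_log_from() {
        Some(keep_from) => calculated_upto.min(keep_from.saturating_sub(1)),
        None => calculated_upto,
    }
}
```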
Opinions?
BTW, regarding purge calls: if I understand it correctly, the snapshot & purge can currently happen while the log is being replicated to a follower, potentially causing the log reader to fail, since the log was purged concurrently. Any take on this?

If so, then we need to implement some sort of postponement of log purging anyway. Probably the second solution is the simplest one - in that case, we have two sources of purge postponement: the current replication state and the state reported by the storage.
Update: I found the `InFlight` handling, so I suppose this question is moot and it works as it should.