tikv / raft-engine

A persistent storage engine for Multi-Raft log
Apache License 2.0
565 stars 88 forks source link

support `truncate` subcommand #175

Closed yuqi1129 closed 2 years ago

yuqi1129 commented 2 years ago

Signed-off-by: yuqi1129 yuqi4733@gmail.com

yuqi1129 commented 2 years ago

@tabokie Can you help to review this PR , unit test will be added later. I have some problem about this issue

  1. Is this PR is on the right way ? if not, can you give me some advice, help wanted
  2. Parameter raft_groups have not been used in unsafe_truncate_with_file_builder, i'm thinking how to use it and i have no idea whether we should use it in the replay method.
  3. In the replay method, I'am using rename original file to tmp and move new file to the original file. this seems to be not atomic and we can't guarantee both is ok
  4. how to deal with the original file if we successfully truncate it? just remove it?
yuqi1129 commented 2 years ago
  1. I can't clearly understand the truncate mode front, if a file corrupted at beginning, we can't get the position where corrupt contents ends and normal logs starts
codecov[bot] commented 2 years ago

Codecov Report

Merging #175 (517f3b7) into master (7842aad) will decrease coverage by 0.16%. The diff coverage is 90.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #175      +/-   ##
==========================================
- Coverage   96.97%   96.80%   -0.17%     
==========================================
  Files          26       27       +1     
  Lines        6703     7172     +469     
==========================================
+ Hits         6500     6943     +443     
- Misses        203      229      +26     
Impacted Files Coverage Δ
src/file_pipe_log/mod.rs 97.34% <ø> (ø)
src/lib.rs 100.00% <ø> (ø)
src/truncate.rs 85.76% <85.76%> (ø)
src/engine.rs 96.60% <88.88%> (+1.90%) :arrow_up:
tests/failpoints/test_engine.rs 99.08% <96.94%> (-0.92%) :arrow_down:
src/consistency.rs 91.11% <100.00%> (ø)
src/file_pipe_log/pipe.rs 99.11% <100.00%> (+<0.01%) :arrow_up:
src/file_pipe_log/pipe_builder.rs 91.93% <100.00%> (+0.08%) :arrow_up:
src/memtable.rs 99.51% <100.00%> (+<0.01%) :arrow_up:
src/pipe_log.rs 94.73% <100.00%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 7842aad...517f3b7. Read the comment docs.

tabokie commented 2 years ago

truncate recovers the data files to a bootable, consistent state. For a raft group with a missing log entry, we can accomplish that either by removing all the entries older than it ("front" mode), or removing all entries newer than it ("back" mode), or removing the whole raft group ("all" mode).

And yes, raft group id is needed, because the user might adopt different modes for different groups.

The code you just submit isn't good enough. You have to identify the "hole" in log entries by maintaining a map (id -> last_index) in TruncateMachine.

About the file rename issue, you can directly rename from the new filename to the original filename. This operation is atomic, and the old file is deleted.

yuqi1129 commented 2 years ago

@tabokie

truncate recovers the data files to a bootable, consistent state. For a raft group with a missing log entry, we can accomplish that either by removing all the entries older than it ("front" mode), or removing all entries newer than it ("back" mode), or removing the whole raft group ("all" mode).

The unit is a file or a raft groups? , If we meet corruption in a file and we choose front mode, Does this mean that we should drop all the message in the file that older than the corrupted message? Another issue. Truncate action needs to guarantee that files name stay the same ? that is if there are 2 files A.write B.write After we execute the truncate command. there should be 2 files and one is A.write thats contains the content of A.write before? or we can merge this two file into one says C.write. I'am very puzzled at this point.

And yes, raft group id is needed, because the user might adopt different modes for different groups.

If user only interested in some raft groups, that means we only need to care about that groups? what if corruptions lies in other raft groups when we visited log files, how do we handle this scenarios

The code you just submit isn't good enough. You have to identify the "hole" in log entries by maintaining a map (id -> last_index) in TruncateMachine.

You mean we should check whether the log index is sequential, that is some log has lost. For example, log index of a raft groups is 1, 2,3, 5 and 4 is lost, if the mode is front, we should drop 1,2,3 ?

I'am looking forward your replay, thanks

tabokie commented 2 years ago
yuqi1129 commented 2 years ago

Physical corruptions are already handled in replay method (via TolerateAnyCorruption mode), logical corruptions in unrelated raft groups will not be detected or handled.

That is exactly what makes me confused, as you have mention, Physical corruption have been handle, so truncate subcomand only needs to handle logical corruption. Because if there is physical corruption in a file, we can do nothing but drop the rest of it. As for logical corruption, say the log of one raft groups in a file is: 1,2, 3 ,4 6, 7 ,8 (5 is missing)

That' is quite clear, @tabokie Is there anything wrong with my unstanding?

tabokie commented 2 years ago

@yuqi1129 Your understanding is correct.

yuqi1129 commented 2 years ago

@tabokie Can you take time to review this PR?

yuqi1129 commented 2 years ago

@tabokie All discussion has been solved, can you help to view again?

yuqi1129 commented 2 years ago

@tabokie A new commit has been pushed, if you have time, please to take a look to the latest code, thank you.

tabokie commented 2 years ago

Closed in favor of #182