Closed yuqi1129 closed 2 years ago
@tabokie Can you help to review this PR , unit test will be added later. I have some problem about this issue
raft_groups
have not been used in unsafe_truncate_with_file_builder
, i'm thinking how to use it and i have no idea whether we should use it in the replay
method.front
, if a file corrupted at beginning, we can't get the position where corrupt contents ends and normal logs startsMerging #175 (517f3b7) into master (7842aad) will decrease coverage by
0.16%
. The diff coverage is90.82%
.
@@ Coverage Diff @@
## master #175 +/- ##
==========================================
- Coverage 96.97% 96.80% -0.17%
==========================================
Files 26 27 +1
Lines 6703 7172 +469
==========================================
+ Hits 6500 6943 +443
- Misses 203 229 +26
Impacted Files | Coverage Δ | |
---|---|---|
src/file_pipe_log/mod.rs | 97.34% <ø> (ø) |
|
src/lib.rs | 100.00% <ø> (ø) |
|
src/truncate.rs | 85.76% <85.76%> (ø) |
|
src/engine.rs | 96.60% <88.88%> (+1.90%) |
:arrow_up: |
tests/failpoints/test_engine.rs | 99.08% <96.94%> (-0.92%) |
:arrow_down: |
src/consistency.rs | 91.11% <100.00%> (ø) |
|
src/file_pipe_log/pipe.rs | 99.11% <100.00%> (+<0.01%) |
:arrow_up: |
src/file_pipe_log/pipe_builder.rs | 91.93% <100.00%> (+0.08%) |
:arrow_up: |
src/memtable.rs | 99.51% <100.00%> (+<0.01%) |
:arrow_up: |
src/pipe_log.rs | 94.73% <100.00%> (ø) |
|
... and 3 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 7842aad...517f3b7. Read the comment docs.
truncate
recovers the data files to a bootable, consistent state. For a raft group with a missing log entry, we can accomplish that either by removing all the entries older than it ("front" mode), or removing all entries newer than it ("back" mode), or removing the whole raft group ("all" mode).
And yes, raft group id is needed, because the user might adopt different modes for different groups.
The code you just submit isn't good enough. You have to identify the "hole" in log entries by maintaining a map (id -> last_index) in TruncateMachine
.
About the file rename issue, you can directly rename from the new filename to the original filename. This operation is atomic, and the old file is deleted.
@tabokie
truncate recovers the data files to a bootable, consistent state. For a raft group with a missing log entry, we can accomplish that either by removing all the entries older than it ("front" mode), or removing all entries newer than it ("back" mode), or removing the whole raft group ("all" mode).
The unit is a file or a raft groups? , If we meet corruption in a file and we choose front
mode, Does this mean that we should drop all the message in the file that older than the corrupted message?
Another issue. Truncate
action needs to guarantee that files name stay the same ? that is if there are 2 files
A.write
B.write
After we execute the truncate command. there should be 2 files and one is A.write thats contains the content of A.write
before? or we can merge this two file into one says C.write. I'am very puzzled at this point.
And yes, raft group id is needed, because the user might adopt different modes for different groups.
If user only interested in some raft groups, that means we only need to care about that groups? what if corruptions lies in other raft groups when we visited log files, how do we handle this scenarios
The code you just submit isn't good enough. You have to identify the "hole" in log entries by maintaining a map (id -> last_index) in TruncateMachine.
You mean we should check whether the log index is sequential, that is some log has lost. For example, log index of a raft groups is 1, 2,3, 5 and 4 is lost, if the mode is front
, we should drop 1,2,3 ?
I'am looking forward your replay, thanks
replay
method (via TolerateAnyCorruption
mode), logical corruptions in unrelated raft groups will not be detected or handled.Physical corruptions are already handled in replay method (via TolerateAnyCorruption mode), logical corruptions in unrelated raft groups will not be detected or handled.
That is exactly what makes me confused, as you have mention, Physical corruption have been handle, so truncate
subcomand only needs to handle logical corruption. Because if there is physical corruption in a file, we can do nothing but drop the rest of it. As for logical corruption, say the log of one raft groups in a file is:
1,2, 3 ,4 6, 7 ,8 (5 is missing)
front
, we should keep 1,2,3,4back
we should keep 6,7,8all
, we would drop all logs in the file of a specific raft groups That' is quite clear, @tabokie Is there anything wrong with my unstanding?
@yuqi1129 Your understanding is correct.
@tabokie Can you take time to review this PR?
@tabokie All discussion has been solved, can you help to view again?
@tabokie A new commit has been pushed, if you have time, please to take a look to the latest code, thank you.
Closed in favor of #182
Signed-off-by: yuqi1129 yuqi4733@gmail.com