Closed xiaoxichen closed 3 weeks ago
Attention: Patch coverage is 51.35135% with 18 lines in your changes missing coverage. Please review.
Project coverage is 67.28%. Comparing base (1a0cef8) to head (8ccd8ee). Report is 85 commits behind head on master.
Files with missing lines | Patch % | Lines |
---|---|---|
src/lib/replication/repl_dev/raft_repl_dev.cpp | 45.45% | 17 Missing and 1 partial :warning: |
This commit introduces a mechanism to garbage collect (GC) replication requests (rreqs) that may hang indefinitely, thereby consuming memory and disk resources unnecessarily. These rreqs can enter a hanging state under several circumstances, as outlined below:
Scenario with Delayed Commit:
Scenario with Leader Failure Before Data Completion:
Scenario with Leader Failure After Data Write:
This garbage collection process cleans up based on DSN. Any rreq in `m_repl_key_req_map` whose DSN is already committed (`rreq->dsn < repl_dev->m_next_dsn`) will be GC'd. This is safe on the follower side, as the follower updates `m_next_dsn` during commit. Any DSN below `cur_dsn` should already be committed, implying that the rreq should already have been removed from `m_repl_key_req_map`.

On the leader side, since `m_next_dsn` is updated when sending out the proposal, it is not safe to clean up based on `m_next_dsn`. Therefore, we explicitly skip the leader in this GC process.
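
The DSN-based rule described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in `raft_repl_dev.cpp`: the `Rreq` and `ReplDevSketch` types, the `is_leader` flag, the `gc_stale_rreqs` name, and the integer map key are all hypothetical, and locking is omitted. Only `m_repl_key_req_map`, `m_next_dsn`, `cur_dsn`, and the `dsn < m_next_dsn` comparison come from the description.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for a replication request; only the DSN matters here.
struct Rreq {
    int64_t dsn;
};

// Hypothetical stand-in for the repl dev, reduced to the fields the GC rule uses.
struct ReplDevSketch {
    bool is_leader{false};  // assumed flag; the real role check may look different
    int64_t m_next_dsn{0};  // on a follower, advanced during commit
    std::unordered_map<uint64_t, std::shared_ptr<Rreq>> m_repl_key_req_map;

    // Removes every rreq whose DSN is below the current m_next_dsn.
    // Returns the number of entries removed.
    std::size_t gc_stale_rreqs() {
        // The leader bumps m_next_dsn when it sends a proposal, so a DSN below
        // m_next_dsn is not necessarily committed there. Skip the leader.
        if (is_leader) { return 0; }

        const int64_t cur_dsn = m_next_dsn;  // snapshot once for the whole pass
        std::size_t removed = 0;
        for (auto it = m_repl_key_req_map.begin(); it != m_repl_key_req_map.end();) {
            if (it->second->dsn < cur_dsn) {
                it = m_repl_key_req_map.erase(it);  // erase returns the next valid iterator
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }
};
```

Taking a single snapshot of `m_next_dsn` into `cur_dsn` keeps the cutoff fixed for the duration of one pass, so an rreq that commits mid-scan is simply picked up by the next pass.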