Closed WenyXu closed 1 week ago
The updates enhance the handling of failure detectors for region migrations. Specifically, the `register_failure_detectors` method now accommodates failed regions, and a new `deregister_failure_detectors` method has been introduced. The parameters for these methods are now based on `to_peer` instead of `from_peer`. Additionally, the control flow in `upgrade_candidate_region.rs` has been adjusted by adding an awaited call to `deregister_failure_detectors()` before opening the region guard.
| Files & Changes | Summary |
|---|---|
| `src/meta-srv/src/procedure/region_migration.rs` | Modified `register_failure_detectors` and added `deregister_failure_detectors` to handle failure detectors. |
| `src/.../region_migration/update_metadata/upgrade_candidate_region.rs` | Added `deregister_failure_detectors()` call before opening the region guard, affecting control flow. |
sequenceDiagram
autonumber
participant Context
participant RegionMigration
participant FailureDetector
participant MetadataUpdater
Context ->> RegionMigration: startMigration()
RegionMigration ->> Context: registerFailureDetectors(to_peer)
Context ->> FailureDetector: register(to_peer)
Note over RegionMigration, Context: Migration process...
RegionMigration ->> MetadataUpdater: updateMetadata()
MetadataUpdater ->> Context: deregisterFailureDetectors()
Context ->> FailureDetector: deregister(to_peer)
MetadataUpdater ->> RegionMigration: openRegionGuard()
Note over RegionMigration: Continue migration...
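The ordering shown in the diagram can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `DetectorRegistry`, `Step`, and `migrate` are hypothetical names invented here, not types or functions from `meta-srv`, and the sketch is synchronous even though the real `deregister_failure_detectors()` call is awaited.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for illustration; not the GreptimeDB types.
pub type PeerId = u64;
pub type RegionId = u64;

/// A toy registry of active failure detectors, keyed by (peer, region).
#[derive(Default)]
pub struct DetectorRegistry {
    active: HashSet<(PeerId, RegionId)>,
}

impl DetectorRegistry {
    /// Start watching `region` on `peer`.
    pub fn register(&mut self, peer: PeerId, region: RegionId) {
        self.active.insert((peer, region));
    }

    /// Stop watching `region` on `peer`; returns true if an entry existed.
    pub fn deregister(&mut self, peer: PeerId, region: RegionId) -> bool {
        self.active.remove(&(peer, region))
    }

    /// Whether a detector is currently active for this pair.
    pub fn is_watched(&self, peer: PeerId, region: RegionId) -> bool {
        self.active.contains(&(peer, region))
    }
}

/// Steps recorded so that the ordering can be inspected by a caller.
#[derive(Debug, PartialEq)]
pub enum Step {
    Registered,
    Deregistered,
    GuardOpened,
}

/// Assumed flow: register for `to_peer`, do the migration work,
/// then deregister before the region guard is opened.
pub fn migrate(registry: &mut DetectorRegistry, to_peer: PeerId, region: RegionId) -> Vec<Step> {
    let mut steps = Vec::new();

    registry.register(to_peer, region);
    steps.push(Step::Registered);

    // ... migration work would happen here ...

    registry.deregister(to_peer, region);
    steps.push(Step::Deregistered);

    // Placeholder for opening the region guard in the metadata update step.
    steps.push(Step::GuardOpened);
    steps
}
```

Calling `migrate` with a fresh registry yields the three steps in the order of the diagram and leaves no detector registered for the `(to_peer, region)` pair.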
Attention: Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.
Project coverage is 84.88%. Comparing base (b1219fa) to head (9ec0344). Report is 5 commits behind head on main.
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
What's changed and what's your
The original failure detectors of the failed region were removed once the procedure was triggered. However, the `from_peer` may still send heartbeats that contain the failed region. We need to remove these detectors manually to reduce the false positive rate of failure detection.
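To make the false-positive scenario concrete, here is a minimal sketch. It rests on two loudly stated assumptions that the PR text does not confirm: that an incoming heartbeat creates a detector entry if none exists, and that detection is a fixed timeout (chosen here for brevity, not a description of the actual detector). `TimeoutDetectors` and its methods are hypothetical names.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration; not the GreptimeDB types.
pub type PeerId = u64;
pub type RegionId = u64;

/// A toy timeout-based detector table: (peer, region) -> last heartbeat time in ms.
#[derive(Default)]
pub struct TimeoutDetectors {
    last_seen_ms: HashMap<(PeerId, RegionId), u64>,
}

impl TimeoutDetectors {
    /// ASSUMPTION: a heartbeat creates the entry if it does not exist yet,
    /// so a late heartbeat can bring back a detector that was already removed.
    pub fn on_heartbeat(&mut self, peer: PeerId, region: RegionId, now_ms: u64) {
        self.last_seen_ms.insert((peer, region), now_ms);
    }

    /// Explicitly drop the detector for this (peer, region) pair.
    pub fn deregister(&mut self, peer: PeerId, region: RegionId) {
        self.last_seen_ms.remove(&(peer, region));
    }

    /// Pairs whose last heartbeat is older than `timeout_ms`, in sorted order.
    pub fn suspects(&self, now_ms: u64, timeout_ms: u64) -> Vec<(PeerId, RegionId)> {
        let mut out = Vec::new();
        for (key, seen) in &self.last_seen_ms {
            if now_ms.saturating_sub(*seen) > timeout_ms {
                out.push(*key);
            }
        }
        out.sort();
        out
    }
}
```

In this model, a single late heartbeat from the `from_peer` recreates an entry that then goes stale and is reported as a suspect once the region has moved away; an explicit `deregister` call clears it.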