uber / cadence

Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logic in a scalable and resilient way.
https://cadenceworkflow.io
MIT License

Fix rebalancing failover tooling #6095

Closed fimanishi closed 3 weeks ago

fimanishi commented 4 weeks ago

What changed? Domains whose preferredCluster is not present in the cluster list of their ReplicationConfiguration are now excluded from rebalancing.

Why? Domains whose preferredCluster is not in their cluster list caused the workflow to panic when it tried to acquire the client for that preferredCluster from the remoteFrontendClients.
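The exclusion described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `domainInfo`, `canRebalance`, and `filterRebalanceable` are hypothetical names, and the struct is a simplified stand-in for the real domain and replication configuration types.

```go
package main

// domainInfo is a hypothetical, simplified stand-in for the domain data the
// rebalance workflow inspects; it is not the actual Cadence type.
type domainInfo struct {
	Name             string
	PreferredCluster string   // empty when no preferred cluster is set
	Clusters         []string // cluster names from the replication configuration
}

// canRebalance reports whether a domain has a preferred cluster set and
// whether that cluster also appears in the domain's own cluster list.
func canRebalance(d domainInfo) bool {
	if d.PreferredCluster == "" {
		return false
	}
	for _, c := range d.Clusters {
		if c == d.PreferredCluster {
			return true
		}
	}
	return false
}

// filterRebalanceable keeps only the domains that pass canRebalance, so a
// later client lookup by preferred cluster is never attempted for a cluster
// the domain is not replicated to.
func filterRebalanceable(domains []domainInfo) []domainInfo {
	var out []domainInfo
	for _, d := range domains {
		if canRebalance(d) {
			out = append(out, d)
		}
	}
	return out
}
```

Filtering before any per-cluster client lookup means the skipped domains are simply left untouched rather than failing the whole workflow run.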

How did you test it? unit tests and local replication tests

Potential risks The workflow was generally not working before, because having a preferredCluster that is not in the domain's cluster list is a normal scenario. The risk introduced is that the workflow now actually executes and may do something unexpected. For now, we only rebalance domains that have a preferredCluster set and that can be rebalanced back to that preferredCluster.

Release notes

Documentation Changes

codecov[bot] commented 4 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 69.15%. Comparing base (f0f7efd) to head (394650a). Report is 11 commits behind head on master.

Additional details and impacted files

| [Files](https://app.codecov.io/gh/uber/cadence/pull/6095?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=uber) | Coverage Δ | |
|---|---|---|
| [...rvice/worker/failovermanager/rebalance\_workflow.go](https://app.codecov.io/gh/uber/cadence/pull/6095?src=pr&el=tree&filepath=service%2Fworker%2Ffailovermanager%2Frebalance_workflow.go&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=uber#diff-c2VydmljZS93b3JrZXIvZmFpbG92ZXJtYW5hZ2VyL3JlYmFsYW5jZV93b3JrZmxvdy5nbw==) | `100.00% <100.00%> (ø)` | |

... and [15 files with indirect coverage changes](https://app.codecov.io/gh/uber/cadence/pull/6095/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=uber)

> **Legend**
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Last update [f0f7efd...394650a](https://app.codecov.io/gh/uber/cadence/pull/6095?dropdown=coverage&src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=uber).