3scale-ops / saas-operator

3scale SaaS Operator - www.3scale.net
Apache License 2.0

Avoid failover delays in twemproxy reconfig when using master targets #277

Closed roivaz closed 1 year ago

roivaz commented 1 year ago

It seems that, although the "sentinel master <name>" command returns updated information about the master as soon as a slave is promoted, there is one exception: the sentinel instance that acts as "failover leader". The failover leader is the instance that actually performs the shard reconfiguration, and its "sentinel master <name>" command only returns updated information once all the slaves have also been reconfigured to point to the new master. This delays twemproxy reconfiguration whenever the twemproxyconfig controller uses that instance to "discover" the shard. To detect this situation, add a step that checks the master address with the "sentinel get-master-addr-by-name <name>" command, which always returns updated master information, even when querying the leader sentinel instance.
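As a rough illustration of the extra check (not the code in this PR), the sketch below compares the address reported by "sentinel master" with the one reported by "sentinel get-master-addr-by-name" and trusts the latter. The `SentinelClient` interface, `discoverMaster`, `fakeSentinel`, and the shard name and IPs are all hypothetical names and values made up for the example; only the `ip`/`port` reply fields and the two-element `[ip, port]` reply come from the Redis Sentinel command replies.

```go
package main

import (
	"fmt"
	"net"
)

// SentinelClient is a hypothetical abstraction over one sentinel connection.
// It is NOT the operator's real client type.
type SentinelClient interface {
	// Master returns the field/value pairs of "sentinel master <name>".
	Master(name string) (map[string]string, error)
	// GetMasterAddrByName returns the [ip, port] reply of
	// "sentinel get-master-addr-by-name <name>".
	GetMasterAddrByName(name string) ([]string, error)
}

// discoverMaster returns the master address given by get-master-addr-by-name
// and reports whether it differs from what "sentinel master" says, which
// would suggest this sentinel is still in the middle of a failover.
func discoverMaster(c SentinelClient, shard string) (string, bool, error) {
	info, err := c.Master(shard)
	if err != nil {
		return "", false, fmt.Errorf("sentinel master %s: %w", shard, err)
	}
	reply, err := c.GetMasterAddrByName(shard)
	if err != nil {
		return "", false, fmt.Errorf("sentinel get-master-addr-by-name %s: %w", shard, err)
	}
	if len(reply) != 2 {
		return "", false, fmt.Errorf("unexpected get-master-addr-by-name reply: %v", reply)
	}
	reported := net.JoinHostPort(info["ip"], info["port"])
	current := net.JoinHostPort(reply[0], reply[1])
	return current, reported != current, nil
}

// fakeSentinel is a canned client used only to exercise the sketch.
type fakeSentinel struct {
	master map[string]string
	addr   []string
}

func (f fakeSentinel) Master(string) (map[string]string, error)     { return f.master, nil }
func (f fakeSentinel) GetMasterAddrByName(string) ([]string, error) { return f.addr, nil }

func main() {
	// Simulates a failover leader: "sentinel master" still shows the old
	// master while get-master-addr-by-name already shows the promoted one.
	leader := fakeSentinel{
		master: map[string]string{"ip": "10.0.0.1", "port": "6379"},
		addr:   []string{"10.0.0.2", "6379"},
	}
	addr, failoverInProgress, err := discoverMaster(leader, "shard01")
	fmt.Println(addr, failoverInProgress, err)
}
```

What the controller does when the two addresses disagree (for example, using the newer address straight away) is left open here; the sketch only shows how the mismatch could be detected.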

This must be merged before #276 as it should be included in the new release.

/kind bug
/priority important-soon
/assign

3scale-robot commented 1 year ago

LGTM label has been added.

Git tree hash: 578f6ba3019e17ab6a204433a2579a70b46a1fdf

roivaz commented 1 year ago

/approve

3scale-robot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: roivaz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/3scale-ops/saas-operator/blob/main/OWNERS)~~ [roivaz]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

roivaz commented 1 year ago

Related to https://github.com/3scale/platform/issues/868