Open Icarus9913 opened 3 weeks ago
Would that be something like:
https://kuma.io/docs/dev/policies/meshcircuitbreaker/
outlierDetection:
panicDisabled: true
interval: 5s
baseEjectionTime: 30s
maxEjectionPercent: 20
splitExternalAndLocalErrors: true
?
Also from looking at the docs I see no mention of panic, we should talk about it (https://github.com/kumahq/kuma-website/issues/1953).
Is there any possible workaround to set configuration in a way that will avoid entering panic?
Is there any possible workaround to set configuration in a way that will avoid entering panic?
During load balancing, Envoy will generally only consider available (healthy or degraded) hosts in an upstream cluster. However, if the percentage of available hosts in the cluster becomes too low, Envoy will disregard health status and balance either amongst all hosts or no hosts.
If you want to avoid entering panic mode, which means the upstream clusters keep in the healthy state and do not turn into unhealth. I guess maybe we just need to disable the Outlier detection ? And the upstream cluster would keep in health even they returned lots of failures.
For now, I can just only use the MeshProxyPatch to disable the enovy-panic-threshold
apiVersion: kuma.io/v1alpha1
kind: MeshProxyPatch
metadata:
name: circuit-breaker
namespace: kuma-system
spec:
default:
appendModifications:
- cluster:
jsonPatches:
- op: add
path: /common_lb_config
value:
healthy_panic_threshold: {}
match:
name: httpbin-1_kuma-demo_svc_8000
operation: Patch
targetRef:
kind: Mesh
Do we add envoy panic threshold by default even when you don't use MeshCircuitBreaker? @Icarus9913
I think we have it in MeshHealthCheck policy https://github.com/kumahq/kuma/blob/4d9ec2b5f795006a3e271fbf9a5113ebf0706f46/pkg/plugins/policies/meshhealthcheck/api/v1alpha1/meshhealthcheck.go#L61
@jakubdyszkiewicz I believe it's enabled by default in LoadBalancing of Envoy. And I was wondering if we could implement something like what Ilya put above.
Maybe it's more proper to add a property HealthyPanicThresholdPercent *intstr.IntOrString
like we did for the MeshHealthCheck
policy rather than just use a bool
type to enable/disable it, which lifts the restriction for the user to edit this property and write some explanation for the panic threshold
. (If the user doesn't set it, the envoy would treat it with the default value 50%
)
We should also check the Istio's CircuitBreaker policy properties to check the differences and its source codes implementations.
Description
I noticed that there's an implementation difference between our MeshCircuitBreaker and Istio's (https://github.com/istio/istio/blob/7b2d8b9dacb7989d75b3b3d860988cadb9f5a848/pilot/pkg/networking/core/cluster_traffic_policy.go#L416-L423), maybe we could add one flag to disable panic threshold and it depends on the user's choice.