Open gladiatr72 opened 2 years ago
crickets...
Hey @gladiatr72, have you noticed anything else in the logs in terms of errors?
My first guess would be to check the resource limits (CPU and memory) and resource saturation, as I have seen them cause the controller to misbehave without an apparent reason or a reasonable error message. Are you using the default values?
Would you also be able to share what type of sources (and quantity) you have configured in your setup?
Sure
⇶ k get deployments.apps -n flux-system source-controller -o jsonpath={.spec.template.spec.containers[0].resources} | jq -M .
{
  "limits": {
    "cpu": "1",
    "memory": "1Gi"
  },
  "requests": {
    "cpu": "50m",
    "memory": "64Mi"
  }
}
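To put the numbers above in perspective, here is a minimal sketch that converts the quantity strings to base units and reports how much headroom the limits leave over the requests. The helper names `parse_quantity` and `headroom` are hypothetical, and the parser only covers the handful of suffixes listed in the code, not the full Kubernetes quantity grammar.

```python
from fractions import Fraction

# Subset of Kubernetes quantity suffixes (assumption: enough for the values shown above).
_SUFFIXES = {
    "m": Fraction(1, 1000),               # milli, e.g. "50m" CPU
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,  # binary suffixes
    "k": 10**3, "M": 10**6, "G": 10**9,     # decimal suffixes
}


def parse_quantity(q: str) -> Fraction:
    """Convert a quantity string such as "50m" or "64Mi" to base units."""
    # Try the longer suffixes first so "Mi" is never mistaken for "M".
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if q.endswith(suffix):
            return Fraction(q[: -len(suffix)]) * _SUFFIXES[suffix]
    return Fraction(q)


def headroom(resources: dict) -> dict:
    """Return limit/request ratio per resource name."""
    return {
        name: parse_quantity(resources["limits"][name])
        / parse_quantity(resources["requests"][name])
        for name in resources["limits"]
    }


# Values copied from the jsonpath output above.
resources = {
    "limits": {"cpu": "1", "memory": "1Gi"},
    "requests": {"cpu": "50m", "memory": "64Mi"},
}

for name, ratio in headroom(resources).items():
    print(f"{name}: limit is {ratio}x the request")
```

A large ratio only means the pod may burst well past what the scheduler reserved for it; it says nothing about actual saturation, which still has to be read from metrics.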
The controller is not misbehaving. From the referenced ticket it is working as intended.
⇶ k get deployments.apps
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
charts                    2/2     2            2           153d
helm-controller           2/2     2            2           154d
kustomize-controller      2/2     2            2           154d
notification-controller   2/2     2            2           154d
source-controller         1/2     2            1           154d
flux-system -- 0s -- Warning -- Unhealthy -- pod/source-controller-84dcf54c8-xlq4h -- Readiness probe failed: Get "http://10.64.177.148:9090/": dial tcp 10.64.177.148:9090: connect: connection refused
The issue is that the Flux v2 source-controller is the only Flux v2 controller that uses a failed readiness check to manage leadership, whereas the secondary pods of the kustomize, helm and notification controllers become ready, play a quick leadership game, then slip into a passive state until the next election.
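The difference between the two behaviours can be sketched as a toy model. This is not Flux code: `Replica`, `is_ready` and `ready_column` are hypothetical names, and the model only assumes what the thread shows, namely that a standby source-controller pod fails its readiness probe while standbys of the other controllers pass theirs.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    is_leader: bool


def is_ready(replica: Replica, gate_on_leadership: bool) -> bool:
    """Whether the readiness probe would succeed for this replica.

    gate_on_leadership=True models a controller whose probed endpoint is
    only served by the leader; False models one that is ready regardless.
    """
    return replica.is_leader or not gate_on_leadership


def ready_column(replicas: list, gate_on_leadership: bool) -> str:
    """Mimic the READY column of `kubectl get deployments`."""
    ready = sum(is_ready(r, gate_on_leadership) for r in replicas)
    return f"{ready}/{len(replicas)}"


replicas = [Replica("pod-a", is_leader=True), Replica("pod-b", is_leader=False)]

print("gated on leadership:", ready_column(replicas, gate_on_leadership=True))
print("always ready:       ", ready_column(replicas, gate_on_leadership=False))
```

With two replicas, the gated variant reproduces the `1/2` shown for source-controller, and the ungated variant reproduces the `2/2` shown for the other controllers.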
I understand why those bits might not have been added to the source-controller yet, but #326 leaves one with the impression that it is not being considered as a thing that needs doing.
The other part of what you asked for: the current environment has 3 Git sources and 1 Helm repository with ~2 dozen charts.
If we allowed standby pods to become ready, consumers such as kustomize-controller and helm-controller would not be able to fetch the source artifacts: standby pods don't replicate the storage from the primary, but kube-proxy would randomly route calls to them.
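The routing argument above can be made concrete with a small calculation. It is a sketch under stated assumptions: requests are spread uniformly over the ready endpoints, only the leader holds the artifacts, and the function name `fetch_success_rate` and the pod names are made up for illustration.

```python
from fractions import Fraction


def fetch_success_rate(ready: dict, has_artifacts: set) -> Fraction:
    """Fraction of uniformly routed requests that land on a pod holding the artifacts.

    ready maps pod name -> readiness; only ready pods receive Service traffic.
    """
    endpoints = [name for name, ok in ready.items() if ok]
    if not endpoints:
        return Fraction(0)
    good = sum(1 for name in endpoints if name in has_artifacts)
    return Fraction(good, len(endpoints))


# Only the leader stores artifacts (standbys do not replicate its storage).
has_artifacts = {"leader"}

# Current behaviour: the standby is not ready, so it gets no traffic.
current = fetch_success_rate({"leader": True, "standby": False}, has_artifacts)

# Hypothetical behaviour: the standby is also marked ready.
both_ready = fetch_success_rate({"leader": True, "standby": True}, has_artifacts)

print("standby not ready:", current)
print("standby ready:    ", both_ready)
```

Under these assumptions, marking one standby ready would send half of the artifact fetches to a pod that cannot serve them, which is the failure mode described above.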
in reference to: https://github.com/fluxcd/source-controller/issues/326
Ok, but source-controller is the only flux component that actually does this. I thought that was the whole purpose of the leader election thing...
Personally, I don't care about the prom alerts. Those can be silenced. Filling up the event log with reams of
flux-system -- 0s -- Warning -- Unhealthy -- pod/source-controller-579b5cc8c9-2mvw8 -- Readiness probe failed: Get "http://10.64.183.82:9090/": dial tcp 10.64.183.82:9090: connect:
; however, is not (particularly on managed clusters where event retention is non-configurable, with the lifespan of a mayfly in a hurricane).
helm-controller initial logs:
dxg6n
mcjk4