Closed argamanza closed 2 years ago
This is a bug in k3s; hopefully it will be fixed once k3s catches up with Kubernetes 1.21.
brew upgrade flux
resolved it for me. I guess it was fixed in version 0.13.2 (the version brew installs). I'm not sure whether there are two different versioning schemes for flux.
I used brew install fluxcd/tap/flux
to install it on macOS.
Is there a workaround? I'm seeing it on 1.19.5+k3s2
but I can't upgrade because other components don't support higher versions.
Upgraded to 1.19.10+k3s1 (as far as I can go) and still not working - odd as people above seem to have had luck?
Scratch that - k3os with k3s 1.19.10 works :)
I am seeing a similar issue bootstrapping flux on a k3s 1.21.0 cluster running on Raspberry Pis with 64-bit Ubuntu Server 20 (and 21).
What I get is gitrepository/flux-system is waiting to be reconciled. I bootstrapped from a machine remote from the cluster and github successfully added the ssh key. Unfortunately, it doesn't seem that the key got used by the cluster to be able to read the repo. Is this an indication that the cluster cannot get to github? I can ping from the nodes but I suspect it's internal networking to the cluster that cannot reach out... Any idea on resolving this?
it doesn't seem that the key got used by the cluster to be able to read the repo
Can you be more explicit? What’s the status of your sources? Please post here:
flux get all
flux logs --level=error
flux get all
NAME READY MESSAGE REVISION SUSPENDED
gitrepository/flux-system False waiting to be reconciled False
NAME READY MESSAGE REVISION SUSPENDED
kustomization/flux-system False Source is not ready, artifact not found False
Logs at level = error did not return anything. This did:
flux logs
2021-06-19T04:53:33.301Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2021-06-19T05:03:33.341Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2021-06-19T05:13:33.376Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2021-06-19T05:23:33.421Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2021-06-19T05:33:33.448Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2021-06-19T05:43:33.497Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
Doing a check gives me this:
flux check
► checking prerequisites
✗ flux 0.15.0 <0.15.2 (new version is available, please upgrade)
✔ kubectl 1.21.1 >=1.18.0-0
✔ Kubernetes 1.21.1+k3s1 >=1.16.0-0
► checking controllers
✗ source-controller: deployment not ready
► ghcr.io/fluxcd/source-controller:v0.14.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.13.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.15.0
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.11.0
source-controller: deployment not ready
The controller that does Git operations is crashing on your cluster. I guess you’re using ARM64; you need to upgrade to flux 0.15.2 to fix the crash loop.
Updating did resolve the crashloop. I do get networking errors:
flux logs --level=error
2021-06-19T05:54:13.586Z error GitRepository/flux-system.flux-system - unable to send event POST http://notification-controller/ giving up after 5 attempt(s): Post "http://notification-controller/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021-06-19T05:54:13.587Z error GitRepository/flux-system.flux-system - Reconciler error unable to clone 'ssh://git@github.com/jpconstantineau/flux-homelab', error: dial tcp: lookup github.com on 10.43.0.10:53: read udp 10.42.3.8:55144->10.43.0.10:53: i/o timeout
2021-06-19T05:55:48.815Z error GitRepository/flux-system.flux-system - unable to send event POST http://notification-controller/ giving up after 5 attempt(s): Post "http://notification-controller/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021-06-19T05:55:48.815Z error GitRepository/flux-system.flux-system - Reconciler error unable to clone 'ssh://git@github.com/jpconstantineau/flux-homelab', error: dial tcp: lookup github.com on 10.43.0.10:53: read udp 10.42.3.8:45960->10.43.0.10:53: i/o timeout
2021-06-19T05:57:23.936Z error GitRepository/flux-system.flux-system - unable to send event POST http://notification-controller/ giving up after 5 attempt(s): Post "http://notification-controller/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021-06-19T05:57:23.936Z error GitRepository/flux-system.flux-system - Reconciler error unable to clone 'ssh://git@github.com/jpconstantineau/flux-homelab', error: dial tcp: lookup github.com on 10.43.0.10:53: read udp 10.42.3.8:40379->10.43.0.10:53: i/o timeout
2021-06-19T05:58:59.061Z error GitRepository/flux-system.flux-system - unable to send event POST http://notification-controller/ giving up after 5 attempt(s): Post "http://notification-controller/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021-06-19T05:58:59.062Z error GitRepository/flux-system.flux-system - Reconciler error unable to clone 'ssh://git@github.com/jpconstantineau/flux-homelab', error: dial tcp: lookup github.com on 10.43.0.10:53: read udp 10.42.3.8:42335->10.43.0.10:53: i/o timeout
I guess your k3s CNI or CoreDNS is now broken… try bootstrap with --network-policy=false; if that doesn’t work, then you should investigate why the pods on your cluster can’t reach the DNS.
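One way to check whether pods can reach the cluster DNS is sketched below. This is only a rough helper, not an official Flux or k3s procedure: the function name check_pod_dns and the pod name dns-test are made up for illustration, and it assumes kubectl points at the affected cluster, the busybox:1.28 image can be pulled, and CoreDNS carries the usual k8s-app=kube-dns label.

```shell
# Hypothetical helper: check whether a pod can resolve an external host name.
# Assumes kubectl targets the affected cluster and busybox:1.28 can be pulled.
check_pod_dns() {
  host="${1:-github.com}"
  # List the CoreDNS pods (assumed label: k8s-app=kube-dns) and the nodes they run on.
  kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
  # Start a throwaway pod and resolve the host through the cluster DNS service.
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.28 -- nslookup "$host"
}
```

Calling check_pod_dns github.com from a machine with cluster access should show whether the lookup times out in the same way as in the source-controller logs.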
I just tried re-running bootstrap with the --network-policy=false option and the same error messages came back. Looks like I'll need to look into CoreDNS and figure out how to get DNS outside the cluster... (configure the upstream name servers).
Try removing the old network policies manually first:
$ kubectl delete networkpolicy -n flux-system --all
K3s uses flannel as its CNI, which does not support network policies at all.
I'll give it a try then see if that works. If so, I'll rebuild the test cluster (there is nothing in it) with the network policy turned off.
I'll report back as I am sure others in a similar situation will benefit...
Quick update: Latest release resolved the crash on dns lookup failure. Thanks for that!
I then looked at troubleshooting why my cluster was having DNS problems. The instructions at Rancher were helpful in testing whether specific host and cluster setup steps were problematic. Every time I renamed the hostname from the one in the Raspberry Pi image for Ubuntu 21, there were DNS problems.
I am not exactly sure what resolved the issue, but upgrading to the latest K3S release, published 3 days ago, fixed it.
Closing this as it seems resolved upstream in k3s.
Cannot get this to work in kind. Seeing the same issues:
✔ reconciled sync configuration
◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✗ client rate limiter Wait returned an error: context deadline exceeded
► confirming components are healthy
✗ helm-controller: deployment not ready
✗ kustomize-controller: deployment not ready
✗ notification-controller: deployment not ready
✗ source-controller: deployment not ready
✗ bootstrap failed with 2 health check failure(s)
flux --version
flux version 0.31.0
We use Kubernetes Kind for all our e2e testing, we can't release Flux if those fail. To see why it fails for you, inspect the pods.
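A minimal sketch of what inspecting the pods could look like follows. The function name inspect_flux_pods is hypothetical, and the sketch assumes Flux was installed into the default flux-system namespace and that kubectl points at the kind cluster.

```shell
# Hypothetical helper: gather the usual first-pass diagnostics for the Flux pods.
# Assumes Flux runs in the flux-system namespace unless another one is passed.
inspect_flux_pods() {
  ns="${1:-flux-system}"
  # Pod phase, restart counts and node placement.
  kubectl -n "$ns" get pods -o wide
  # Container states, image pull errors and probe failures.
  kubectl -n "$ns" describe pods
  # Recent events in the namespace, oldest first.
  kubectl -n "$ns" get events --sort-by=.lastTimestamp
  # Last log lines of the controller that performs the Git operations.
  kubectl -n "$ns" logs deploy/source-controller --tail=50
}
```

Running inspect_flux_pods (or inspect_flux_pods some-namespace for a non-default install) should usually be enough to see whether the deployments are stuck on image pulls, scheduling, or crashes.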
Worked after recreating the kind cluster - thanks
I have a k3s cluster working on a Raspberry Pi connected to my home local network. Tried to bootstrap a new GOTK repo using the following command:
The output for the bootstrapping command (notice the "context deadline exceeded" after "waiting for Kustomization "flux-system/flux-system" to be reconciled"):
The logs for the Kustomize Controller expose what the issue might be:
From the logs I can tell that status.snapshot.entries.namespace shouldn't be null for the flux-system kustomization. After testing the same bootstrap procedure on a local machine, using a cluster I provisioned with kind, I can see that the kustomization is indeed missing the status.snapshot data in the K3S cluster, while on my local kind cluster it exists:
K3S@RaspberryPi:
kind@local:
This is also where my debugging process came to a dead end, as I couldn't find a reason why status.snapshot doesn't populate on my K3S@RaspberryPi while it does on Kind@Local using the same bootstrap process.
I believe the fact that the issue only occurs on my Raspberry Pi implies that it might be a networking issue of some kind that prevents the kustomize controller from getting status updates from GitHub, and that I need to handle port forwarding or something similar, but I'm not sure.
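For anyone wanting to repeat the comparison, a rough sketch of reading the field on each cluster is shown here. The function name show_kustomization_snapshot is hypothetical, the status.snapshot field name is taken from the description above and may differ in other Flux versions, and the sketch assumes the Kustomization lives in the flux-system namespace.

```shell
# Hypothetical helper: print status.snapshot of a Flux Kustomization.
# The field name comes from the report above; other Flux versions may differ.
show_kustomization_snapshot() {
  name="${1:-flux-system}"
  # Use the fully qualified resource name to avoid clashes with other CRDs.
  kubectl -n flux-system get kustomizations.kustomize.toolkit.fluxcd.io "$name" \
    -o jsonpath='{.status.snapshot}'
}
```

Running show_kustomization_snapshot once with the kubectl context set to the K3S cluster and once against the kind cluster gives the two outputs to compare; an empty result would correspond to the missing snapshot described above.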