fluxcd / flux2

Open and extensible continuous delivery solution for Kubernetes. Powered by GitOps Toolkit.
https://fluxcd.io
Apache License 2.0

Bootstrap fails, Reconciliation fails but works after some time #4811

Open slaecker opened 4 months ago

slaecker commented 4 months ago

Describe the bug

When bootstrapping Flux 2.3.0, the command sits at 'waiting for Kustomization "flux-system/flux-system" to be reconciled' for some time and finally fails. On the cluster, reconciliation in the kustomize-controller also keeps failing with "connect: connection refused". After more than one hour it suddenly reports "Reconciliation finished" and everything starts to work normally.

Steps to reproduce

  1. Install the Flux CLI 2.3.0 (I used the Arch Linux AUR package)
  2. Run flux bootstrap git against the K3s 1.29.4 cluster
  3. See the bootstrap fail after some time
  4. Check the kustomize-controller pod logs and see that reconciliation fails
  5. Wait for some time
  6. See that after more than an hour reconciliation finishes successfully and everything starts working normally

Expected behavior

I expected flux bootstrap to finish successfully and to start working on the cluster normally right away.

Screenshots and recordings

No response

OS / Distro

Arch Linux (fully upgraded)

Flux version

v2.3.0

Flux check

► checking prerequisites
✔ Kubernetes 1.29.4+k3s1 >=1.28.0-0
► checking version in cluster
✔ distribution: flux-v2.3.0
✔ bootstrapped: true
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v1.0.1
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.38.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.32.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.3.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.3.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.3.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta3
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1
✔ helmreleases.helm.toolkit.fluxcd.io/v2
✔ helmrepositories.source.toolkit.fluxcd.io/v1
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta3
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

Codeberg

Container Registry provider

No response

Additional context

Bootstrap log:

flux bootstrap git \
  --url=ssh://git@codeberg.org/slaecker/fluxcd \
  --branch=main \
  --private-key-file=$HOME/.ssh/id_fluxcd \
  --password=********************* \
  --path=clusters/k3s-cluster-test \
  --components-extra image-reflector-controller,image-automation-controller
► cloning branch "main" from Git repository "ssh://git@codeberg.org/slaecker/fluxcd"
✔ cloned repository
► generating component manifests
✔ generated component manifests
✔ component manifests are up to date
► installing components in "flux-system" namespace
✔ installed components
✔ reconciled components
► determining if source secret "flux-system/flux-system" exists
► generating source secret
✔ public key: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAUM5tZcPAXenEEXYJrdvibxzebfi3EkgXTPIQGOmz6n
Please give the key access to your repository: y
► applying source secret "flux-system/flux-system"
✔ reconciled source secret
► generating sync manifests
✔ generated sync manifests
✔ sync manifests are up to date
► applying sync manifests
✔ reconciled sync configuration
◎ waiting for GitRepository "flux-system/flux-system" to be reconciled
✔ GitRepository reconciled successfully
◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✗ kustomization 'flux-system/flux-system' not ready: 'Reconciliation in progress'
► confirming components are healthy
✔ helm-controller: deployment ready
✔ image-automation-controller: deployment ready
✔ image-reflector-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ all components are healthy
✗ bootstrap failed with 1 health check failure(s): error while waiting for Kustomization to be ready: 'kustomization 'flux-system/flux-system' not ready: 'Reconciliation in progress''

Kustomize-controller log:

{"level":"info","ts":"2024-05-24T10:07:53.143Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2024-05-24T10:07:53.143Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2024-05-24T10:07:53.144Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
{"level":"info","ts":"2024-05-24T10:07:53.144Z","msg":"starting server","name":"health probe","addr":"[::]:9440"}
{"level":"info","ts":"2024-05-24T10:07:53.246Z","logger":"runtime","msg":"attempting to acquire leader lease flux-system/kustomize-controller-leader-election..."}
{"level":"info","ts":"2024-05-24T10:07:53.266Z","logger":"runtime","msg":"successfully acquired lease flux-system/kustomize-controller-leader-election"}
{"level":"info","ts":"2024-05-24T10:07:53.267Z","msg":"Starting EventSource","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","source":"kind source: *v1.Kustomization"}
{"level":"info","ts":"2024-05-24T10:07:53.267Z","msg":"Starting EventSource","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","source":"kind source: *v1beta2.OCIRepository"}
{"level":"info","ts":"2024-05-24T10:07:53.267Z","msg":"Starting EventSource","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","source":"kind source: *v1.GitRepository"}
{"level":"info","ts":"2024-05-24T10:07:53.267Z","msg":"Starting EventSource","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","source":"kind source: *v1beta2.Bucket"}
{"level":"info","ts":"2024-05-24T10:07:53.267Z","msg":"Starting Controller","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization"}
{"level":"info","ts":"2024-05-24T10:07:53.377Z","msg":"Starting workers","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","worker count":4}
{"level":"info","ts":"2024-05-24T10:07:53.395Z","logger":"KubeAPIWarningLogger","msg":"metadata.finalizers: \"finalizers.fluxcd.io\": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers"}
{"level":"info","ts":"2024-05-24T10:07:54.147Z","msg":"Source artifact not found, retrying in 30s","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"e56e295a-9abe-4cd8-8768-8391647f3d84"}
{"level":"info","ts":"2024-05-24T10:08:21.074Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:08:26.077Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:08:36.080Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:08:56.083Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:09:26.087Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:09:56.090Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:10:26.095Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:10:56.099Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:11:26.102Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"info","ts":"2024-05-24T10:11:56.107Z","msg":"request failed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","error":"Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused","method":"GET","url":"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz"}
{"level":"error","ts":"2024-05-24T10:11:56.107Z","msg":"Reconciliation failed after 3m35.064839979s, next try in 10m0s","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"5edd4a48-c4f4-4c6b-8c25-bedd346af378","revision":"main@sha1:c7313dec48a666d0386d25ef24226e84befb1287","error":"failed to download archive: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz giving up after 10 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused"}
{"level":"error","ts":"2024-05-24T10:12:16.125Z","msg":"unable to record event","name":"flux-system","namespace":"flux-system","reconciler kind":"Kustomization","error":"POST http://notification-controller.flux-system.svc.cluster.local./ giving up after 5 attempt(s): Post \"http://notification-controller.flux-system.svc.cluster.local./\": dial tcp 10.43.161.81:80: connect: connection refused"}
...
{"level":"error","ts":"2024-05-24T11:20:51.715Z","msg":"Reconciliation failed after 3m35.095699402s, next try in 10m0s","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"9e4030eb-e28e-420e-98cb-709c2a69606e","revision":"main@sha1:c7313dec48a666d0386d25ef24226e84befb1287","error":"failed to download archive: GET http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz giving up after 10 attempt(s): Get \"http://source-controller.flux-system.svc.cluster.local./gitrepository/flux-system/flux-system/c7313dec48a666d0386d25ef24226e84befb1287.tar.gz\": dial tcp 10.43.89.232:80: connect: connection refused"}
{"level":"info","ts":"2024-05-24T11:30:54.613Z","msg":"server-side apply for cluster definitions completed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"630d02bb-c59e-4d0b-ad73-95d3d9dba438","output":{"CustomResourceDefinition/alerts.notification.toolkit.fluxcd.io":"configured","CustomResourceDefinition/buckets.source.toolkit.fluxcd.io":"configured","CustomResourceDefinition/gitrepositories.source.toolkit.fluxcd.io":"configured","CustomResourceDefinition/helmcharts.source.toolkit.fluxcd.io":"configured","CustomResourceDefinition/helmreleases.helm.toolkit.fluxcd.io":"configured","CustomResourceDefinition/helmrepositories.source.toolkit.fluxcd.io":"configured","CustomResourceDefinition/imagepolicies.image.toolkit.fluxcd.io":"configured","CustomResourceDefinition/imagerepositories.image.toolkit.fluxcd.io":"configured","CustomResourceDefinition/imageupdateautomations.image.toolkit.fluxcd.io":"configured","CustomResourceDefinition/kustomizations.kustomize.toolkit.fluxcd.io":"configured","CustomResourceDefinition/ocirepositories.source.toolkit.fluxcd.io":"configured","CustomResourceDefinition/providers.notification.toolkit.fluxcd.io":"configured","CustomResourceDefinition/receivers.notification.toolkit.fluxcd.io":"configured","Namespace/flux-system":"configured"}}
{"level":"info","ts":"2024-05-24T11:30:55.549Z","msg":"server-side apply completed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"630d02bb-c59e-4d0b-ad73-95d3d9dba438","output":{"ClusterRole/crd-controller-flux-system":"configured","ClusterRole/flux-edit-flux-system":"configured","ClusterRole/flux-view-flux-system":"configured","ClusterRoleBinding/cluster-reconciler-flux-system":"configured","ClusterRoleBinding/crd-controller-flux-system":"configured","Deployment/flux-system/helm-controller":"configured","Deployment/flux-system/image-automation-controller":"configured","Deployment/flux-system/image-reflector-controller":"configured","Deployment/flux-system/kustomize-controller":"configured","Deployment/flux-system/notification-controller":"configured","Deployment/flux-system/source-controller":"configured","GitRepository/flux-system/flux-system":"configured","Kustomization/flux-system/apps":"created","Kustomization/flux-system/flux-system":"configured","Kustomization/flux-system/infra-configs":"created","Kustomization/flux-system/infra-controllers":"created","NetworkPolicy/flux-system/allow-egress":"configured","NetworkPolicy/flux-system/allow-scraping":"configured","NetworkPolicy/flux-system/allow-webhooks":"configured","ResourceQuota/flux-system/critical-pods-flux-system":"configured","Service/flux-system/notification-controller":"configured","Service/flux-system/source-controller":"configured","Service/flux-system/webhook-receiver":"configured","ServiceAccount/flux-system/helm-controller":"configured","ServiceAccount/flux-system/image-automation-controller":"configured","ServiceAccount/flux-system/image-reflector-controller":"configured","ServiceAccount/flux-system/kustomize-controller":"configured","ServiceAccount/flux-system/notification-controller":"configured","ServiceAccount/flux-system/source-controller":"configured"},"revision":"main@sha1:c7313dec48a666d0386d25ef24226e84befb1287"}
{"level":"info","ts":"2024-05-24T11:30:55.622Z","msg":"Reconciliation finished in 3.865901445s, next run in 10m0s","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"630d02bb-c59e-4d0b-ad73-95d3d9dba438","revision":"main@sha1:c7313dec48a666d0386d25ef24226e84befb1287"}
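
The retry timing inside one failed reconciliation can be read off the "request failed" timestamps in the kustomize-controller log above. The sketch below is only an illustration: the log lines are trimmed to their "ts" and "msg" fields, and retry_gaps is a hypothetical helper, not part of Flux. It computes the gaps between consecutive attempts, which look like a wait that doubles from 5s and is capped at 30s. That pattern is an observation from this log, not something taken from the Flux source.

```python
import json
from datetime import datetime

# "request failed" entries from the kustomize-controller log above,
# trimmed to the two fields this sketch needs.
LOG_LINES = [
    '{"ts":"2024-05-24T10:08:21.074Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:08:26.077Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:08:36.080Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:08:56.083Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:09:26.087Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:09:56.090Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:10:26.095Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:10:56.099Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:11:26.102Z","msg":"request failed"}',
    '{"ts":"2024-05-24T10:11:56.107Z","msg":"request failed"}',
]


def retry_gaps(lines):
    """Return the gaps, rounded to whole seconds, between consecutive
    "request failed" entries in a list of JSON log lines."""
    stamps = []
    for line in lines:
        entry = json.loads(line)
        if entry.get("msg") == "request failed":
            stamps.append(datetime.strptime(entry["ts"], "%Y-%m-%dT%H:%M:%S.%fZ"))
    return [round((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


gaps = retry_gaps(LOG_LINES)
print(gaps, sum(gaps))  # [5, 10, 20, 30, 30, 30, 30, 30, 30] 215
```

The nine gaps add up to 215s, which matches the "Reconciliation failed after 3m35..." message logged after the tenth attempt. Each failed reconciliation is then followed by the "next try in 10m0s" wait.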


stefanprodan commented 4 months ago

The "dial tcp 10.43.89.232:80: connect: connection refused" error is not something we can fix in Flux. This is usually caused by CNI/CoreDNS issues.
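
One way to narrow this down is to check name resolution and the TCP connect separately. In the log, the service name resolves to 10.43.89.232 and the connect itself is refused, and the notification-controller service at 10.43.161.81:80 is refused in the same way. The following is a minimal Python sketch, assuming it is run from a pod inside the cluster. probe is a hypothetical helper and not part of Flux.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a connection attempt as "dns-failure", "refused",
    "timeout", "unreachable" or "ok"."""
    try:
        # Step 1: name resolution only (this is the part cluster DNS serves).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"
    address = infos[0][4][:2]  # (ip, port) of the first resolved address
    try:
        # Step 2: plain TCP connect to the resolved address.
        with socket.create_connection(address, timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "unreachable"
```

For example, probe("source-controller.flux-system.svc.cluster.local", 80) returning "refused" while the source-controller deployment is ready would suggest looking at the cluster's service networking rather than at DNS, whereas "dns-failure" would point at name resolution.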