fluxcd / flux2

Open and extensible continuous delivery solution for Kubernetes. Powered by GitOps Toolkit.
Apache License 2.0

flux bootstrap fails kustomization reconciliation step #4194

Closed bgiegel closed 6 months ago

bgiegel commented 1 year ago

Describe the bug

I’m new to Flux. I’m just trying to follow the getting started guide, and I’m stuck at the very beginning.

I ran the following commands:

export GITHUB_USER..
export GITHUB_TOKEN...

flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=tg-test \
  --branch=master \
  --path=./clusters/sks-int \
  --personal

Here is the output:

► connecting to github.com
► cloning branch "master" from Git repository "https://github.com/bgiegel/tg-test.git"
✔ cloned repository
► generating component manifests
✔ generated component manifests
✔ component manifests are up to date
► installing components in "flux-system" namespace
✔ installed components
✔ reconciled components
► determining if source secret "flux-system/flux-system" exists
✔ source secret up to date
► generating sync manifests
✔ generated sync manifests
✔ sync manifests are up to date
► applying sync manifests
✔ reconciled sync configuration
◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✗ client rate limiter Wait returned an error: context deadline exceeded
► confirming components are healthy
✔ helm-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ all components are healthy
✗ bootstrap failed with 1 health check failure(s)

I checked what is deployed on my k8s cluster, and all the necessary components are running:

helm-controller-6c55d4d49c-xfpc4           1/1   Running   0   12m
kustomize-controller-859c949c64-n9jlk      1/1   Running   0   12m
notification-controller-7d7747dd84-8pc76   1/1   Running   0   12m
source-controller-797866b5c6-c8zxh         1/1   Running   0   10m

But when I looked at the logs, I found these errors in the source-controller:

{ "level": "error", "ts": "2023-08-27T10:04:41.621Z", "msg": "failed to configure authentication options: failed to get secret 'flux-system/flux-system': secrets \"flux-system\" not found", "controller": "gitrepository", "controllerGroup": "source.toolkit.fluxcd.io", "controllerKind": "GitRepository", "GitRepository": { "name": "flux-system", "namespace": "flux-system" }, "namespace": "flux-system", "name": "flux-system", "reconcileID": "0174b2c4-d30c-45e6-a524-904aae2bbdc3", "error": "failed to configure authentication options: failed to get secret 'flux-system/flux-system': secrets \"flux-system\" not found" }
{ "level": "error", "ts": "2023-08-27T10:04:56.649Z", "msg": "unable to record event", "name": "flux-system", "namespace": "flux-system", "reconciler kind": "GitRepository", "error": "POST http://notification-controller.flux-system.svc.cluster.local./ giving up after 5 attempt(s): Post \"http://notification-controller.flux-system.svc.cluster.local./\": dial tcp: lookup notification-controller.flux-system.svc.cluster.local. on no such host" }
{ "level": "error", "ts": "2023-08-27T10:04:56.664Z", "msg": "Reconciler error", "controller": "gitrepository", "controllerGroup": "source.toolkit.fluxcd.io", "controllerKind": "GitRepository", "GitRepository": { "name": "flux-system", "namespace": "flux-system" }, "namespace": "flux-system", "name": "flux-system", "reconcileID": "0174b2c4-d30c-45e6-a524-904aae2bbdc3", "error": "failed to configure authentication options: failed to get secret 'flux-system/flux-system': secrets \"flux-system\" not found" }
{ "level": "error", "ts": "2023-08-27T10:05:14.660Z", "msg": "unable to record event", "name": "flux-system", "namespace": "flux-system", "reconciler kind": "GitRepository", "error": "POST http://notification-controller.flux-system.svc.cluster.local./ giving up after 5 attempt(s): Post \"http://notification-controller.flux-system.svc.cluster.local./\": dial tcp: lookup notification-controller.flux-system.svc.cluster.local. on no such host" }

I checked, and the secret exists:

flux-system Opaque 3 13m

So I tried restarting the source-controller pod, and there were no errors after the restart. But the flux bootstrap command still fails, even when I re-run it after the source-controller restart.
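
For anyone hitting the same failure, the objects that bootstrap waits on can be inspected directly. The following is only a rough sketch: the function name `flux_bootstrap_status` is made up for illustration, and it assumes the default `flux-system` namespace and object names shown in the output above.

```shell
# Hypothetical helper (not part of Flux): print the state of the objects
# that `flux bootstrap` waits on. Assumes the default "flux-system" names.
flux_bootstrap_status() {
  status_ns="${1:-flux-system}"

  # The secret that the GitRepository uses for authentication.
  kubectl -n "$status_ns" get secret flux-system

  # Ready condition and message of the Git source and the Kustomization.
  flux get sources git -n "$status_ns"
  flux get kustomizations -n "$status_ns"

  # Error-level log entries from the Flux controllers.
  flux logs --level=error
}
```

Calling `flux_bootstrap_status` with no argument checks the `flux-system` namespace; the message column of the Git source and Kustomization usually says which step is not ready.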

Steps to reproduce

kubectl version

Client Version: v1.26.0
Kustomize Version: v4.5.7
Server Version: v1.26.7

flux version

flux: v2.1.0
helm-controller: v0.36.0
kustomize-controller: v1.1.0
notification-controller: v1.1.0
source-controller: v1.1.0

Run the command:

flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=tg-test \
  --branch=master \
  --path=./clusters/sks-int \
  --personal

The GitHub repository is created successfully and all components are deployed, but the command never completes.

Expected behavior

I run the command and it finishes without error, with no errors in the logs of any component.

Screenshots and recordings

No response

OS / Distro

Mac OS Ventura 13.5.1

Flux version


Flux check

► checking prerequisites
✔ Kubernetes 1.26.7 >=1.25.0-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.36.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.1.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.1.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.1.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta2
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider


Container Registry provider

No response

Additional context

No response

zelogik commented 1 year ago

Hello, I have the same problem. Have you found a solution?

It seems like a CoreDNS misconfiguration, but DNS lookups from a test pod give:

kubectl exec -ti busybox -- nslookup gitlab.com
Address 1: kube-dns.kube-system.svc.cluster.local

Name:      gitlab.com
Address 1: 2606:4700:90:0:f22e:fbec:5bed:a9b9
Address 2:

kubectl exec -ti busybox -- nslookup kubernetes
Address 1: kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: kubernetes.default.svc.cluster.local

After playing with DNS/CoreDNS/resolv.conf, I got either the same error as you or one of these:

tls: failed to verify certificate: x509: certificate is valid for "xxxxxx my internal DNS/reverse IP?", not gitlab.com
tls: failed to verify certificate: x509: certificate is valid for "xxx cloudflare DNS", not gitlab.com

chinmay90kulkarni commented 1 year ago

Is this issue resolved? I’m getting the same issue too. :(

zelogik commented 1 year ago

Not resolved on my side, and I haven’t got any further. I’m thinking about changing GitOps tool or, when I have time, trying with k3s instead of a k8s cluster.

Maybe it is a flannel/calico problem, or CoreDNS, even though everything except Flux CD is working.

jtorrex commented 1 year ago

Same issue here!

stefanprodan commented 6 months ago

A dial tcp: lookup error can be due to a CNI or DNS misconfiguration; there is nothing we can do in Flux about it.
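
A rough way to narrow down such a lookup error is to check that the Service exists, that the cluster DNS pods are running, and that the Service name resolves from inside the cluster. The sketch below is an illustration only: the function name `check_flux_dns`, the pod name `dns-check`, the `busybox:1.36` image, and the `k8s-app=kube-dns` label are assumptions and may differ on your cluster.

```shell
# Hypothetical helper: check in-cluster DNS for the name that
# source-controller failed to resolve in the logs above.
check_flux_dns() {
  dns_ns="${1:-flux-system}"

  # The Service behind the failing host name must exist.
  kubectl -n "$dns_ns" get service notification-controller

  # Cluster DNS pods; the label is an assumption that holds on many
  # distributions but not necessarily on yours.
  kubectl -n kube-system get pods -l k8s-app=kube-dns

  # Resolve the Service name from a throwaway pod, then delete the pod.
  kubectl -n "$dns_ns" run dns-check --rm -i --restart=Never \
    --image=busybox:1.36 -- \
    nslookup "notification-controller.${dns_ns}.svc.cluster.local"
}
```

If `check_flux_dns` fails at the last step while the Service exists, the problem is most likely in the cluster DNS or CNI setup rather than in Flux.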

Antik9421 commented 4 months ago

I found the cause of this problem. I used a different GitHub account that did not have admin access, so Flux could not create the deploy key in the GitHub account:

✔ reconciled components
► determining if source secret "flux-system/flux-system" exists
► generating source secret
✔ public key: ecdsa-sha2-nistp384 ******
✗ multiple errors occurred: 
- POST https://api.github.com/repos/<user>/flux-infra/keys: 404 Not Found []
- the requested resource was not found
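
Before re-running bootstrap, the account's access to the repository can be checked through the GitHub REST API, whose repository response includes a `permissions` object for authenticated requests. This is a minimal sketch under assumptions: the function names `has_admin_permission` and `check_repo_admin` are made up for illustration, the JSON matching is a crude text match rather than a real parser, and the result reflects the account's permission on the repository, not necessarily the scopes of the token.

```shell
# Succeeds only if the JSON on stdin contains "admin": true, as found in the
# "permissions" object of a repository response. Crude text match, not a parser.
has_admin_permission() {
  tr -d ' \n\t' | grep -q '"admin":true'
}

# Hypothetical helper: query the repository with $GITHUB_TOKEN and report
# whether the authenticated account has admin access to it.
check_repo_admin() {
  repo_owner="$1"
  repo_name="$2"
  if curl -fsS \
      -H "Authorization: Bearer ${GITHUB_TOKEN}" \
      -H "Accept: application/vnd.github+json" \
      "https://api.github.com/repos/${repo_owner}/${repo_name}" \
      | has_admin_permission; then
    echo "admin access: yes"
  else
    echo "admin access: no (or the repository is not visible to this token)"
  fi
}
```

For example, `check_repo_admin "$GITHUB_USER" flux-infra` checks the repository from the output above, assuming `GITHUB_USER` and `GITHUB_TOKEN` are exported as in the bootstrap command.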