Closed bpinter closed 4 years ago
If you're able to, can you try the image `fluxcd/flux-prerelease:master-64092ddd`, which (I allege) has a fix for this. Hence or otherwise, I believe this is caused by supplying a `--git-path` argument that doesn't correspond to a directory in the repo, so until there's a released fix, you could look into that.
Fixing a typo in a `--git-path` directory name resolved this issue for me.
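The check suggested above (that every `--git-path` entry names a directory in the repo) can be run against a local clone before deploying. This is a minimal Python sketch, not part of flux: the function name is hypothetical, and flagging entries with a leading slash is an assumption based on advice given later in this thread.

```python
from pathlib import Path


def suspect_git_paths(checkout_root, git_path_arg):
    """Return --git-path entries that look wrong for the given checkout."""
    root = Path(checkout_root)
    suspects = []
    for entry in git_path_arg.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty items such as a trailing comma
        # Assumption: entries should be relative to the repository root,
        # so a leading slash is flagged rather than silently accepted.
        if entry.startswith("/") or not (root / entry).is_dir():
            suspects.append(entry)
    return suspects
```

For example, `suspect_git_paths("/path/to/clone", "flux_root")` returns an empty list only if `flux_root` is a directory at the top of that clone.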
This appears to still be an issue even with that prerelease.
└─[$] kubectl describe -n flux deploy flux [0:22:42]
Name: flux
Namespace: flux
CreationTimestamp: Sat, 18 Jul 2020 22:15:37 -0500
Labels: app=flux
app.kubernetes.io/managed-by=Helm
chart=flux-1.4.0
heritage=Helm
release=flux
Annotations: deployment.kubernetes.io/revision: 6
meta.helm.sh/release-name: flux
meta.helm.sh/release-namespace: flux
Selector: app=flux,release=flux
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=flux
release=flux
Service Account: flux
Containers:
flux:
Image: docker.io/fluxcd/flux-prerelease:master-64092ddd
Port: 3030/TCP
Host Port: 0/TCP
Args:
--log-format=fmt
--ssh-keygen-dir=/var/fluxd/keygen
--ssh-keygen-format=RFC4716
--k8s-secret-name=flux-git-deploy
--memcached-hostname=flux-memcached
--sync-state=secret
--memcached-service=
--git-url=git@github.com:dataplex/gitops-istio
--git-branch=master
--git-path=flux_root
--git-readonly=false
--git-user=Weave Flux
--git-email=support@weave.works
--git-verify-signatures=false
--git-set-author=false
--git-poll-interval=1m
--git-timeout=20s
--sync-interval=1m
--git-ci-skip=false
--automation-interval=1m
--registry-rps=200
--registry-burst=125
--registry-trace=false
--sync-garbage-collection=true
Requests:
cpu: 50m
memory: 64Mi
Liveness: http-get http://:3030/api/flux/v6/identity.pub delay=5s timeout=5s period=10s #success=1 #failure=3
Readiness: http-get http://:3030/api/flux/v6/identity.pub delay=5s timeout=5s period=10s #success=1 #failure=3
Environment:
KUBECONFIG: /root/.kubectl/config
Mounts:
/etc/fluxd/ssh from git-key (ro)
/root/.kubectl from kubedir (rw)
/var/fluxd/keygen from git-keygen (rw)
Volumes:
kubedir:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: flux-kube-config
Optional: false
git-key:
Type: Secret (a volume populated by a Secret)
SecretName: flux-git-deploy
Optional: false
git-keygen:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: flux-7dc59879dd (1/1 replicas created)
NewReplicaSet: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 47m deployment-controller Scaled down replica set flux-6866668986 to 0
Normal ScalingReplicaSet 39m (x2 over 48m) deployment-controller Scaled up replica set flux-9cd77b5b to 1
Normal ScalingReplicaSet 35m deployment-controller Scaled up replica set flux-bf6895777 to 1
Normal ScalingReplicaSet 35m (x2 over 40m) deployment-controller Scaled down replica set flux-9cd77b5b to 0
Normal ScalingReplicaSet 23m deployment-controller Scaled up replica set flux-6f46d695d7 to 1
Normal ScalingReplicaSet 11m deployment-controller Scaled down replica set flux-bf6895777 to 0
Normal ScalingReplicaSet 9m37s deployment-controller Scaled up replica set flux-7dc59879dd to 1
Normal ScalingReplicaSet 9m21s deployment-controller Scaled down replica set flux-6f46d695d7 to 0
Flag --git-verify-signatures has been deprecated, changed to --git-verify-signatures-mode, use that instead
ts=2020-07-19T05:13:28.366952151Z caller=main.go:259 version=master-64092ddd
ts=2020-07-19T05:13:28.366995671Z caller=main.go:412 msg="using kube config: \"/root/.kube/config\" to connect to the cluster"
ts=2020-07-19T05:13:28.384131745Z caller=main.go:492 component=cluster identity=/etc/fluxd/ssh/identity
ts=2020-07-19T05:13:28.384168868Z caller=main.go:493 component=cluster identity.pub="ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCoBxvgesyv49+sBXGTpBvWzNDr9+jJMNLnI224knBJPYHZTgqjfSEKk2BrFlHkD7PqppXYYE4+Ei9G4EwPNfUFbCX2KFSWWHn8KFvp/utV1YwlCFXTQkKkzJBH33UaJVIrfJNcaZS+z0NxxECCJmUEEQiPqsZjuOINwSEg3Q5CpW+cHrFYzBl251U4PqO2y7Dly8mH4LqqBYgEGyYBCTVOaJuUAWR8Ru1lRDsro12ZoznjRR9IGBzycrOcqBz4BXb3th2jwtmu/x+hDQ1lABbwPD+fhDD5S5Ls2SKFNH8JV4p2OJALvC90pLVOyDTMbYQysKtxKp4aMndaXIvqWKjDQB4t6PTwgXMpCjrlQmzGi2ajliuy0P3u+FY3ihL2yYiBKNzuq6TP/C9dO/dNfo5d0lPPIMeYOoi/qGPtjhTzDDaVacSPyp3f8wR06KA96gI+B34kadCbB/GETniqP++ybdtU6qrOEX68rIkKCxoXJNDEK1KfVxGJeiTqAHYyIwE= root@flux-6675c954d4-zdwst"
ts=2020-07-19T05:13:28.384206014Z caller=main.go:498 host=https://10.100.0.1:443 version=kubernetes-v1.16.8-eks-fd1ea7
ts=2020-07-19T05:13:28.384265748Z caller=main.go:510 kubectl=/usr/local/bin/kubectl
ts=2020-07-19T05:13:28.384968602Z caller=main.go:527 ping=true
ts=2020-07-19T05:13:28.385846213Z caller=main.go:666 url=ssh://git@github.com/dataplex/gitops-istio user="Weave Flux" email=support@weave.works signing-key= verify-signatures-mode=none sync-tag=flux-sync state=secret readonly=false registry-disable-scanning=false notes-ref=flux set-author=false git-secret=false sops=false
ts=2020-07-19T05:13:28.390140296Z caller=main.go:772 upstream="no upstream URL given"
ts=2020-07-19T05:13:28.3906428Z caller=main.go:795 addr=:3030
ts=2020-07-19T05:13:28.393671662Z caller=loop.go:108 component=sync-loop err="loading last-synced resources: reading the repository checkout: cloning repo: git repo not ready: git repo has not been cloned yet"
ts=2020-07-19T05:13:28.393713822Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
ts=2020-07-19T05:13:28.393724148Z caller=images.go:27 component=sync-loop msg="no automated workloads"
ts=2020-07-19T05:13:28.463752003Z caller=aws.go:151 component=aws info="detected cluster region" source="EC2 metadata service" region=us-east-2
ts=2020-07-19T05:13:28.463788108Z caller=aws.go:117 component=aws info="restricting ECR registry scans" regions=[us-east-2] include-ids=[] exclude-ids="[602401143452 918309763551]"
ts=2020-07-19T05:13:28.713510395Z caller=checkpoint.go:24 component=checkpoint msg="up to date" latest=1.20.0
ts=2020-07-19T05:13:30.560451503Z caller=warming.go:198 component=warmer info="refreshing image" image=docker.io/fluxcd/flux-prerelease tag_count=415 to_update=415 of_which_refresh=0 of_which_missing=415
ts=2020-07-19T05:13:35.61563336Z caller=loop.go:134 component=sync-loop event=refreshed url=ssh://git@github.com/dataplex/gitops-istio branch=master HEAD=4bed6299a21fc9f474343758f2c8bcf929813103
ts=2020-07-19T05:13:35.658483698Z caller=loop.go:108 component=sync-loop err="loading last-synced resources: loading resources from repo: unable to read root path \"/tmp/flux-working722538269/flux_root\": stat /tmp/flux-working722538269/flux_root: no such file or directory"
ts=2020-07-19T05:13:36.299077816Z caller=warming.go:206 component=warmer updated=docker.io/fluxcd/flux-prerelease successful=415 attempted=415
ts=2020-07-19T05:13:36.299189942Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
ts=2020-07-19T05:13:36.575650737Z caller=images.go:27 component=sync-loop msg="no automated workloads"
ts=2020-07-19T05:14:36.048166347Z caller=loop.go:108 component=sync-loop err="loading last-synced resources: loading resources from repo: unable to read root path \"/tmp/flux-working227094871/flux_root\": stat /tmp/flux-working227094871/flux_root: no such file or directory"
ts=2020-07-19T05:14:36.049232335Z caller=loop.go:134 component=sync-loop event=refreshed url=ssh://git@github.com/dataplex/gitops-istio branch=master HEAD=4bed6299a21fc9f474343758f2c8bcf929813103
ts=2020-07-19T05:14:36.575809494Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
ts=2020-07-19T05:14:36.825617767Z caller=images.go:27 component=sync-loop msg="no automated workloads"
In the last commit to my repo, I moved files from the root directory into a subdirectory because fluxcd was scanning other YAML files I have in there.
commit 4bed6299a21fc9f474343758f2c8bcf929813103 (HEAD -> master, origin/master, origin/HEAD)
Author: Benjamin Floyd <benjamin.floyd@cyberark.com>
Date: Sat Jul 18 23:32:15 2020 -0500
Big change coming with this...moving flux monitoring to a flux_root directory
diff --git a/cert-manager/cert-manager.crds.yaml b/flux_root/cert-manager/cert-manager.crds.yaml
similarity index 100%
rename from cert-manager/cert-manager.crds.yaml
rename to flux_root/cert-manager/cert-manager.crds.yaml
diff --git a/flagger/flagger-crds.yaml b/flux_root/flagger/flagger-crds.yaml
similarity index 100%
rename from flagger/flagger-crds.yaml
rename to flux_root/flagger/flagger-crds.yaml
diff --git a/flagger/flagger-grafana.yaml b/flux_root/flagger/flagger-grafana.yaml
similarity index 100%
rename from flagger/flagger-grafana.yaml
rename to flux_root/flagger/flagger-grafana.yaml
diff --git a/flagger/flagger.yaml b/flux_root/flagger/flagger.yaml
similarity index 100%
I am also getting the same issue on a Rancher RKE cluster. The directories holding my Kubernetes manifest files are `[repo]/gitops/namespaces` and `[repo]/gitops/releases/dev`.
helm upgrade -i flux fluxcd/flux --wait --namespace fluxcd --set git.url=git@github.com:$github_username/$github_repo --set git.branch=dev-gitops --set git.path="/gitops/namespaces,/gitops/releases/dev"
After running the above command I am getting the same error as above.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x176c94f]

goroutine 110 [running]:
github.com/fluxcd/flux/pkg/cluster/kubernetes/resource.Load.func1(0xc0003da7b0, 0x2c, 0x0, 0x0, 0x20f5120, 0xc001907a70, 0x0, 0xc001906150)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/cluster/kubernetes/resource/load.go:32 +0x6f
path/filepath.Walk(0xc0003da7b0, 0x2c, 0xc0003eb6b0, 0x0, 0x0)
    /usr/local/go/src/path/filepath/path.go:402 +0x6a
github.com/fluxcd/flux/pkg/cluster/kubernetes/resource.Load(0xc003a5c860, 0x1a, 0xc00010c5c0, 0x1, 0x1, 0xc00010c500, 0xc000275230, 0xc003a5c860, 0xc0003eb790)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/cluster/kubernetes/resource/load.go:31 +0x19b
github.com/fluxcd/flux/pkg/cluster/kubernetes.(*manifests).LoadManifests(0xc000275230, 0xc003a5c860, 0x1a, 0xc00010c5c0, 0x1, 0x1, 0x42e0aa, 0x0, 0xc001880120)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/cluster/kubernetes/manifests.go:122 +0x67
github.com/fluxcd/flux/pkg/manifests.(*rawFiles).GetAllResourcesByID(0xc0018ce1c0, 0x213dda0, 0xc000042080, 0xc001a28990, 0x28, 0x21257a0)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/manifests/rawfiles.go:96 +0x60
github.com/fluxcd/flux/pkg/daemon.(*Daemon).getLastResources(0xc0002458c0, 0x213dda0, 0xc000042080, 0x21256a0, 0xc000044080, 0x0, 0x0, 0x0)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/daemon/sync.go:140 +0x176
github.com/fluxcd/flux/pkg/daemon.(*Daemon).Sync(0xc0002458c0, 0x213dda0, 0xc000042080, 0x34604f6b, 0xed6a77432, 0x0, 0xc001a28810, 0x28, 0x21256a0, 0xc000044080, ...)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/daemon/sync.go:49 +0x82
github.com/fluxcd/flux/pkg/daemon.(*Daemon).Loop(0xc0002458c0, 0xc000092240, 0xc0004e7370, 0x20f2f20, 0xc0004cd470)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/daemon/loop.go:103 +0x525
created by main.main
    /home/circleci/go/src/github.com/fluxcd/flux/cmd/fluxd/main.go:777 +0x5990
I am getting the same error. After an hour of looking at it, I found that if I have more than one item in `--git-path=`, it throws this error. Having only one item there lets it work. I'm on EKS and GKE as well. All errors are the same as previously mentioned. I'm using `1.20.0`, if that helps.
I have a main repo called `flux`, and inside of that I have a folder called `clusters`. Inside of `clusters` is a folder called `global` (where I want to place things that should be installed in every cluster), and then one folder per cluster name, where the workload/application YAML files will go.
flux
|__clusters
   |__global
   |__gke-cluster-1
   |__eks-cluster-1
   |__etc...
I was expecting the line `- --git-path=clusters/global,clusters/eks-cluster-1` to work. Previously I was able to use two separate `--git-path=` lines, one for each folder, but that no longer works either.
If you have any questions, I'm happy to provide more details for reproduction.
I already tried with `1.20.0` and am having the same issue.
I was getting the error even for a single path on GKE with `1.20.0`; with no path set, it seems to work. Feeling bad, because we are trying to go to production with this version.
I've been struggling with this. I noticed that this issue doesn't happen with 1.19.0.
We noticed the same issue. Supplying a `--git-path` leads fluxd to crash each time it tries to sync in 1.20.0. In 1.19.0 it works as expected.
I can reproduce the panic in the original report, with flux 1.20.0, by supplying a path that doesn't exist in the repository.
However, the latter is not ideal either, since it will stop a sync proceeding. This is detailed in bug #3184 which is separate but interacts with this one: on startup, flux will try to construct the state of the last sync by looking at the repository at the high water mark (sync tag); if a path is missing, it'll either panic (due to this bug) or log an error and fail to sync (in the pre-release build).
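Since the startup step looks at the repository as of the sync tag, one way to see whether a configured path already existed at that revision is to ask git whether `<sync-tag>:<path>` resolves to an object. This is a minimal Python sketch, assuming a local clone that has the sync tag fetched; the function names are hypothetical, and `flux-sync` is only the tag name that appears in the log earlier in this thread.

```python
import subprocess


def tree_check_command(revision, path):
    """Build a git command that succeeds only if path exists at revision."""
    # "git cat-file -e <object>" exits non-zero when the object is missing;
    # "<revision>:<path>" names the tree or blob at that path.
    return ["git", "cat-file", "-e", f"{revision}:{path.strip('/')}"]


def path_exists_at(repo_dir, revision, path):
    """Run the check inside repo_dir (assumed to be a local clone)."""
    result = subprocess.run(
        tree_check_command(revision, path),
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, `path_exists_at(".", "flux-sync", "flux_root")` would be false in a clone where the directory was only added after the tagged revision.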
@dataplex I think you're seeing #3184 -- not a panic, but no syncing either -- does that seem right to you?
@camorra-skk Try using paths without the leading slash, e.g., `gitops/namespaces` -- you may just be hitting the "missing path -> panic" problem of the original report
@matthewbrahms I think you're getting the combo-bug, with the new path not being in the old revision, triggering a panic.
@gautamr @marratj Possibly you're seeing one of the above situations too. Same recommendation goes: try the pre-release, and check that the git-paths exist in the repo.
Thanks for the reports everyone. I think the fix will rest largely with mitigating #3184 so I'm going to concentrate there to start with.
The problem of a missing path causing a panic is fixed in #3193. See also #3223 for a related fix.
On AWS EKS, I've just tried to install the flux and helm operator. The flux pod starts then it dies immediately with the error: panic: runtime error: invalid memory address or nil pointer dereference
To Reproduce
Steps to reproduce the behaviour:
helm upgrade -i flux \
  --set image.pullSecret=regcred \
  --set registry.automationInterval=1m \
  --set git.pollInterval=1m \
  --set git.url=git@gitlab.com:[repo].git \
  --set git.branch=develop \
  --set git.path="fluxcd/helm\,fluxcd/namespaces\,fluxcd/releases/dev" \
  --set git.label=[project]-dev \
  --set prometheus.enabled=true \
  --namespace fluxcd \
  fluxcd/flux
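A note on the `\,` sequences in the `git.path` value: Helm's `--set` treats an unescaped comma as the separator between key=value pairs, so commas that belong to the value are escaped with a backslash. This is a minimal Python sketch of building such an argument; the helper name is hypothetical, and the sketch assumes the result is passed to `helm` as a single argument.

```python
def helm_set_list(key, items):
    """Build a --set argument whose value is a comma-separated list."""
    # Helm splits --set input on unescaped commas, so each comma that is
    # part of the value is written as "\,".
    return key + "=" + "\\,".join(items)


arg = helm_set_list(
    "git.path",
    ["fluxcd/helm", "fluxcd/namespaces", "fluxcd/releases/dev"],
)
print(arg)
```

When the argument is typed in a shell instead, wrapping the value in double quotes, as in the command above, keeps the backslashes intact.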
Expected behaviour
Flux and HelmOperator are running and listening on GitLab changes
Logs
`bpint@mac-01-52:~/Developer/bs/k8s-cluster$ kc logs -f flux-5f78d75468-jbt94
Flag --git-verify-signatures has been deprecated, changed to --git-verify-signatures-mode, use that instead
ts=2020-07-16T10:05:33.818586156Z caller=main.go:259 version=1.20.0
ts=2020-07-16T10:05:33.818634392Z caller=main.go:412 msg="using kube config: \"/root/.kube/config\" to connect to the cluster"
ts=2020-07-16T10:05:33.843227855Z caller=main.go:492 component=cluster identity=/etc/fluxd/ssh/identity
ts=2020-07-16T10:05:33.843282699Z caller=main.go:493 component=cluster identity.pub="ssh-rsa [ssh key - didn't want to share, so, removed from the log]"
ts=2020-07-16T10:05:33.843314734Z caller=main.go:498 host=https://10.100.0.1:443 version=kubernetes-v1.16.8-eks-fd1ea7
ts=2020-07-16T10:05:33.843369625Z caller=main.go:510 kubectl=/usr/local/bin/kubectl
ts=2020-07-16T10:05:33.844074742Z caller=main.go:527 ping=true
ts=2020-07-16T10:05:33.847280738Z caller=main.go:666 url=ssh://git@gitlab.com/bimspot/k8s-cluster.git user="Weave Flux" email=support@weave.works signing-key= verify-signatures-mode=none sync-tag=bimspot-dev state=git readonly=false registry-disable-scanning=false notes-ref=bimspot-dev set-author=false git-secret=false sops=false
ts=2020-07-16T10:05:33.84783731Z caller=main.go:772 upstream="no upstream URL given"
ts=2020-07-16T10:05:33.848180111Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
ts=2020-07-16T10:05:33.848217612Z caller=images.go:27 component=sync-loop msg="no automated workloads"
ts=2020-07-16T10:05:33.848422907Z caller=loop.go:108 component=sync-loop err="loading last-synced resources: git repo not ready: git repo has not been cloned yet"
ts=2020-07-16T10:05:33.849280735Z caller=main.go:795 addr=:3030
ts=2020-07-16T10:05:34.330334888Z caller=checkpoint.go:24 component=checkpoint msg="up to date" latest=1.20.0
ts=2020-07-16T10:05:46.507824549Z caller=loop.go:134 component=sync-loop event=refreshed url=ssh://git@gitlab.com/bimspot/k8s-cluster.git branch=develop HEAD=12d643d4debaeabefab20f956e4a66232ce76b94
ts=2020-07-16T10:05:46.514036608Z caller=sync.go:60 component=daemon info="trying to sync git changes to the cluster" old= new=12d643d4debaeabefab20f956e4a66232ce76b94
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x176c94f]

goroutine 82 [running]:
github.com/fluxcd/flux/pkg/cluster/kubernetes/resource.Load.func1(0xc0000b8750, 0x26, 0x0, 0x0, 0x20f5120, 0xc0006ef0b0, 0x0, 0xc0006e6870)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/cluster/kubernetes/resource/load.go:32 +0x6f
path/filepath.Walk(0xc0000b8750, 0x26, 0xc00072f618, 0x0, 0x0)
    /usr/local/go/src/path/filepath/path.go:402 +0x6a
github.com/fluxcd/flux/pkg/cluster/kubernetes/resource.Load(0xc001d542c0, 0x1a, 0xc0006e6780, 0x3, 0x3, 0xc001d54500, 0x20, 0x20, 0x40c1d6)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/cluster/kubernetes/resource/load.go:31 +0x19b
github.com/fluxcd/flux/pkg/cluster/kubernetes.(*manifests).LoadManifests(0xc000648930, 0xc001d542c0, 0x1a, 0xc0006e6780, 0x3, 0x3, 0xc0000b8840, 0x0, 0x20)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/cluster/kubernetes/manifests.go:122 +0x67
github.com/fluxcd/flux/pkg/manifests.(*rawFiles).GetAllResourcesByID(0xc00197e540, 0x213dda0, 0xc0000bc058, 0x3990981b40c935b4, 0x39d263b326feea65, 0xc00072f760)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/manifests/rawfiles.go:96 +0x60
github.com/fluxcd/flux/pkg/daemon.doSync(0x213dda0, 0xc0000bc058, 0x21257a0, 0xc00197e540, 0x2158440, 0xc000190400, 0xc0000b8840, 0x2b, 0x20f2f20, 0xc00009fc80, ...)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/daemon/sync.go:221 +0x66
github.com/fluxcd/flux/pkg/daemon.(*Daemon).Sync(0xc000361e60, 0x213dda0, 0xc0000bc058, 0x1e45a32e, 0xed6a21d7a, 0x0, 0xc000500b70, 0x28, 0x21256a0, 0xc0002cc4c0, ...)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/daemon/sync.go:71 +0x404
github.com/fluxcd/flux/pkg/daemon.(*Daemon).Loop(0xc000361e60, 0xc00009a2a0, 0xc00034ca00, 0x20f2f20, 0xc00009fec0)
    /home/circleci/go/src/github.com/fluxcd/flux/pkg/daemon/loop.go:103 +0x525
created by main.main
    /home/circleci/go/src/github.com/fluxcd/flux/cmd/fluxd/main.go:777 +0x5990`
Additional context