Closed arjunsbabu closed 5 years ago
Please see line 12. I got this error while checking why Flux hadn't applied changes made 30 minutes earlier (sync interval is 5 minutes and git poll interval is 1m). After 10 minutes the changes were deployed automatically, i.e. it was some temporary issue. Line 12 gives some clue, I guess.
ts=2019-09-23T14:06:00.619549213Z caller=sync.go:148 info="cluster resource not in resources to be synced; deleting" dry-run=false resource=bankstage:rolebinding/rb1
ts=2019-09-23T14:06:00.619580711Z caller=sync.go:148 info="cluster resource not in resources to be synced; deleting" dry-run=false resource=<cluster>:project/bankstage
ts=2019-09-23T14:06:00.619610646Z caller=sync.go:153 warning="resource to be synced has not been updated; skipping" dry-run=false resource=eda-demo:resourcequota/eda-demo-quota
ts=2019-09-23T14:06:00.619633347Z caller=sync.go:148 info="cluster resource not in resources to be synced; deleting" dry-run=false resource=bank:limitrange/bank-limits
ts=2019-09-23T14:06:00.619648944Z caller=sync.go:148 info="cluster resource not in resources to be synced; deleting" dry-run=false resource=bank:rolebinding/rb1
ts=2019-09-23T14:06:00.619704562Z caller=sync.go:479 method=Sync cmd=delete args= count=20
ts=2019-09-23T14:06:00.634591957Z caller=sync.go:545 method=Sync cmd="kubectl delete -f -" took=14.84856ms err="running kubectl: " output=
ts=2019-09-23T14:06:06.460748151Z caller=loop.go:206 component=sync-loop tag=flux-sync old=d2b2ddf066fcef90b3b9be21bf4d59e756401a54 new=5f086fefbc583298e81b7f88c047cc562a88a4df
W0923 14:09:02.337791 8 reflector.go:289] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: watch of *v1beta1.CustomResourceDefinition ended with: The resourceVersion for the provided watch is too old.
ts=2019-09-23T14:13:21.380818655Z caller=loop.go:85 component=sync-loop err="git repo not ready: git fetch --tags origin []: running git command: git [fetch --tags origin]: context deadline exceeded"
ts=2019-09-23T14:13:21.38091123Z caller=loop.go:104 component=sync-loop url=git@github.mygitrepo:openshif-ha/cloud-dev-projects.git err="git repo not ready: git fetch --tags origin []: running git command: git [fetch --tags origin]: context deadline exceeded"
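Note: the entries above are in logfmt (key=value) form, so the failing fetch can be picked out mechanically rather than by eye. The following is a minimal Python sketch, not part of Flux; the helper names parse_logfmt and fetch_timeouts are hypothetical, and the two sample lines are copied from the log above.

```python
import re

# One key=value pair; the value is either a double-quoted string or a bare token.
PAIR = re.compile(r'([\w.\-]+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_logfmt(line):
    """Split one logfmt line into a dict of its fields (hypothetical helper)."""
    fields = {}
    for key, value in PAIR.findall(line):
        # Drop the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def fetch_timeouts(lines):
    """Return parsed entries whose err field reports an exceeded deadline."""
    parsed = (parse_logfmt(line) for line in lines)
    return [f for f in parsed if "context deadline exceeded" in f.get("err", "")]


# Two lines copied verbatim from the log above.
SAMPLE = [
    'ts=2019-09-23T14:06:00.634591957Z caller=sync.go:545 method=Sync '
    'cmd="kubectl delete -f -" took=14.84856ms err="running kubectl: " output=',
    'ts=2019-09-23T14:13:21.380818655Z caller=loop.go:85 component=sync-loop '
    'err="git repo not ready: git fetch --tags origin []: running git command: '
    'git [fetch --tags origin]: context deadline exceeded"',
]

for entry in fetch_timeouts(SAMPLE):
    print(entry["ts"], entry["caller"], entry["err"])
```

Only the loop.go:85 entry matches; the kubectl entry is parsed too, but its err value carries no detail and its output field is empty.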
Looks like a connectivity issue between the cluster and your git server. You can increase the git fetch timeout using the --git-timeout flag (by default it is set to 20s).
@stefanprodan I will try that. But is that the reason for the errors in the kubectl apply and kubectl delete commands? kubectl delete and kubectl apply only run after the changes have been fetched, right?
@arjunsbabu with the little information available it is very hard for us to tell what is going wrong at the moment, except that something is not working for two of our users (on totally different Kubernetes setups, which makes it even more complicated). Please try to modify the flag if the timeout issue persists, and see if that resolves the issue.
If this resolves the issue (both for git and for kubectl), it should be possible for us to replicate the issue to see what goes wrong inside Flux, and fix it so it does not happen for others.
@stefanprodan @hiddeco I increased the timeout to 60s. Still the same issue. I can see the flux pod getting restarted for some reason. I will share the complete log from the start.
arjun@DESKTOP-QO0863U:~/workspace/git/ibmclouddev/projects$ oc logs flux-5cccf65fdd-twbtp -f
ts=2019-09-23T19:41:24.482305523Z caller=main.go:225 version=1.13.3
ts=2019-09-23T19:41:24.48238605Z caller=main.go:317 msg="using in cluster config to connect to the cluster"
ts=2019-09-23T19:41:24.538881242Z caller=main.go:396 component=cluster identity=/etc/fluxd/ssh/identity
ts=2019-09-23T19:41:24.538929397Z caller=main.go:397 component=cluster identity.pub="ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAUg7kBvHJp+5lo9J2WsGuu9fjFK4RunN5lmHc4qFMWssiLVjhIJpuAXs1Iczo+k955sT5lG0dZbYLVLNJGHdEllYIKHxVi9+n+k+hQq8AlSGP9WzS27UD997s/3CbZyy9mMKOyN9ylCET0feaDK64wZYRpfQ3K44LO8/5tzTGBzP5aDXsl1BnV7EQmZkZ1sp1DSkyXX2krz5fRrCo3uYjJnaKz2uashEBW1IYisEI8keqeW069xc3U0OHhHE2G/bjMHI9utwB1SpoAtI7fYJTYHsNKIF+VCOdl1eDjhY7cXP2G1zg4jwNyP9ZduZDKat"
ts=2019-09-23T19:41:24.538955096Z caller=main.go:402 host=https://172.30.0.1:443 version=kubernetes-v1.11.0+d4cacc0
ts=2019-09-23T19:41:24.53900911Z caller=main.go:414 kubectl=/usr/local/bin/kubectl
ts=2019-09-23T19:41:24.540113031Z caller=main.go:426 ping=true
ts=2019-09-23T19:41:24.542748737Z caller=main.go:562 url=git@github.mygit.com:openshift-deployha/cloud-dev-projects.git user="Weave Flux" email=support@weave.works signing-key= verify-signatures=false sync-tag=flux-sync notes-ref=flux set-author=false
ts=2019-09-23T19:41:24.542810482Z caller=main.go:623 upstream="no upstream URL given"
ts=2019-09-23T19:41:24.543493231Z caller=main.go:652 metrics-addr=:3031
ts=2019-09-23T19:41:24.543767363Z caller=images.go:17 component=sync-loop msg="polling images"
ts=2019-09-23T19:41:24.543802174Z caller=images.go:27 component=sync-loop msg="no automated workloads"
ts=2019-09-23T19:41:24.543873618Z caller=loop.go:85 component=sync-loop err="git repo not ready: git repo has not been cloned yet"
ts=2019-09-23T19:41:24.544954325Z caller=main.go:644 addr=:3030
ts=2019-09-23T19:41:25.561218747Z caller=checkpoint.go:21 component=checkpoint msg="update available" latest=1.14.2 URL=https://github.com/weaveworks/flux/releases/tag/1.14.2
ts=2019-09-23T19:41:30.67791829Z caller=loop.go:111 component=sync-loop event=refreshed url=git@github.ibm.com:openshift-deployment-ha/ibmcloud-dev-projects.git branch=master HEAD=9802f8fc415a20763ffe6478a8bfa5972b5c784e
ts=2019-09-23T19:44:24.767610833Z caller=sync.go:479 method=Sync cmd=apply args= count=208
ts=2019-09-23T19:44:25.137983552Z caller=sync.go:545 method=Sync cmd="kubectl apply -f -" took=370.128585ms err="running kubectl: " output=
ts=2019-09-23T19:44:25.444583128Z caller=sync.go:545 method=Sync cmd="kubectl apply -f -" took=306.52765ms err="running kubectl: " output=
@arjunsbabu do you have the logs from the previous pod just before it was killed?
No @hiddeco. I can see restarts happening for the flux pod. I am not sure why 17 restarts have happened. The situation seems to be critical now. The change I pushed last night was deployed only 2 hours ago. I will try to get the log from just before the pod is killed.
arjun@DESKTOP-QO0863U:~/workspace/git/ibmclouddev$ oc get po
NAME READY STATUS RESTARTS AGE
flux-7d89bb67b4-jsrkj 1/1 Running 17 4h
jenkins-2-7gpng 1/1 Running 0 26d
memcached-56c9fccf5d-fjc94 1/1 Running 0 31d
I think the pod is being killed with OOMKilled.
arjun@DESKTOP-QO0863U:/mnt/c/Windows/System32$ oc get po -w
NAME READY STATUS RESTARTS AGE
flux-7d89bb67b4-x9nwc 1/1 Running 8 1h
jenkins-2-7gpng 1/1 Running 0 26d
memcached-56c9fccf5d-fjc94 1/1 Running 0 32d
flux-7d89bb67b4-x9nwc 0/1 OOMKilled 8 1h
flux-7d89bb67b4-x9nwc 0/1 CrashLoopBackOff 8 1h
flux-7d89bb67b4-x9nwc 1/1 Running 9 1h
arjun@DESKTOP-QO0863U:/mnt/c/Windows/System32$
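Note: the OOMKilled transition in the watch output above can also be extracted with a few lines of Python, which helps when the watch runs for hours. This is a rough sketch under the assumption that every pod row has exactly the five columns shown above (NAME, READY, STATUS, RESTARTS, AGE); the function name oom_kills is hypothetical and the sample text is copied from the output above.

```python
def oom_kills(watch_output):
    """Return (pod name, restart count) for every row whose STATUS is OOMKilled."""
    events = []
    for line in watch_output.splitlines():
        parts = line.split()
        # Skip the header and anything that is not a five-column pod row.
        if len(parts) != 5 or parts[0] == "NAME":
            continue
        name, _ready, status, restarts, _age = parts
        if status == "OOMKilled":
            events.append((name, int(restarts)))
    return events


# Rows copied from the watch output above.
WATCH_OUTPUT = """\
NAME READY STATUS RESTARTS AGE
flux-7d89bb67b4-x9nwc 1/1 Running 8 1h
jenkins-2-7gpng 1/1 Running 0 26d
memcached-56c9fccf5d-fjc94 1/1 Running 0 32d
flux-7d89bb67b4-x9nwc 0/1 OOMKilled 8 1h
flux-7d89bb67b4-x9nwc 0/1 CrashLoopBackOff 8 1h
flux-7d89bb67b4-x9nwc 1/1 Running 9 1h
"""

for pod, restarts in oom_kills(WATCH_OUTPUT):
    print(f"{pod} was OOMKilled (restart count at that point: {restarts})")
```

Rows in any other status, including CrashLoopBackOff, are ignored, so only the single flux row is reported.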
I also have the logs:
arjun@DESKTOP-QO0863U:~/workspace/git$ oc logs flux-7d89bb67b4-x9nwc -f
ts=2019-09-24T11:07:35.451333425Z caller=main.go:225 version=1.13.3
ts=2019-09-24T11:07:35.45141518Z caller=main.go:317 msg="using in cluster config to connect to the cluster"
ts=2019-09-24T11:07:35.479078098Z caller=main.go:396 component=cluster identity=/etc/fluxd/ssh/identity
ts=2019-09-24T11:07:35.479188815Z caller=main.go:397 component=cluster identity.pub="ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCkUg7kBvHJp+5lo9J2WsGuu9fjFK4RunN5lmHc4qFMWssiLVjhIJpuAXsRm2ALVLNJGHdEllYIKHxVi9+n+k+hQq8AlSGP9WzS27UD997s/3CbZyy9mMKOyN9ylCET0feaDK64wZYRpfQ3K44LO8/5tzTGBzP5aDXsl1BnV7EQmZkZ1sp1DSkyXX2krz5fRrCo3uYjJnaKz2uashEBW1IYisEI8keqeW069xc3U0OHhHE2G/bjMHI9utwB1SpoAtI7fYJTYHsNKIF+VCOdl1eDjhY7cXP2G1zg4jwNyP9ZduZDKat"
ts=2019-09-24T11:07:35.479214452Z caller=main.go:402 host=https://172.30.0.1:443 version=kubernetes-v1.11.0+d4cacc0
ts=2019-09-24T11:07:35.479275523Z caller=main.go:414 kubectl=/usr/local/bin/kubectl
ts=2019-09-24T11:07:35.480315047Z caller=main.go:426 ping=true
ts=2019-09-24T11:07:35.482722878Z caller=main.go:562 url=git@github.mygit.com:openshift-deployment-ha/ibmcloud-dev-projects.git user="Weave Flux" email=support@weave.works signing-key= verify-signatures=false sync-tag=flux-sync notes-ref=flux set-author=false
ts=2019-09-24T11:07:35.482775875Z caller=main.go:623 upstream="no upstream URL given"
ts=2019-09-24T11:07:35.483379079Z caller=loop.go:85 component=sync-loop err="git repo not ready: git repo has not been cloned yet"
ts=2019-09-24T11:07:35.483423567Z caller=images.go:17 component=sync-loop msg="polling images"
ts=2019-09-24T11:07:35.483438424Z caller=images.go:27 component=sync-loop msg="no automated workloads"
ts=2019-09-24T11:07:35.48350288Z caller=main.go:652 metrics-addr=:3031
ts=2019-09-24T11:07:35.484502608Z caller=main.go:644 addr=:3030
ts=2019-09-24T11:07:36.298528162Z caller=checkpoint.go:21 component=checkpoint msg="update available" latest=1.14.2 URL=https://github.com/weaveworks/flux/releases/tag/1.14.2
ts=2019-09-24T11:07:41.844801669Z caller=loop.go:111 component=sync-loop event=refreshed url=git@github.mygit.com:openshift-deployment-ha/dev-projects.git branch=master HEAD=728f1b2453dbfc480457da7671ba8933bec67602
arjun@DESKTOP-QO0863U:~/workspace/git/clouddev$
These are the resources given to flux. Let me increase them and check:
Limits:
  cpu: 500m
  memory: 500Mi
Requests:
  cpu: 50m
  memory: 64Mi
After increasing the requests and limits of the flux deployment I am no longer facing the issue. I think that solved it @hiddeco @stefanprodan
@arjunsbabu awesome! Given I have been a bit late with my reply and you haven't posted since, I will assume this was indeed the problem and close the issue.
If you are experiencing new issues, or the problem arises again, do not hesitate to either re-open it or open a new one. :tulip:
Describe the bug: I have Flux installed and it had been running for many days without any issues until last Friday. Since then I have noticed that Flux has restarted almost 113 times and new YAMLs are not applied properly. A restart of Flux solved the issue temporarily, but the errors still appear when applying the YAMLs.
To Reproduce Steps to reproduce the behaviour:
Expected behavior: No error messages, and Flux should apply all YAMLs automatically.
Logs
Additional context
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-08-01T23:56:00Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
We are not using Flux for any images. We are only using it for project onboarding (project, rolebinding, quota, limitrange).