Closed: rbruggem closed this issue 6 years ago.
I have bumped our sock shop cluster to the latest flux (1.2.2) and all the other latest weave agents. The mega yaml apply which was previously broken (due to #916) now works. Alas, flux is still using about 60% of a core. According to Explore, about half of that is kubectl'ing. It is kubectl applying every ~15s - is that expected?
It is kubectl applying every ~15s - is that expected?
No; left alone, it should run every few minutes. It will be triggered by fluxctl release ..., fluxctl automate ... (and the weave cloud equivalent buttons), and by new commits turning up in git (per --git-poll-interval).
So: either the tick has been tuned down radically, or there are constantly new images and the automation interval has been tuned down radically (or there's a bug).
Is there an indication in the logs of what triggers the sync? I'd expect some messages about jobs running if it's those or automation.
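To make the expected behaviour above concrete, here is a minimal sketch of a sync loop driven by a slow periodic tick plus explicit triggers (a release, automation, a git poll finding new commits). All names are illustrative; this is not flux's actual code, just the shape being described.

```go
package main

import (
	"fmt"
	"time"
)

// syncLoop runs a sync on every periodic tick, and additionally whenever an
// explicit trigger arrives. Left alone (no triggers), it should only fire at
// the tick interval; frequent syncs mean frequent triggers or a fast tick.
func syncLoop(tick <-chan time.Time, trigger <-chan string, stop <-chan struct{}, synced chan<- string) {
	for {
		select {
		case <-stop:
			close(synced)
			return
		case reason := <-trigger:
			synced <- "sync: " + reason
		case <-tick:
			synced <- "sync: periodic tick"
		}
	}
}

func main() {
	trigger := make(chan string)
	stop := make(chan struct{})
	synced := make(chan string)
	// A nil tick channel never fires, so the only syncs here are the
	// explicit triggers we send.
	go syncLoop(nil, trigger, stop, synced)

	trigger <- "fluxctl release"
	fmt.Println(<-synced)
	trigger <- "git poll found new commits"
	fmt.Println(<-synced)
	close(stop)
}
```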
Attaching logs provided by a flux user who experienced this issue flux.log
(bear in mind we may be looking at two distinct problems here, in the two instances we are looking at)
The sock-shop fluxd has no special tuning. I can see no obvious trigger in the logs for the kubectl apply runs, but TBH I don't really know what I am looking for:
ts=2018-02-02T10:57:47.49427888Z caller=images.go:17 component=sync-loop msg="polling images"
ts=2018-02-02T10:57:47.95983943Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/carts-db container=carts-db currentimage=mongo repo=mongo pattern=*
ts=2018-02-02T10:57:47.960130666Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/user container=user currentimage=weaveworksdemos/user:0.4.7 repo=weaveworksdemos/user pattern=*
ts=2018-02-02T10:57:47.960243493Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/shipping container=shipping currentimage=weaveworksdemos/shipping:master-ba362a4c repo=weaveworksdemos/shipping pattern=*
ts=2018-02-02T10:57:47.960450487Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/queue-master container=queue-master currentimage=weaveworksdemos/queue-master:master-b8bcfc79 repo=weaveworksdemos/queue-master pattern=*
ts=2018-02-02T10:57:47.96061811Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/user-db container=user-db currentimage=weaveworksdemos/user-db:0.4.7 repo=weaveworksdemos/user-db pattern=*
ts=2018-02-02T10:57:47.960773579Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/orders-db container=orders-db currentimage=mongo repo=mongo pattern=*
ts=2018-02-02T10:57:47.960956102Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/payment container=payment currentimage=weaveworksdemos/payment:master-b64b8af1 repo=weaveworksdemos/payment pattern=*
ts=2018-02-02T10:57:47.96112886Z caller=images.go:57 component=sync-loop service=sock-shop:deployment/orders container=orders currentimage=weaveworksdemos/orders:master-f8988a3d repo=weaveworksdemos/orders pattern=*
ts=2018-02-02T10:57:51.518949283Z caller=release.go:61 component=platform method=Sync cmd=apply args="--namespace default" count=1
ts=2018-02-02T10:57:52.064705257Z caller=release.go:100 component=platform method=Sync cmd="kubectl --namespace default apply -f -" took=545.657746ms err=null output="namespace \"sock-shop\" configured"
ts=2018-02-02T10:57:52.066493091Z caller=release.go:61 component=platform method=Sync cmd=apply args= count=29
ts=2018-02-02T10:57:56.555391705Z caller=release.go:100 component=platform method=Sync cmd="kubectl apply -f -" took=4.488656238s err=null output="deployment \"carts-db\" configured\nservice \"carts-db\" configured\ndeployment \"catalogue\" configured\ndeployment \"load-test\" configured\ndeployment \"orders\" configured\ndeployment \"user-db\" configured\nservice \"orders\" configured\nservice \"session-db\" configured\ndeployment \"shipping\" configured\ndeployment \"orders-db\" configured\ndeployment \"payment\" configured\ndeployment \"rabbitmq\" configured\nservice \"user\" configured\ndeployment \"front-end\" configured\ndeployment \"session-db\" configured\nservice \"carts\" configured\nservice \"payment\" configured\nservice \"queue-master\" configured\ndeployment \"carts\" configured\nservice \"front-end\" configured\ndeployment \"queue-master\" configured\nservice \"user-db\" configured\ndeployment \"user\" configured\ndeployment \"catalogue-db\" configured\nservice \"catalogue-db\" configured\nservice \"catalogue\" configured\nservice \"orders-db\" configured\nservice \"rabbitmq\" configured\nservice \"shipping\" configured"
ts=2018-02-02T10:58:06.158436936Z caller=release.go:61 component=platform method=Sync cmd=apply args="--namespace default" count=1
ts=2018-02-02T10:58:06.839208202Z caller=release.go:100 component=platform method=Sync cmd="kubectl --namespace default apply -f -" took=680.582364ms err=null output="namespace \"sock-shop\" configured"
ts=2018-02-02T10:58:06.839294152Z caller=release.go:61 component=platform method=Sync cmd=apply args= count=29
ts=2018-02-02T10:58:13.05913582Z caller=release.go:100 component=platform method=Sync cmd="kubectl apply -f -" took=6.217109974s err=null output="service \"user-db\" configured\ndeployment \"user\" configured\nservice \"orders\" configured\nservice \"rabbitmq\" configured\nservice \"front-end\" configured\ndeployment \"payment\" configured\nservice \"orders-db\" configured\nservice \"session-db\" configured\ndeployment \"catalogue-db\" configured\ndeployment \"catalogue\" configured\ndeployment \"orders-db\" configured\ndeployment \"queue-master\" configured\ndeployment \"user-db\" configured\nservice \"user\" configured\nservice \"carts-db\" configured\nservice \"catalogue-db\" configured\nservice \"payment\" configured\ndeployment \"session-db\" configured\ndeployment \"shipping\" configured\ndeployment \"orders\" configured\nservice \"queue-master\" configured\ndeployment \"rabbitmq\" configured\nservice \"catalogue\" configured\ndeployment \"front-end\" configured\ndeployment \"load-test\" configured\nservice \"shipping\" configured\ndeployment \"carts-db\" configured\ndeployment \"carts\" configured\nservice \"carts\" configured"
ts=2018-02-02T10:58:24.762192817Z caller=release.go:61 component=platform method=Sync cmd=apply args="--namespace default" count=1
ts=2018-02-02T10:58:26.201710224Z caller=release.go:100 component=platform method=Sync cmd="kubectl --namespace default apply -f -" took=1.439309753s err=null output="namespace \"sock-shop\" configured"
ts=2018-02-02T10:58:26.202375657Z caller=release.go:61 component=platform method=Sync cmd=apply args= count=29
ts=2018-02-02T10:58:32.198947907Z caller=release.go:100 component=platform method=Sync cmd="kubectl apply -f -" took=5.99636982s err=null output="deployment \"shipping\" configured\nservice \"shipping\" configured\ndeployment \"user-db\" configured\ndeployment \"carts-db\" configured\nservice \"catalogue-db\" configured\nservice \"session-db\" configured\ndeployment \"front-end\" configured\nservice \"orders\" configured\ndeployment \"queue-master\" configured\ndeployment \"session-db\" configured\nservice \"carts\" configured\ndeployment \"catalogue\" configured\nservice \"catalogue\" configured\ndeployment \"orders-db\" configured\nservice \"carts-db\" configured\nservice \"rabbitmq\" configured\ndeployment \"user\" configured\nservice \"front-end\" configured\nservice \"orders-db\" configured\nservice \"queue-master\" configured\nservice \"user-db\" configured\ndeployment \"carts\" configured\ndeployment \"load-test\" configured\ndeployment \"payment\" configured\ndeployment \"catalogue-db\" configured\ndeployment \"orders\" configured\nservice \"payment\" configured\ndeployment \"rabbitmq\" configured\nservice \"user\" configured"
Right, no messages about running jobs. It does look an awful lot like it's just applying the namespace, applying everything else, then goto 10.
A naive attempt to reproduce this locally fails: I created a fresh minikube and pointed flux at the flux-example repo, and it behaves as expected. So there's something else going on.
The log attached above also shows the sync running every five minutes, when left to its own devices. Perhaps the CPU use there has a different cause.
I could reproduce this on GKE with the following repo: https://github.com/stefanprodan/flux-demo. The repo has just two deployments; the caddy one is automated while podinfo is not.
I'm using an n1-standard-2 node, and after deploying Flux the average CPU usage increased by ~50%. I didn't notice kubectl running more than once per minute, and the CPU usage of kubectl is under 1%, while fluxd uses around 50% most of the time.
(Screenshots: GCP CPU usage, Scope, Cortex)
This is what Flux logs every minute:
ts=2018-02-03T08:36:21.588319201Z caller=warming.go:190 component=warmer fetching=nats-streaming total=10 expired=0 missing=10
ts=2018-02-03T08:36:21.732550794Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:21.737490165Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:21.851009807Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:21.864899317Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:22.019054839Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:22.03614893Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:22.047498709Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:22.076802372Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:22.090084019Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:22.107090559Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:36:22.107171959Z caller=warming.go:238 component=warmer updated=nats-streaming count=0
ts=2018-02-03T08:36:22.107275327Z caller=images.go:17 component=sync-loop msg="polling images"
ts=2018-02-03T08:36:22.130578863Z caller=images.go:57 component=sync-loop service=default:deployment/caddy container=caddy currentimage=stefanprodan/caddy:0.10.10 repo=stefanprodan/caddy pattern=*
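The repeated warmer error is about multi-arch images: a repository can publish a manifest list, and the client has to pick the entry matching its own platform (linux amd64 here). The "missing=10" on every pass suggests the failed lookups are never cached, so the same fetches are retried each minute. A simplified sketch of that selection (illustrative types, not flux's or docker/distribution's actual code):

```go
package main

import "fmt"

// Simplified stand-ins for the Docker manifest-list schema.
type platform struct {
	OS, Arch string
}

type descriptor struct {
	Digest   string
	Platform platform
}

// pickManifest selects the manifest-list entry matching the client's
// platform. If no entry matches, the lookup fails; a cache that does not
// record the failure will retry it on every warming pass, as in the logs.
func pickManifest(list []descriptor, want platform) (string, error) {
	for _, d := range list {
		if d.Platform == want {
			return d.Digest, nil
		}
	}
	return "", fmt.Errorf("no suitable manifest (%s %s) in manifestlist", want.OS, want.Arch)
}

func main() {
	// A list with no linux/amd64 entry reproduces the error shape above.
	list := []descriptor{
		{Digest: "sha256:aaa", Platform: platform{"linux", "arm64"}},
		{Digest: "sha256:bbb", Platform: platform{"windows", "amd64"}},
	}
	if _, err := pickManifest(list, platform{"linux", "amd64"}); err != nil {
		fmt.Println(err)
	}
}
```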
If I remove the automated deployment the CPU usage stays the same:
ts=2018-02-03T08:47:21.690835107Z caller=warming.go:190 component=warmer fetching=nats-streaming total=10 expired=0 missing=10
ts=2018-02-03T08:47:21.81497541Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.078176684Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.131129014Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.131578471Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.132031944Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.135878971Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.138002694Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.143104445Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.174781168Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.298959979Z caller=warming.go:215 component=warmer canonical_name=index.docker.io/library/nats-streaming auth={map[]} err="requesting manifests: no suitable manifest (linux amd64) in manifestlist"
ts=2018-02-03T08:47:22.299053202Z caller=warming.go:238 component=warmer updated=nats-streaming count=0
ts=2018-02-03T08:47:22.299196829Z caller=images.go:17 component=sync-loop msg="polling images"
ts=2018-02-03T08:47:22.303798986Z caller=images.go:28 component=sync-loop msg="no automated services"
pprof top:
Showing nodes accounting for 24850ms, 87.69% of 28340ms total
Dropped 167 nodes (cum <= 141.70ms)
Showing top 10 nodes out of 17
flat flat% sum% cum cum%
1390ms 4.90% 4.90% 28140ms 99.29% github.com/weaveworks/flux/registry/cache.(*Warmer).Loop /Users/aleph/go/src/github.com/weaveworks/flux/registry/cache/warming.go
10050ms 35.46% 40.37% 21050ms 74.28% runtime.selectgo /usr/local/Cellar/go/1.9.2/libexec/src/runtime/select.go
1680ms 5.93% 46.29% 4870ms 17.18% runtime.sellock /usr/local/Cellar/go/1.9.2/libexec/src/runtime/select.go
1400ms 4.94% 51.24% 4200ms 14.82% runtime.selunlock /usr/local/Cellar/go/1.9.2/libexec/src/runtime/select.go
3200ms 11.29% 62.53% 3200ms 11.29% runtime.lock /usr/local/Cellar/go/1.9.2/libexec/src/runtime/lock_futex.go
2800ms 9.88% 72.41% 2800ms 9.88% runtime.unlock /usr/local/Cellar/go/1.9.2/libexec/src/runtime/lock_futex.go
2320ms 8.19% 80.59% 2790ms 9.84% runtime.selectrecv /usr/local/Cellar/go/1.9.2/libexec/src/runtime/select.go
780ms 2.75% 83.35% 980ms 3.46% runtime.selectdefault /usr/local/Cellar/go/1.9.2/libexec/src/runtime/select.go
420ms 1.48% 84.83% 920ms 3.25% runtime.selectnbrecv /usr/local/Cellar/go/1.9.2/libexec/src/runtime/chan.go
810ms 2.86% 87.69% 810ms 2.86% runtime.(*hchan).sortkey /usr/local/Cellar/go/1.9.2/libexec/src/runtime/select.go (inline)
Good sleuthing @stefanprodan!
(For the sake of completeness, I should record that the sock shop was running kubectl constantly because it had an experimental GitHub push hook configured.)
Noticing multiple kubectl apply runs every second. This occurs with v1.2.0 and v1.2.1.