kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

API Server memory/goroutines increase over time after v1.11 upgrade #67732

Closed · cespo · closed 5 years ago

cespo commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?: /kind bug /sig api-machinery

What happened: After the v1.11 upgrade, the API Server's memory usage started steadily increasing, eventually consuming all available resources. At the same time we observed the number of goroutines and open file descriptors going up.

Graphs: https://www.dropbox.com/sh/9fla8up2t70b80d/AAAnAmDydT5gF0OMz9rIfsvBa?dl=0

What you expected to happen: The API Server should release memory after use; this looks like a possible memory leak in the API Server process.

How to reproduce it:

Anything else we need to know?: The cluster has ~45k namespaces

Environment:
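
The growth described above also shows up in the apiserver's own Prometheus metrics (go_goroutines, process_open_fds, process_resident_memory_bytes). Below is a minimal Go sketch that polls those three gauges, assuming in-cluster service-account credentials and a placeholder apiserver address (adjust both for your environment):

    // pollmetrics.go: scrape the apiserver /metrics endpoint periodically and
    // print the goroutine, open-fd and RSS gauges so their growth can be graphed.
    package main

    import (
        "bufio"
        "crypto/tls"
        "fmt"
        "net/http"
        "os"
        "strings"
        "time"
    )

    func main() {
        // Placeholder address; point this at your apiserver's secure port.
        url := "https://10.0.0.1:6443/metrics"
        token, _ := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/token")

        client := &http.Client{
            // For brevity only; in a real setup verify against the cluster CA instead.
            Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
            Timeout:   10 * time.Second,
        }

        for {
            req, _ := http.NewRequest("GET", url, nil)
            req.Header.Set("Authorization", "Bearer "+strings.TrimSpace(string(token)))
            if resp, err := client.Do(req); err != nil {
                fmt.Println("scrape failed:", err)
            } else {
                sc := bufio.NewScanner(resp.Body)
                for sc.Scan() {
                    line := sc.Text()
                    if strings.HasPrefix(line, "go_goroutines") ||
                        strings.HasPrefix(line, "process_open_fds") ||
                        strings.HasPrefix(line, "process_resident_memory_bytes") {
                        fmt.Println(time.Now().Format(time.RFC3339), line)
                    }
                }
                resp.Body.Close()
            }
            time.Sleep(30 * time.Second)
        }
    }

Plotting those three values over time should reproduce the kind of curves shown in the graphs linked above.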

dims commented 6 years ago

@liggitt any thoughts?

liggitt commented 6 years ago

a few questions:

  • are you running any extension apiservers (like service catalog or metrics server)?
  • what version of etcd are you running?

liggitt commented 6 years ago

@kubernetes/sig-scalability-bugs have we seen anything like this in our scale tests?

cespo commented 6 years ago
  • are you running any extension apiservers (like service catalog or metrics server)?

Yes, we are running metrics-server

  • what version of etcd are you running?

etcdctl version: 3.3.8

We don't see the memory spikes on a different cluster with ~40 namespaces running the same config.

lavalamp commented 6 years ago

/assign @cheftako

tuminoid commented 6 years ago

I'm getting something similar to this on k8s 1.10.4 (etcd 3.2.22) with no pods running beyond monitoring (Prometheus, metrics-server, kube-state-metrics, influxdb, grafana, heapster).

Based on the previous comments, I removed metrics-server 0.3.0 and memory consumption dropped by about 100 MB, but memory usage keeps growing, just much more slowly.

(graph: memory usage after metrics-server was removed)

edit: wrong graph

salilgupta1 commented 6 years ago

We're load testing our cluster as well to figure out what levers to pull to support 750 nodes. However, we are seeing similar issues, with the API server swallowing memory along with high response latency and a high number of dropped requests ... essentially, it isn't scaling :(

Cluster config:

Graphs (screenshots):

  • Memory usage
  • CPU usage
  • Number of nodes, pods, containers (note: we actually had closer to 584 nodes at the peak)
  • 99th and 95th percentile latency
  • Goroutines
  • Open file descriptors

Our load can be bursty since we use this cluster to run client jobs. There can be a large influx of jobs at any given point, which leads to bursts of pods and instances being added to the cluster. From previous load tests we have seen that CPU usage increases steadily as the burst of nodes and pods is added, but then comes back down once the burst has been handled. However, we have never seen this issue with memory.

Any help would be appreciated!

gyuho commented 6 years ago

@tuminoid Can you share metrics output from etcd?

tuminoid commented 6 years ago

@gyuho etcd metrics: https://gist.github.com/tuminoid/12e5cc36d9d866379553cedd1326438f

I left this empty system running over the weekend (without metrics-server), and it actually seems to cap at around 3 GB. It's a single master/worker node for testing, with no load.

(graph: memory usage over the weekend)

On another cluster with 3 masters, without monitoring but with our application running, memory grew linearly to 10 GB over 72 hours, then dropped by 8 GB and started creeping up again. In both cases, the apiserver did not crash.

And for reference, apiserver invocation:

      command:
        - /usr/local/bin/kube-apiserver
        - --v=0
        - --logtostderr=false
        - --log-dir=/var/log/kubernetes/apiserver
        - --allow-privileged=true
        - --delete-collection-workers=4
        - --repair-malformed-updates=false
        - --apiserver-count=1
        - --request-timeout=1m
        - --event-ttl=8h
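        # note: --profiling=false (below) disables the /debug/pprof endpoints,
        # so heap and goroutine profiles cannot be pulled from this apiserver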
        - --profiling=false
        - --advertise-address=192.168.10.30
        - --bind-address=192.168.10.30
        - --secure-port=6443
        - --insecure-port=0
        - --service-cluster-ip-range=10.254.0.0/16
        - --storage-backend=etcd3
        - --etcd-servers=https://192.168.10.30:2379
        - --etcd-cafile=/etc/etcd/ssl/etcd-ca.pem
        - --etcd-certfile=/etc/etcd/ssl/etcd.pem
        - --etcd-keyfile=/etc/etcd/ssl/etcd-key.pem
        - --enable-admission-plugins=NamespaceLifecycle,LimitRanger,SecurityContextDeny,ServiceAccount,NodeRestriction,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds,AlwaysPullImages,DenyEscalatingExec
        - --disable-admission-plugins=PersistentVolumeLabel
        - --authorization-mode=RBAC,Node
        - --service-account-key-file=/etc/kubernetes/ssl/serviceaccount.pem
        - --service-account-lookup=true
        - --client-ca-file=/etc/kubernetes/ssl/ca.pem
        - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
        - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
        - --kubelet-https=true
        - --kubelet-certificate-authority=/etc/kubernetes/ssl/ca.pem
        - --kubelet-client-certificate=/etc/kubernetes/ssl/apiserver.pem
        - --kubelet-client-key=/etc/kubernetes/ssl/apiserver-key.pem
        - --kubelet-timeout=15s
        - --feature-gates=AdvancedAuditing=false
        - --audit-log-format=legacy
        - --audit-log-path=/var/log/kubernetes/audit/audit.log
        - --audit-log-maxage=30
        - --audit-log-maxbackup=10
        - --audit-log-maxsize=100
        - --requestheader-client-ca-file=/etc/kubernetes/ssl/ca.pem
        - --requestheader-allowed-names=aggregator
        - --requestheader-extra-headers-prefix=X-Remote-Extra-
        - --requestheader-group-headers=X-Remote-Group
        - --requestheader-username-headers=X-Remote-User

tuommaki commented 6 years ago

We hit this same issue. I've been investigating it for a while now, and it seems that it is coming from apimachinery.

The following shows process RSS memory usage for one of our apiservers over the last two days: (graph: k8s-apiserver-memory-rss)

The two on the left are upstream v1.11.1, the red spiky one is my experiment with a forced debug.FreeOSMemory() call once a minute, and the rightmost is v1.11.1 with this apimachinery patch.

We tried various v1.11.x releases earlier, but this problem appeared in all of them. With this apimachinery patch we see only a very gradual increase in memory usage, and so far I'm unable to clearly pinpoint where that is coming from. The difference from the stock release is still big, so it seems that this patch is crucial for v1.11.x.
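
For reference, the forced-release experiment mentioned above boils down to a loop like the following (a minimal sketch, not the exact change that was tested; startMemoryReleaser is just an illustrative name):

    package main

    import (
        "runtime/debug"
        "time"
    )

    // startMemoryReleaser periodically forces the Go runtime to hand freed heap
    // pages back to the operating system, mirroring the once-a-minute experiment.
    func startMemoryReleaser(interval time.Duration) {
        go func() {
            ticker := time.NewTicker(interval)
            defer ticker.Stop()
            for range ticker.C {
                // Runs a GC and then returns as much memory to the OS as possible.
                debug.FreeOSMemory()
            }
        }()
    }

    func main() {
        startMemoryReleaser(time.Minute)
        select {} // stand-in for the real server's serving loop
    }

This only forces memory to be returned to the OS; it does not remove whatever is holding on to it.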

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

cheftako commented 5 years ago

/remove-lifecycle stale

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

dims commented 5 years ago

we can close this right?

/close

k8s-ci-robot commented 5 years ago

@dims: Closing this issue.

In response to [this](https://github.com/kubernetes/kubernetes/issues/67732#issuecomment-483675657):

> we can close this right?
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.