kubernetes / dashboard

General-purpose web UI for Kubernetes clusters

Dashboard OOM: Reading all config maps into memory at once #3052

Closed jgiles closed 6 years ago

jgiles commented 6 years ago
Environment
Dashboard version: 1.8.3
Kubernetes version: 1.10.2
Operating system: Debian 9 (Stretch)

We install Dashboard using https://github.com/kubernetes/charts/tree/master/stable/kubernetes-dashboard.

Steps to reproduce
  1. Install Dashboard with some memory limit (the default limit for the stable/kubernetes-dashboard Helm chart is a reasonable 50 MiB); an example install command is sketched after this list.
  2. Use Helm to manage a large number of releases in the cluster.
  3. Open the dashboard.
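
For step 1, an install along these lines reproduces the chart's default limit (a sketch only; Helm 2 syntax, and the resources values key is assumed to be the chart's standard one, so adjust for your Helm and chart versions):

$ # Illustrative Helm 2 command; flags and values depend on your versions
$ helm install stable/kubernetes-dashboard \
    --name kubernetes-dashboard \
    --namespace kube-system \
    --set resources.limits.memory=50Mi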
Observed result

Dashboard container is killed for memory limit violation.

Dashboard logs:

2018/05/24 22:29:23 Starting overwatch
2018/05/24 22:29:23 Using in-cluster config to connect to apiserver
2018/05/24 22:29:23 Using service account token for csrf signing
2018/05/24 22:29:23 No request provided. Skipping authorization
2018/05/24 22:29:23 Successful initial request to the apiserver, version: v1.10.2
2018/05/24 22:29:23 Generating JWE encryption key
2018/05/24 22:29:23 New synchronizer has been registered: kubernetes-dashboard-key-holder-kube-system. Starting
2018/05/24 22:29:23 Starting secret synchronizer for kubernetes-dashboard-key-holder in namespace kube-system
2018/05/24 22:29:27 Initializing JWE encryption key from synchronized object
2018/05/24 22:29:27 Creating in-cluster Heapster client
2018/05/24 22:29:28 Successful request to heapster
2018/05/24 22:29:28 Auto-generating certificates
2018/05/24 22:29:28 Successfully created certificates
2018/05/24 22:29:28 Serving securely on HTTPS port: 8443
2018/05/24 22:32:29 [2018-05-24T22:32:29Z] Incoming HTTP/2.0 GET /api/v1/csrftoken/login request from 100.96.2.0:45018: {}
2018/05/24 22:32:29 [2018-05-24T22:32:29Z] Outcoming response to 100.96.2.0:45018 with 200 status code
2018/05/24 22:32:29 [2018-05-24T22:32:29Z] Incoming HTTP/2.0 POST /api/v1/login request from 100.96.2.0:45018: {
  "kubeConfig": "",
  "password": "<snip>",
  "token": "",
  "username": "admin"
}
2018/05/24 22:32:29 [2018-05-24T22:32:29Z] Outcoming response to 100.96.2.0:45018 with 200 status code
2018/05/24 22:32:30 [2018-05-24T22:32:30Z] Incoming HTTP/2.0 GET /api/v1/login/status request from 100.96.2.0:45018: {}
2018/05/24 22:32:30 [2018-05-24T22:32:30Z] Outcoming response to 100.96.2.0:45018 with 200 status code
2018/05/24 22:32:30 [2018-05-24T22:32:30Z] Incoming HTTP/2.0 GET /api/v1/csrftoken/token request from 100.96.2.0:45018: {}
2018/05/24 22:32:30 [2018-05-24T22:32:30Z] Outcoming response to 100.96.2.0:45018 with 200 status code
2018/05/24 22:32:30 [2018-05-24T22:32:30Z] Incoming HTTP/2.0 POST /api/v1/token/refresh request from 100.96.2.0:45018: {
  "jweToken": "{\"protected\":\"<snip>\",\"aad\":\"<snip>\",\"encrypted_key\":\"<snip>\",\"iv\":\"<snip>\",\"ciphertext\":\"<snip>\",\"tag\":\"Xe95c7a8_Hm8iG5TaCxihA\"}"
}
2018/05/24 22:32:30 [2018-05-24T22:32:30Z] Outcoming response to 100.96.2.0:45018 with 200 status code
2018/05/24 22:32:30 [2018-05-24T22:32:30Z] Incoming HTTP/2.0 GET /api/v1/overview?filterBy=&itemsPerPage=10&name=&page=1&sortBy=d,creationTimestamp request from 100.96.2.0:45018: {}
2018/05/24 22:32:30 Getting config category

(logs end there, the container exits)

Relevant journalctl logs from the node:

May 24 22:06:29 ip-10-145-2-198 kernel: dashboard invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=-998
May 24 22:06:29 ip-10-145-2-198 kernel: dashboard cpuset=102ce99ee98b9666c9f922e6ad8fb2632fd03f3f6935903758857bdcfcaf203d mems_allowed=0
May 24 22:06:29 ip-10-145-2-198 kernel: CPU: 0 PID: 9863 Comm: dashboard Not tainted 4.9.0-6-amd64 #1 Debian 4.9.88-1+deb9u1
May 24 22:06:29 ip-10-145-2-198 kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
May 24 22:06:29 ip-10-145-2-198 kernel:  0000000000000000 ffffffff9fb2f774 ffffa1fec53f7dd8 ffff929736cb2000
May 24 22:06:29 ip-10-145-2-198 kernel:  ffffffff9fa03020 0000000000000000 00000000fffffc1a ffff929787cb2000
May 24 22:06:29 ip-10-145-2-198 kernel:  0000000000000000 ffffffffa06f16b0 ffffffff9f9fb90c 0000000000000001
May 24 22:06:29 ip-10-145-2-198 kernel: Call Trace:
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9fb2f774>] ? dump_stack+0x5c/0x78
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9fa03020>] ? dump_header+0x78/0x1fd
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f9fb90c>] ? mem_cgroup_scan_tasks+0xcc/0x100
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f9849ba>] ? oom_kill_process+0x21a/0x3e0
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f984e51>] ? out_of_memory+0x111/0x470
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f9f6b09>] ? mem_cgroup_out_of_memory+0x49/0x80
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f9fc2c5>] ? mem_cgroup_oom_synchronize+0x325/0x340
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f9f7690>] ? mem_cgroup_css_reset+0xd0/0xd0
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f9851df>] ? pagefault_out_of_memory+0x2f/0x80
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9f8611bd>] ? __do_page_fault+0x4bd/0x4f0
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9fe0dbc2>] ? schedule+0x32/0x80
May 24 22:06:29 ip-10-145-2-198 kernel:  [<ffffffff9fe13818>] ? page_fault+0x28/0x30
May 24 22:06:29 ip-10-145-2-198 kernel: Task in /kubepods/pod2688c444-5f96-11e8-980b-022bcd8b57c6/102ce99ee98b9666c9f922e6ad8fb2632fd03f3f6935903758857bdcfcaf203d killed as a result of limit of /kubepods/pod2688c444-5f96-11e8-980b-022bcd8
May 24 22:06:29 ip-10-145-2-198 kernel: memory: usage 51200kB, limit 51200kB, failcnt 97
May 24 22:06:29 ip-10-145-2-198 kernel: memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
May 24 22:06:29 ip-10-145-2-198 kernel: kmem: usage 548kB, limit 9007199254740988kB, failcnt 0
May 24 22:06:29 ip-10-145-2-198 kernel: Memory cgroup stats for /kubepods/pod2688c444-5f96-11e8-980b-022bcd8b57c6: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB
May 24 22:06:29 ip-10-145-2-198 kernel: Memory cgroup stats for /kubepods/pod2688c444-5f96-11e8-980b-022bcd8b57c6/c6cd9f8cf6f95032bd3e1b1220978819dfbcadd97cc03d3432075054f611c022: cache:0KB rss:36KB rss_huge:0KB mapped_file:0KB dirty:0KB
May 24 22:06:29 ip-10-145-2-198 kernel: Memory cgroup stats for /kubepods/pod2688c444-5f96-11e8-980b-022bcd8b57c6/b3b2783aa3726d4629e9914ce090255e116ae0e1d9c6b37a5a52231691bd8b39: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB dirty:0KB w
May 24 22:06:29 ip-10-145-2-198 kernel: Memory cgroup stats for /kubepods/pod2688c444-5f96-11e8-980b-022bcd8b57c6/102ce99ee98b9666c9f922e6ad8fb2632fd03f3f6935903758857bdcfcaf203d: cache:0KB rss:50616KB rss_huge:0KB mapped_file:0KB dirty:0
May 24 22:06:29 ip-10-145-2-198 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
May 24 22:06:29 ip-10-145-2-198 kernel: [ 7465]     0  7465      256        1       4       2        0          -998 pause
May 24 22:06:29 ip-10-145-2-198 kernel: [ 9853]     0  9853    22136    17049      42       5        0          -998 dashboard
May 24 22:06:29 ip-10-145-2-198 kernel: Memory cgroup out of memory: Kill process 9853 (dashboard) score 400 or sacrifice child
May 24 22:06:29 ip-10-145-2-198 kernel: Killed process 9853 (dashboard) total-vm:88544kB, anon-rss:49940kB, file-rss:18256kB, shmem-rss:0kB
May 24 22:06:29 ip-10-145-2-198 kernel: oom_reaper: reaped process 9853 (dashboard), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Expected result

No crash.

Comments

Our obvious short-term solution is to give Dashboard more memory. However, it looks like Dashboard might be reading the contents of many (all?) config maps into memory at once, based on tracing the code from the last log statement above:

https://github.com/kubernetes/dashboard/blob/2bedc9eb4be9cf6c92125cb37638dff032d7dabf/src/app/backend/resource/config/config.go#L43
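
For context, the pattern that code path implies boils down to a single unpaginated List call. Here is a minimal, illustrative client-go sketch (not the Dashboard's actual code, and the List signature varies slightly between client-go releases):

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// In-cluster config, as the Dashboard backend uses when running as a pod.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// One unpaginated List across all namespaces pulls every ConfigMap,
	// including its full Data payload, into this process's memory at once.
	cms, err := client.CoreV1().ConfigMaps(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("loaded %d config maps in one response\n", len(cms.Items))
}

With roughly 2,600 ConfigMaps totalling about 17 MB of serialized JSON (counted below), the decoded objects plus response buffers could plausibly exceed a 50 MiB limit on their own.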

This seems likely to interact very poorly with Helm, since Helm creates config maps to store data about each release. For example, running against the cluster in question:

$ kubectl get configmaps --all-namespaces | wc -l
    2599
$ kubectl get configmaps --all-namespaces --output json | wc -c
 17052730

(we've only been operating in this cluster for a few weeks, so the problem is likely to get steadily worse)
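
If most of these are Helm 2 release records, they should be visible under Tiller's labels. Something like the following (assuming the default Tiller setup, which stores releases as ConfigMaps labelled OWNER=TILLER in kube-system) would show how much of the volume Helm accounts for:

$ kubectl get configmaps --namespace kube-system -l OWNER=TILLER | wc -l
$ kubectl get configmaps --namespace kube-system -l OWNER=TILLER --output json | wc -c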

We will also take steps to limit the number of artifacts Helm leaves around, but it seems like Dashboard shouldn't be trying to pull them all down at once.
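
On the Helm side, capping Tiller's stored release history keeps the ConfigMap count bounded; in Helm 2 that is roughly the following (assuming a Helm 2 version that supports the flag):

$ # Limits retained release revisions per release, so old release ConfigMaps are pruned
$ helm init --upgrade --history-max 20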

cheld commented 6 years ago

In general, 50 MB is not enough, because the API server does not provide paging. The Dashboard backend therefore always reads all of the data and does the paging on the client side. Typically, 300 MB is configured:

https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dashboard/dashboard-controller.yaml#L38
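
For the Helm chart install described above, the equivalent bump would look roughly like this (illustrative only; assumes the chart's standard resources values block):

$ helm upgrade kubernetes-dashboard stable/kubernetes-dashboard \
    --set resources.requests.memory=100Mi \
    --set resources.limits.memory=300Mi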

floreks commented 6 years ago

For a large cluster, it is best to increase the memory limits for Dashboard. The API server does not support pagination, so we have to load all resources into memory.