philwinder closed this issue 8 years ago
The executor is being shut down when it reaches its cgroup memory limit. Basically, Kibana runs out of memory. This can be seen in the output of dmesg -T:
[2016-02-02 07:09:26] Task in /system.slice/docker-50c016035226708bbb3e86c24f3e5e5a7f7a0b9448ca16b74b7e62781045e91a.scope killed as a result of limit of /system.slice/docker-50c016035226708bbb3e86c24f3e5e5a7f7a0b9448ca16b74b7e62781045e91a.scope
[2016-02-02 07:09:26] memory: usage 1048576kB, limit 1048576kB, failcnt 78
[2016-02-02 07:09:26] memory+swap: usage 1048576kB, limit 9007199254740991kB, failcnt 0
[2016-02-02 07:09:26] kmem: usage 0kB, limit 9007199254740991kB, failcnt 0
[2016-02-02 07:09:26] Memory cgroup stats for /system.slice/docker-50c016035226708bbb3e86c24f3e5e5a7f7a0b9448ca16b74b7e62781045e91a.scope: cache:0KB rss:1048576KB rss_huge:38912KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:1048572KB inactive_file:0KB active_file:0KB unevictable:0KB
[2016-02-02 07:09:26] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[2016-02-02 07:09:26] [ 1946] 999 1946 491442 264248 1039 0 0 node
[2016-02-02 07:09:26] Memory cgroup out of memory: Kill process 1975 (node) score 1011 or sacrifice child
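To make the numbers in the log concrete, here is a minimal sketch that parses the `memory:` line and checks whether usage has hit the limit. The function name `parse_memcg_line` is made up for illustration, and the regular expression only assumes the line format shown in the dmesg output above.

```python
import re

# The "memory:" line copied from the dmesg output above.
LINE = "[2016-02-02 07:09:26] memory: usage 1048576kB, limit 1048576kB, failcnt 78"


def parse_memcg_line(line):
    """Extract usage, limit and failcnt from a memory cgroup OOM line.

    Returns None for lines that do not match, e.g. the "memory+swap:" line.
    """
    m = re.search(r"memory: usage (\d+)kB, limit (\d+)kB, failcnt (\d+)", line)
    if m is None:
        return None
    usage_kb, limit_kb, failcnt = (int(g) for g in m.groups())
    return {
        "usage_kb": usage_kb,
        "limit_kb": limit_kb,
        "limit_mb": limit_kb // 1024,
        "failcnt": failcnt,
        "at_limit": usage_kb >= limit_kb,
    }


info = parse_memcg_line(LINE)
print(info)
```

For the line above this reports a limit of 1024 MB with usage equal to the limit, which is consistent with the container being killed by the memory cgroup.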
Based on https://github.com/elastic/kibana/issues/5170#issuecomment-163042525 and https://github.com/elastic/kibana/pull/5451, code changes are required in the framework to set the NODE_OPTIONS environment variable for executors, and we need to move to version 4.4.
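As a rough sketch of what setting NODE_OPTIONS for an executor could look like: the helper below derives a V8 heap cap from the container memory limit and puts it into an environment mapping. The helper name `node_options_for` and the 25% headroom are assumptions for illustration, not what the framework actually does; `--max-old-space-size` is the V8 flag (in MB) that caps the old-generation heap, and this assumes the Kibana launcher passes NODE_OPTIONS through to node as the linked PR describes.

```python
import os


def node_options_for(container_mem_mb, headroom_fraction=0.25):
    """Build a NODE_OPTIONS value that keeps the V8 heap below the cgroup limit.

    headroom_fraction is a hypothetical safety margin for non-heap memory
    (buffers, native code, the node runtime itself).
    """
    heap_mb = int(container_mem_mb * (1 - headroom_fraction))
    return f"--max-old-space-size={heap_mb}"


# Environment for an executor running in a 1024 MB container
# (the limit seen in the dmesg output).
executor_env = dict(os.environ, NODE_OPTIONS=node_options_for(1024))
print(executor_env["NODE_OPTIONS"])
```

The point of the headroom is that the cgroup limit counts all of the process's memory, not just the V8 heap, so the heap cap needs to sit somewhat below the container limit.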
Currently being tested on the Alpha cluster.
It's fixed in 0.3.1.
The Kibana executors failed 6 times over the course of a weekend (and were restarted, yay Mesos!). All other services are running (i.e. ES hasn't shut down over the weekend).
Investigate the attached logs to find out why: stderr.txt, stdout.txt