[x] I searched existing issues before opening this one
Expected behavior
The RHEL kernel OOM-kills a docker container only when the container's memory usage actually reaches the memory limit set by kubernetes and docker.
Actual behavior
A container running a single JVM process is being OOM-killed when I do not expect it to be. The JVM's Xmx is 2GB, and its RSS is no more than 2.2GB as reported by ps on the host and by kubernetes/heapster. The kubernetes/docker memory limit is set to 3GB to leave room for the JVM's native memory usage above the Xmx.
The workingSetBytes and usageBytes metrics, as well as docker stats, start at values matching the RSS, but they grow over time until they hit the 3GB limit and the container is OOM-killed. The RSS remains at 2.2GB throughout.
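For reference, this is roughly how I am comparing the two sets of numbers. It is only a sketch: the cgroup path and the pgrep pattern for the KairosDB JVM are placeholders.

```sh
# RSS of the JVM as the host sees it (kB)
ps -o pid,rss,cmd -p "$(pgrep -f kairosdb)"

# What the memory cgroup is charging the container (bytes); the path is a placeholder
CG=/sys/fs/cgroup/memory/kubepods/burstable/<pod-uid>/<container-id>
cat "$CG/memory.usage_in_bytes"
grep -E '^(total_rss|total_cache|total_active_file|total_inactive_file) ' "$CG/memory.stat"

# The same figure as reported by docker
docker stats --no-stream --format '{{.Name}}: {{.MemUsage}}'
```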
Steps to reproduce the behavior
The process is a KairosDB JVM. Additional details may be recorded in the forum post https://groups.google.com/forum/#!topic/kairosdb-group/aVm8M7yYXIA
KairosDB does minimal disk activity, as it uses Cassandra as its datastore. However, there is the occasional write to /tmp/kairos_cache/, and those files are deleted shortly after being written. What I am seeing is that usageBytes grows whenever these writes happen. I am monitoring KairosDB's I/O activity with iotop; it is only doing a few MB of writes every 5 minutes.
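To check whether the growth really tracks those writes, the cgroup's page-cache counter can be watched next to the size of the cache directory with something like this (a sketch; the cgroup path is a placeholder):

```sh
# Print the cgroup page-cache counter next to the size of /tmp/kairos_cache
# every 10 seconds; the cgroup path is a placeholder.
CG=/sys/fs/cgroup/memory/kubepods/burstable/<pod-uid>/<container-id>
while true; do
  printf '%s cache=%s kairos_cache=%s\n' \
    "$(date -u +%FT%TZ)" \
    "$(awk '$1 == "total_cache" {print $2}' "$CG/memory.stat")" \
    "$(du -sb /tmp/kairos_cache 2>/dev/null | cut -f1)"
  sleep 10
done
```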
I've started logging docker stats, ps, and the cgroup memory.stat each minute. Here are some statistics from near the start of a test run (once everything is up and initialized) and from several hours later in the run. For ease of reading I've merged the two results into one output:
I probably should also have been logging /sys/fs/cgroup/memory/kubepods/burstable/pod2e255ab3-624c-11e9-81f6-06c1f8d31d89/dd630103a0803f7934a14f3d3dc92c64b5442de6bd1aba23371bbc50affe3c1e/memory.usage_in_bytes, but it appears to grow in step with the docker stats {{.MemUsage}} value.
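The per-minute logging can be as simple as a loop along these lines (a sketch; the container name, pgrep pattern, cgroup path, and log file are placeholders):

```sh
# Log docker stats, the JVM RSS, and the cgroup memory counters once a minute.
# Placeholders: <container-name>, the pgrep pattern, the cgroup path, the log file.
CG=/sys/fs/cgroup/memory/kubepods/burstable/<pod-uid>/<container-id>
while true; do
  {
    date -u +%FT%TZ
    docker stats --no-stream --format '{{.Name}} {{.MemUsage}} {{.MemPerc}}' <container-name>
    ps -o pid,rss,vsz,cmd -p "$(pgrep -f kairosdb)"
    cat "$CG/memory.usage_in_bytes" "$CG/memory.stat"
  } >> /var/log/kairos-mem.log
  sleep 60
done
```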
It is possible this is either a duplicate of, or closely related to, https://github.com/docker/for-linux/issues/651, since the memory only grows when KairosDB does disk writes to /tmp/kairos_cache. I am trying to get a better understanding of which memory usage is causing the OOM so I can know for sure. The total_active_file and cache values seem nowhere near large enough to account for the growth. Even if it were the file cache, the kernel should know to reclaim it when the space is needed.
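To pin down which counter is actually approaching the limit, a breakdown along these lines helps (a sketch for cgroup v1; the container path is a placeholder, and the working-set line reflects my understanding that kubelet reports usage minus total_inactive_file):

```sh
# Break the cgroup memory charge down into rss, cache, and everything else.
# The cgroup path is a placeholder.
CG=/sys/fs/cgroup/memory/kubepods/burstable/<pod-uid>/<container-id>

usage=$(cat "$CG/memory.usage_in_bytes")
limit=$(cat "$CG/memory.limit_in_bytes")
rss=$(awk '$1 == "total_rss" {print $2}' "$CG/memory.stat")
cache=$(awk '$1 == "total_cache" {print $2}' "$CG/memory.stat")
inactive=$(awk '$1 == "total_inactive_file" {print $2}' "$CG/memory.stat")

echo "usage:       $usage"
echo "limit:       $limit"
echo "rss:         $rss"
echo "cache:       $cache"
echo "working set: $((usage - inactive))    # roughly what kubelet reports as workingSetBytes"
echo "unexplained: $((usage - rss - cache)) # charge not accounted for by rss + cache"
```

If the unexplained number is what keeps growing, then it is not ordinary page cache that is pushing the container toward the 3GB limit.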
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.)