cloudfoundry / diego-release

BOSH Release for Diego
Apache License 2.0

The memory is used very heavily in the cell VM #187

Closed bingosummer closed 8 years ago

bingosummer commented 8 years ago

In my deployment, when I push a static website app (https://github.com/bingosummer/2048), memory usage on the cell VM is very high and free memory is very low. Is this by design? Is the used memory (3381480 KiB) reserved by the cell? Thanks in advance.

Tasks: 137 total,   1 running, 136 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.2 sy,  0.0 ni, 98.9 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   3523108 total,  3381480 used,   141628 free,   189840 buffers
KiB Swap:  3526260 total,      260 used,  3526000 free.  2827988 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
    1 root      20   0   33464   3848   2608 S  0.0  0.1   0:06.80 init
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.01 kthreadd
    3 root      20   0       0      0      0 S  0.0  0.0   0:14.51 ksoftirqd/0
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    7 root      20   0       0      0      0 S  0.0  0.0   0:03.44 rcu_sched
    8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh
    9 root      20   0       0      0      0 S  0.0  0.0   0:02.90 rcuos/0
   10 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcuob/0
   11 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0
   12 root      rt   0       0      0      0 S  0.0  0.0   0:00.59 watchdog/0
   13 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 khelper
   14 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kdevtmpfs
   15 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 netns
   16 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 perf
   17 root      20   0       0      0      0 S  0.0  0.0   0:00.02 khungtaskd
   18 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 writeback
   19 root      25   5       0      0      0 S  0.0  0.0   0:00.00 ksmd
   20 root      39  19       0      0      0 S  0.0  0.0   0:00.19 khugepaged
   21 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 crypto
   22 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kintegrityd
   23 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 bioset
   24 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kblockd
   25 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 ata_sff
   26 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 md
   27 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 devfreq_wq
   31 root      20   0       0      0      0 S  0.0  0.0   0:00.11 kswapd0
   32 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 vmstat
   33 root      20   0       0      0      0 S  0.0  0.0   0:00.00 fsnotify_mark
   34 root      20   0       0      0      0 S  0.0  0.0   0:00.00 ecryptfs-kthrea
   46 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kthrotld
   47 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 acpi_thermal_pm
   48 root      20   0       0      0      0 S  0.0  0.0   0:00.00 scsi_eh_0
   49 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 scsi_tmf_0
   50 root      20   0       0      0      0 S  0.0  0.0   0:00.02 scsi_eh_1
   51 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 scsi_tmf_1
cf-gitbot commented 8 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/128635439

The labels on this github issue will be updated when the story is started.

emalm commented 8 years ago

Hi, @bingosummer,

That amount of memory usage sounds abnormal, but your top output doesn't show which processes are actually using that memory. Could you please sort the output by memory usage and report back?
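For anyone following along, here is a minimal sketch of one way to get a memory-sorted process list. Within top itself, pressing Shift+M sorts by %MEM. The Python below instead parses the output of `ps -eo pid,rss,comm` and sorts by resident set size; the helper names `top_by_rss` and `current_top_by_rss` are hypothetical and exist only for this illustration.

```python
import subprocess


def top_by_rss(ps_output, limit=10):
    """Parse `ps -eo pid,rss,comm` text and return the largest processes.

    Returns a list of (pid, rss_kib, command) tuples sorted by RSS, descending.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # ignore malformed or empty lines
        pid, rss, command = parts
        rows.append((int(pid), int(rss), command.strip()))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def current_top_by_rss(limit=10):
    """Run ps on the local machine (assumes a procps-style ps on Linux)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return top_by_rss(result.stdout, limit)
```

Calling `current_top_by_rss()` on the cell VM and printing the rows gives the same kind of listing as a memory-sorted top, with RSS reported in KiB.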

Also, what versions of CF, Diego, and Garden-Linux do you have deployed? Are you running any other app instances on this cell?

Thanks, Eric, CF Runtime Diego PM

emalm commented 8 years ago

Closing due to inactivity.

bingosummer commented 8 years ago

@ematpl Sorry for the late response. Could you please re-open this issue?

Versions:

I find that memory usage is very high even if I don't push an app. The following top output is from a fresh CF deployment with no apps running on it.

top - 01:54:02 up 17 min,  1 user,  load average: 0.56, 0.73, 0.41
Tasks: 136 total,   2 running, 134 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   3523108 total,  3389464 used,   133644 free,    61456 buffers
KiB Swap:  3526260 total,      232 used,  3526028 free.  2957668 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 7582 root      10 -10  200904  30460  11692 S  0.0  0.9   0:09.88 garden-linux
 1696 root      10 -10  363900  20684  10708 S  0.0  0.6   0:01.79 bosh-agent
 7523 vcap      10 -10   23780  17456  11904 S  0.0  0.5   0:00.63 consul
 8007 vcap      10 -10  145628  16808  11784 S  0.0  0.5   0:00.11 rep
 8122 vcap      10 -10  197132  16504  10420 S  0.0  0.5   0:00.11 metron
  878 root      10 -10  211184  16396   7216 S  0.0  0.5   0:01.19 python
 8901 root      20   0   97152   5912   4976 S  0.0  0.2   0:00.01 sshd
 1864 root      20   0   61380   5340   4676 S  0.0  0.2   0:00.00 sshd
    1 root      20   0   33352   3852   2592 S  0.0  0.1   0:02.72 init
 8914 bosh_5u+  20   0   19788   3828   3328 S  0.0  0.1   0:00.00 bash
  298 root      20   0   51404   3364   2824 S  0.0  0.1   0:00.10 systemd-udevd
 8913 bosh_5u+  20   0   97152   3256   2332 S  0.0  0.1   0:00.01 sshd
 1798 root      10 -10   10224   3176    880 S  0.0  0.1   0:00.00 dhclient
 8114 syslog    20   0  418096   3032   2424 S  0.0  0.1   0:00.00 rsyslogd
 8929 bosh_5u+  20   0   21580   3008   2516 R  0.0  0.1   0:00.00 top
 1965 root      10 -10   91480   2756   2384 S  0.0  0.1   0:00.47 monit
  542 statd     20   0   21544   2548   2088 S  0.0  0.1   0:00.00 rpc.statd
  524 root      20   0   23424   2320   2012 S  0.0  0.1   0:00.00 rpcbind
  749 root      20   0   23656   2300   2036 S  0.0  0.1   0:00.00 cron
 7503 vcap      10 -10   17972   2276   2012 S  0.0  0.1   0:00.00 agent_ctl
 7504 vcap      10 -10   17972   2276   2012 S  0.0  0.1   0:00.00 agent_ctl
 1162 root      20   0   14540   2136   1980 S  0.0  0.1   0:00.00 getty
  724 root      20   0   14540   2112   1960 S  0.0  0.1   0:00.00 getty
  728 root      20   0   14540   2112   1960 S  0.0  0.1   0:00.00 getty
  722 root      20   0   14540   2064   1916 S  0.0  0.1   0:00.00 getty
  729 root      20   0   14540   2056   1908 S  0.0  0.1   0:00.00 getty
  731 root      20   0   14540   2048   1892 S  0.0  0.1   0:00.00 getty
  530 root      20   0   15264   1868   1628 S  0.0  0.1   0:00.05 upstart-socket-
  367 root      20   0   15280   1844   1612 S  0.0  0.1   0:00.08 upstart-file-br
  819 root      16  -4   28444   1828   1588 S  0.0  0.1   0:00.00 auditd
 7496 vcap      10 -10    7552   1804   1680 S  0.0  0.1   0:00.00 awk
  821 root      12  -8   80264   1736   1552 S  0.0  0.0   0:00.01 audispd
 8036 vcap      10 -10   17988   1612   1340 S  0.0  0.0   0:00.00 rep_as_vcap
 8037 vcap      10 -10   17988   1612   1340 S  0.0  0.0   0:00.00 rep_as_vcap
 7517 vcap      10 -10    4344   1592   1488 S  0.0  0.0   0:00.00 logger
cf-gitbot commented 8 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/129814073

The labels on this github issue will be updated when the story is started.

emalm commented 8 years ago

Thanks for the additional info, @bingosummer. In the top output, I don't see that any process is consuming more than 1% of the memory on your system. Although you don't have a lot of explicitly free memory (133644 KiB), most of the used memory is allocated to buffers and cached pages (61456 KiB and 2957668 KiB, respectively). Subtracting those from the 3389464 KiB of used memory yields only 370340 KiB ≈ 362 MiB of actively used memory. Your swap usage is also minimal, indicating that the system isn't under much actual memory pressure.
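The subtraction above can be checked with a few lines of Python. This is only a worked version of the arithmetic using the figures from the top header; the function name `actively_used_kib` is made up for the example.

```python
KIB_PER_MIB = 1024


def actively_used_kib(used_kib, buffers_kib, cached_kib):
    """Memory in use by processes, excluding buffers and page cache."""
    return used_kib - buffers_kib - cached_kib


# Figures taken from the "KiB Mem" and "KiB Swap" lines of the top output.
used = actively_used_kib(used_kib=3389464, buffers_kib=61456, cached_kib=2957668)
used_mib = round(used / KIB_PER_MIB, 1)

print(used)      # 370340
print(used_mib)  # 361.7
```

So roughly 361.7 MiB of the 3.4 GiB reported as "used" is held by processes; the rest is cache the kernel can give back when applications need it.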

If you run free -m, you can see these computations in MiB. Here's an example from a Diego cell with about 25 mostly idle containers running on an m3.large AWS instance:

# free -m
             total       used       free     shared    buffers     cached
Mem:          7479       6511        967          1        260       4896
-/+ buffers/cache:       1354       6124
Swap:         7483         15       7468
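As a rough sketch of where the "-/+ buffers/cache" row comes from, the same numbers can be derived from /proc/meminfo. The code below assumes the MemTotal, MemFree, Buffers, and Cached fields and mirrors the older free layout shown above; newer versions of free account for memory somewhat differently, so treat this as an approximation. The function names are hypothetical.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of field name -> value in KiB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def buffers_cache_row(fields):
    """Approximate the "-/+ buffers/cache" row of the older `free` output."""
    reclaimable = fields["Buffers"] + fields["Cached"]
    used = fields["MemTotal"] - fields["MemFree"]
    return {
        "used": used - reclaimable,               # memory held by processes
        "free": fields["MemFree"] + reclaimable,  # free plus reclaimable cache
    }


# Example usage on a Linux host:
#     with open("/proc/meminfo") as f:
#         print(buffers_cache_row(parse_meminfo(f.read())))
```

Fed the values from the fresh deployment's top header (3523108 total, 133644 free, 61456 buffers, 2957668 cached), this gives 370340 KiB used and 3152768 KiB free.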

http://www.linuxatemyram.com/ is also an entertaining explanation of how Linux makes use of memory in ways that may appear confusing at first glance.

Based on the additional data, it appears that everything is fine with your system, so I'll close this out again.

Thanks again, Eric

bingosummer commented 8 years ago

Thanks for your explanation. Very helpful. @ematpl