weaveworks / scope

Monitoring, visualisation & management for Docker & Kubernetes
https://www.weave.works/oss/scope/
Apache License 2.0

scope memory usage reported for container doesn't agree with process within (nor output from docker stats) #3376

Open aleks-mariusz opened 5 years ago

aleks-mariusz commented 5 years ago

I have a lab cluster running Kubernetes v1.11.3 which works fine, but Scope seems to report far more memory usage for a few select containers than the processes within them use. I asked on Slack and was told Scope gets the info from docker stats; however, when I checked, docker stats also reports less memory for the container than what Scope shows.

(screenshot: scope-container-memory-bigger-its-than-process)

Notice that the es-data-N containers are shown using almost 4 GB of memory, whereas the processes within them are only using just over 2 GB.

The output of docker stats:

# docker stats --no-stream $(docker ps | awk '{if(NR>1) print $NF}')|grep es-data
CONTAINER                                                                                        CPU %               MEM USAGE / LIMIT       MEM %               NET I/O             BLOCK I/O           PIDS
k8s_es-data_es-data-0_monitoring_920b7e00-c641-11e8-b1ad-525400b8f61a_0                          1.44%               2.355 GiB / 15.51 GiB   15.18%              0 B / 0 B           8.19 kB / 31 GB     48
k8s_POD_es-data-0_monitoring_920b7e00-c641-11e8-b1ad-525400b8f61a_0                              0.00%               44 KiB / 15.51 GiB      0.00%               0 B / 0 B           0 B / 0 B           1

What you expected to happen?

Container memory usage should be in line with the processes it contains.

What happened?

The memory reported for the container looks almost twice what the process it contains actually uses.

How to reproduce it?

Run Elasticsearch with -Xms2048m -Xmx2048m.

Anything else we need to know?

Bare-metal Kubernetes 1.11.3, configured for HA using kubeadm, running Scope v1.9.1.

Versions:

$ scope version
Version 1.9.1 on weave-scope-app-8656595bb6-mq2fh
$ docker version
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: docker-1.13.1-68.gitdded712.el7.centos.x86_64
 Go version:      go1.9.4
 Git commit:      dded712/1.13.1
 Built:           Tue Jul 17 18:34:48 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: docker-1.13.1-68.gitdded712.el7.centos.x86_64
 Go version:      go1.9.4
 Git commit:      dded712/1.13.1
 Built:           Tue Jul 17 18:34:48 2018
 OS/Arch:         linux/amd64
 Experimental:    false
$ uname -a
Linux k8lab8bs 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:36:14Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

Logs:

$ kubectl logs <weave-scope-pod> -n <namespace> 
<app> INFO: 2018/10/08 16:03:01.187090 app starting, version 1.9.1, ID 47979c18f321316b
<app> INFO: 2018/10/08 16:03:01.187192 command line args: --mode=app
<app> INFO: 2018/10/08 16:03:01.192531 listening on :4040
<app> WARN: 2018/10/08 16:03:01.209638 Error updating weaveDNS, backing off 20s: Error running weave ps: exit status 1: "Link not found\n". If you are not running Weave Net, you may wish to suppress this warning by launching scope with the `--weave=false` option.
<app> WARN: 2018/10/08 16:03:21.224632 Error updating weaveDNS, backing off 40s: Error running weave ps: exit status 1: "Link not found\n". If you are not running Weave Net, you may wish to suppress this warning by launching scope with the `--weave=false` option.
aleks-mariusz commented 5 years ago

Any updates on this? There don't appear to have been any new releases to try (at least not since July 23rd, when 1.9.1 was released).

bboreham commented 5 years ago

Currently, Scope reports RSS for the process, and Docker's "Usage" stat for the container. I see that in https://github.com/docker/cli/pull/80 Docker's command-line tool changed to print a different stat: "Usage" minus "page cache".
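To make the difference concrete, here is a hedged sketch of how "usage minus cache" (what docker stats prints after docker/cli#80) can diverge from the raw usage figure. The cgroup file names are the cgroup v1 memory-controller ones (`memory.usage_in_bytes`, the `cache` field of `memory.stat`); the sample numbers below are illustrative, not taken from this cluster:

```shell
# On a host you would read these from the container's cgroup, e.g.
#   /sys/fs/cgroup/memory/<container-cgroup>/memory.usage_in_bytes
#   /sys/fs/cgroup/memory/<container-cgroup>/memory.stat
# Here we use a captured memory.stat excerpt with made-up values.
stat='cache 1879048192
rss 2415919104'

usage=4294967296                                   # raw usage_in_bytes (~4 GiB)
cache=$(printf '%s\n' "$stat" | awk '/^cache /{print $2}')

echo "raw usage (reportedly what Scope shows):   $usage"
echo "usage - cache (what docker stats prints):  $((usage - cache))"
```

With these sample numbers the raw figure is ~4 GiB while the cache-adjusted figure is ~2.25 GiB, which is the same shape of discrepancy described in this issue.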

To check, you can look at the contents of /proc/<pid>/status on the host and see which numbers match what appears on the screen.
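For example, something like the following pulls the two candidate figures out of the status file (shown here against the shell's own `/proc/self/status` for illustration; substitute the Elasticsearch JVM's PID on the host):

```shell
# VmSize is the virtual size (VSZ), VmRSS the resident set size (RSS);
# VmRSS is the number Scope reports for a process.
grep -E '^(VmSize|VmRSS):' /proc/self/status
```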

Broadly, memory consumption is a complicated subject and when you reduce it to a single figure you will always lose something. But I agree a GUI tool like Scope should at least look consistent.

If this is the problem then this is a duplicate of #1133.

aleks-mariusz commented 5 years ago

That issue definitely looks related, but whether they are duplicates of each other is hard to say: in that issue the process is reported as larger than the container, whereas in my situation the container is larger than the process.

So I took a look at /proc/<pid>/status and indeed see over 4 GB being reported as used. Here's the corresponding ps output:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
1000      4205  3.5 15.1 3870748 2456872 ?     Ssl  Oct12 2125:19 /usr/lib/jvm/default-jvm/jre/bin/java ...  org.elasticsearch.bootstrap.Elasticsearch
1000      5401  3.7 15.0 3856684 2446464 ?     Ssl  Nov02 1086:17 /usr/lib/jvm/default-jvm/jre/bin/java ... org.elasticsearch.bootstrap.Elasticsearch

Here we see that VSZ is ~1.5× RSS (and yet Scope reports the process at the RSS size). The VSZ column corresponds nicely to the VmSize line from /proc/<pid>/status, and the RSS column corresponds nicely to the VmRSS line from /proc/<pid>/status.
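As a sanity check on the units: the ps RSS column is in KiB, and converting the value from the output above shows it lines up with the ~2.3 GiB that docker stats reports, not with Scope's ~4 GiB container figure:

```shell
# RSS for PID 4205 from the ps output above, in KiB.
rss_kib=2456872
awk -v k="$rss_kib" 'BEGIN { printf "RSS = %.2f GiB\n", k / 1024 / 1024 }'
# RSS = 2.34 GiB
```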

However, the container information (which I infer you're saying comes from docker stats output) shows:

# docker stats --no-stream $(docker ps | awk '{if(NR>1) print $NF}')|grep es-data
k8s_es-data_es-data-1_monitoring_820e2a87-ded1-11e8-a4bb-5254004c9f89_0                                          12.59%              2.326 GiB / 15.51 GiB   15.00%              0 B / 0 B           10.2 MB / 107 GB    49
k8s_es-data_es-data-2_monitoring_663cb8a1-ce30-11e8-a4bb-5254004c9f89_0                                          0.24%               2.337 GiB / 15.51 GiB   15.06%              0 B / 0 B           120 MB / 220 GB     52

Hence I am still unclear why the container size according to Scope is on the order of 4 GB (each, for the two containers in my sample above), whereas docker stats says they're only ~2.3 GB each.

Is Scope obtaining its figures from somewhere other than the docker stats output?