docker / cli

The Docker CLI
Apache License 2.0

docker stats CPU above 100% #2134

Open cdalexndr opened 4 years ago

cdalexndr commented 4 years ago

Description: docker stats CPU shows values above 100%.

Steps to reproduce the issue:

  1. Run docker stats while a container is using a lot of CPU.

Describe the results you received: CPU column shows values above 100% (110%, 250%...)

Describe the results you expected: CPU column values should be normalized to 100%. Conceptually, header CPU % means max 100%.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client: Docker Engine - Community
 Version:           19.03.2
 API version:       1.40
 Go version:        go1.12.8
 Git commit:        6a30dfc
 Built:             Thu Aug 29 05:26:49 2019
 OS/Arch:           windows/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       6a30dfc
  Built:            Thu Aug 29 05:32:21 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Output of docker info:

Client:
 Debug Mode: false

Server:
 Containers: 11
  Running: 11
  Paused: 0
  Stopped: 0
 Images: 251
 Server Version: 19.03.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.9.184-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.837GiB
 Name: docker-desktop
 ID: PL7Z:37ZA:FGN5:EBPE:KYFT:HSFI:YXJP:2MOK:ESB3:MML3:6G22:7ZIK
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 143
  Goroutines: 152
  System Time: 2019-10-11T20:46:04.5961211Z
  EventsListeners: 4
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Additional environment details (AWS, VirtualBox, physical, etc.): Using 4 core CPU.

AkihiroSuda commented 4 years ago

This is expected behavior on a multi-core host.

ibrahimawadhamid commented 4 years ago

I'm having a similar issue with geoserver 2.14.3, but I can't figure it out.

AkihiroSuda commented 4 years ago

This is not an issue. If you have N CPU cores, the CPU usage can be up to N * 100%.

cdalexndr commented 4 years ago

I don't think there is a system monitor tool (linux or windows) that shows CPU usage above 100%. Conceptually, 100 percent means maximum value.

ibrahimawadhamid commented 4 years ago

The main issue is that geoserver is consuming all of these resources when nothing is being processed at all, and nobody is even accessing it through the web. Shouldn't geoserver in its idle state just be consuming memory and not a lot of CPU?

AkihiroSuda commented 4 years ago

> I don't think there is a system monitor tool (linux or windows) that shows CPU usage above 100%.

systemd-cgtop

> Shouldn't geoserver in its idle state just be consuming memory and not a lot of CPU?

That seems to be an issue with geoserver.

raags commented 4 years ago

CPU % is calculated using deltas of total_usage, as per https://github.com/docker/cli/blob/6c12a82f330675d4e2cfff4f8b89a353bcb1fecd/cli/command/container/stats_helpers.go#L180

Here the ratio is multiplied by the number of CPUs. However, cpuDelta already includes usage across all CPUs (see below). Adding all values in percpu_usage gives total_usage (which is used to derive cpuDelta).

        "cpu_stats": {
            "cpu_usage": {
                "percpu_usage": [
                    826860687,
                    830807540,
                    823365887,
                    844077056
                ],
                "total_usage": 3325111170,
                "usage_in_kernelmode": 1620000000,
                "usage_in_usermode": 1600000000
            },
            "online_cpus": 4,
            "system_cpu_usage": 35595977360000000,
            "throttling_data": {
                "periods": 0,
                "throttled_periods": 0,
                "throttled_time": 0
            }
        },

So what is the reason to multiply by the number of CPUs? It seems that this is not required, because total_usage already accounts for them.
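To make the question concrete, here is a minimal Python sketch of the calculation as it is described in this thread (the real code is Go, at the link above). The function name, the `per_host` flag, and the "previous" sample values are assumptions made up for illustration; only the current sample comes from the JSON above. One assumption worth stating loudly: system_cpu_usage is treated here as host CPU time summed over all cores, which would mean the plain ratio is already a fraction of the whole host and the multiplication is what produces the N * 100% scale.

```python
# Current sample: values copied from the cpu_stats JSON in this thread.
current = {
    "cpu_usage": {
        "percpu_usage": [826860687, 830807540, 823365887, 844077056],
        "total_usage": 3325111170,
    },
    "online_cpus": 4,
    "system_cpu_usage": 35595977360000000,
}

# Hypothetical previous sample, invented for this example: the container
# used 2 s of CPU time while the 4-core host accumulated 4 s in total.
prev_total = current["cpu_usage"]["total_usage"] - 2_000_000_000
prev_system = current["system_cpu_usage"] - 4_000_000_000


def cpu_percent(prev_total, prev_system, cpu_stats, per_host=False):
    """Return CPU % from two readings.

    per_host=False multiplies by the core count (up to N * 100%),
    per_host=True leaves the ratio as a share of the whole host (max 100%).
    """
    cpu_delta = cpu_stats["cpu_usage"]["total_usage"] - prev_total
    system_delta = cpu_stats["system_cpu_usage"] - prev_system
    online = cpu_stats.get("online_cpus") or len(
        cpu_stats["cpu_usage"].get("percpu_usage") or []
    )
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    ratio = cpu_delta / system_delta
    return ratio * 100.0 if per_host else ratio * online * 100.0


# The per-core counters do add up to total_usage, as noted above.
assert sum(current["cpu_usage"]["percpu_usage"]) == current["cpu_usage"]["total_usage"]

print(cpu_percent(prev_total, prev_system, current))                 # 200.0
print(cpu_percent(prev_total, prev_system, current, per_host=True))  # 50.0
```

Under these assumptions the same readings give 200% on the per-core scale and 50% on the whole-host scale, so the two numbers differ only by the factor online_cpus.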

ibrahimawadhamid commented 4 years ago

I found and resolved the issue. I had my health check in my docker-compose.yml configured to try every 2 seconds. I changed it to 3 minutes and now everything is fine.

raags commented 4 years ago

Ok, but the logic above doesn't seem right. @AkihiroSuda could you please review? I couldn't find anything in git blame.

Just adding that this may not be related to the original issue - I can open a new issue if required.

Hronom commented 4 years ago

This looks strange to me, as I had to google why my containers go above 100% and then do the math with the number of cores.

It might be better to show two values: overall CPU usage with a maximum of 100%, and a separate per-core load.

frankandrobot commented 2 years ago

We're trying to figure out if our container is using CPUs in a healthy way. Can someone clarify how we can do this on a multicore machine? For instance, if I'm understanding correctly, if you have 4 cores and 100% CPU usage, then that's either the 4 cores running at 25% each OR 1 core running at 100%? The former seems "healthy" while the latter is potentially a problem (at least in our use case).

thaJeztah commented 2 years ago

@frankandrobot you can connect to the API endpoint to get the raw information: https://docs.docker.com/engine/api/v1.41/#operation/ContainerStats
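As a rough sketch of how the raw data could answer the "4 cores at 25% vs. 1 core at 100%" question, the Python below splits usage per core. It assumes a stats object that carries both cpu_stats and precpu_stats with a percpu_usage array like the sample earlier in this thread (that array may be absent on some setups), and it approximates elapsed time per core as the system_cpu_usage delta divided by online_cpus. The function name and both sample objects are hypothetical.

```python
def per_core_percent(stats):
    """Return a list with the container's usage of each core, in percent."""
    cur, prev = stats["cpu_stats"], stats["precpu_stats"]
    cur_cores = cur["cpu_usage"]["percpu_usage"]
    prev_cores = prev["cpu_usage"]["percpu_usage"]
    online = cur.get("online_cpus") or len(cur_cores)
    system_delta = cur["system_cpu_usage"] - prev["system_cpu_usage"]
    if system_delta <= 0 or online == 0:
        return []
    # Approximate nanoseconds elapsed on each core between the two readings.
    per_core_window = system_delta / online
    return [(c - p) / per_core_window * 100.0 for c, p in zip(cur_cores, prev_cores)]


def make_stats(core_deltas_ns):
    """Build a made-up stats object covering a 1 s window on a 4-core host."""
    zeros = [0] * len(core_deltas_ns)
    return {
        "precpu_stats": {
            "cpu_usage": {"percpu_usage": zeros},
            "system_cpu_usage": 0,
        },
        "cpu_stats": {
            "cpu_usage": {"percpu_usage": list(core_deltas_ns)},
            "online_cpus": 4,
            "system_cpu_usage": 4_000_000_000,
        },
    }


spread = make_stats([250_000_000] * 4)      # load spread over all cores
pinned = make_stats([1_000_000_000, 0, 0, 0])  # one core saturated

print(per_core_percent(spread))  # [25.0, 25.0, 25.0, 25.0]
print(per_core_percent(pinned))  # [100.0, 0.0, 0.0, 0.0]
```

Both made-up samples would add up to 100% in the CPU column, but the per-core breakdown distinguishes the evenly spread case from the single saturated core.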

chandrashekhar07 commented 3 months ago

As I understand it, docker stats gives the CPU percentage relative to the allocated CPU resources (i.e. cpu-shares). But cpu-shares is itself a relative value used for scheduling CPU time between different containers. Therefore, we can't directly get an absolute measure of CPU utilization (i.e., how much of the host's CPU capacity a container is using) from docker stats. Instead, it shows us how much of the container's allocated CPU resources are being used.