deis / builder

Git server and application builder for Deis Workflow
https://deis.com
MIT License

High CPU usage on GKE cluster #280

Closed kmala closed 8 years ago

kmala commented 8 years ago

The builder uses high CPU on GKE clusters. To reproduce: deploy the workflow-dev or workflow-beta1 chart and run docker stats on the builder container:

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O
567dcf2a8bbe        96.17%              40.31 MB / 3.892 GB   1.04%               0 B / 0 B           991.2 kB / 3.883 MB

Running top on the node gives:

top - 18:13:25 up 23:35,  1 user,  load average: 1.10, 1.31, 1.24
Tasks: 106 total,   1 running, 105 sleeping,   0 stopped,   0 zombie
%Cpu(s): 99.0 us,  1.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   3801024 total,  3286940 used,   514084 free,   383544 buffers
KiB Swap:        0 total,        0 used,        0 free,  1993720 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
17835 root      20   0 26220  22m    4 S  96.7  0.6   1251:42 boot
 3310 root      20   0  232m  76m  30m S   0.7  2.1  16:02.86 kubelet
 4241 root      20   0  739m 164m 8136 S   0.7  4.4  11:48.61 google-fluentd
 3872 root      20   0 1348m  48m  18m S   0.3  1.3  14:47.96 docker
17912 root      20   0 65384  49m  13m S   0.3  1.3   0:58.64 registry

MaxenceAdnot commented 8 years ago

I saw exactly the same behaviour yesterday, but scaling the replica count to 0 and then back to 1 on the deis-builder RC fixed it.
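
The scale-down-and-up workaround can be sketched roughly as below. This is only an illustration, not a procedure confirmed in this thread: it assumes `kubectl` is on the PATH, that the replication controller is named `deis-builder`, and that it lives in the `deis` namespace. The helper names `scale_command` and `bounce_builder` are hypothetical.

```python
import subprocess

# Assumptions (not stated in this thread): the RC name and its namespace.
RC_NAME = "deis-builder"
NAMESPACE = "deis"


def scale_command(replicas, rc=RC_NAME, namespace=NAMESPACE):
    """Build the kubectl argument list that scales the RC to `replicas`."""
    return [
        "kubectl", "scale", "rc", rc,
        "--replicas={}".format(replicas),
        "--namespace={}".format(namespace),
    ]


def bounce_builder(run=subprocess.check_call):
    """Scale the builder RC down to 0, then back up to 1."""
    for replicas in (0, 1):
        run(scale_command(replicas))
```

Calling `bounce_builder()` from a machine with cluster access would issue the two `kubectl scale` commands in order. The `run` parameter exists only so the commands can be inspected without touching a cluster.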

Before scaling the RC as described above, I checked the running processes with "ps faux" and noticed what looked like a defunct child process of boot consuming a lot of CPU. Are you seeing the same?

EDIT: Here are the defunct child processes I saw. My mistake, though: defunct processes cannot consume CPU by nature. I think they are probably left over from a failed push.

root     12342 92.9  0.2  23052 18956 ?        Ssl  Mar30 1921:16  \_ boot server
root     25779  0.0  0.0      0     0 ?        Z    Mar30   0:00      \_ [pre-receive] <defunct>
root     26513  0.0  0.0      0     0 ?        Z    Mar30   0:00      \_ [pre-receive] <defunct>
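
To make the point in the EDIT concrete, here is a minimal sketch that reads `ps aux`-style lines, lists the defunct (state `Z`) processes, and picks out the process with the highest CPU. The function names `parse_ps_line` and `summarize` are hypothetical, and the column layout is assumed to match the output pasted above.

```python
def parse_ps_line(line):
    """Split one `ps aux`-style line into the fields used here."""
    # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    fields = line.split(None, 10)
    return {
        "pid": int(fields[1]),
        "cpu": float(fields[2]),
        "stat": fields[7],
        "command": fields[10].strip(),
    }


def summarize(ps_lines):
    """Return (zombies, busiest) for the given ps output lines."""
    procs = []
    for line in ps_lines:
        fields = line.split()
        # Skip blank lines and the header row (its PID column is not numeric).
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        procs.append(parse_ps_line(line))
    zombies = [p for p in procs if p["stat"].startswith("Z")]
    busiest = max(procs, key=lambda p: p["cpu"])
    return zombies, busiest


sample = r"""
root     12342 92.9  0.2  23052 18956 ?        Ssl  Mar30 1921:16  \_ boot server
root     25779  0.0  0.0      0     0 ?        Z    Mar30   0:00      \_ [pre-receive] <defunct>
root     26513  0.0  0.0      0     0 ?        Z    Mar30   0:00      \_ [pre-receive] <defunct>
""".splitlines()

zombies, busiest = summarize(sample)
print([z["pid"] for z in zombies])    # PIDs of the defunct children
print(busiest["pid"], busiest["cpu"])  # the process actually using CPU
```

On the sample above, the two `pre-receive` entries are reported as zombies with 0.0% CPU, while the `boot server` parent is the one using the CPU.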

gregzuro commented 8 years ago

With a new install of beta3, I'm seeing 100% CPU on deis-builder. It starts spontaneously and continues indefinitely.

kmala commented 8 years ago

Can you try beta4? The issue exists in beta3 and was completely fixed in beta4.

gregzuro commented 8 years ago

will do. thanks.