pulumi / pulumi-kubernetes-operator

A Kubernetes Operator that automates the deployment of Pulumi Stacks
Apache License 2.0

Set some default resource requests on the workspace pod #698

Closed blampe closed 2 weeks ago

blampe commented 2 weeks ago

The manager already has limits set -- it currently has Guaranteed QoS.

Related to #694 and probably a pre-req -- set a small resource request to give the workspace pod Burstable QoS.
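
For reference, here is a rough sketch of how Kubernetes assigns a QoS class from container requests and limits, which is why a small request alone is enough to move a pod out of BestEffort. The function name `qos_class` and the dict layout are made up for illustration, and quantities are compared as plain strings (so `"1"` and `"1000m"` would not be treated as equal), which the real implementation does not do.

```python
def qos_class(containers):
    """Return the Kubernetes QoS class for a list of container resource specs.

    Each container is a dict such as
    {"requests": {"cpu": "100m"}, "limits": {"memory": "1Gi"}}.
    Simplification: quantities are compared as strings, not parsed.
    """
    resources = ("cpu", "memory")
    any_set = False
    guaranteed = True
    for container in containers:
        requests = dict(container.get("requests") or {})
        limits = container.get("limits") or {}
        for name in resources:
            # A missing request defaults to the limit when a limit is set.
            if name not in requests and name in limits:
                requests[name] = limits[name]
            if name in requests or name in limits:
                any_set = True
            # Guaranteed needs a limit for each resource, equal to its request.
            if name not in limits or requests.get(name) != limits[name]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Hypothetical values: a small request with no limit gives Burstable.
workspace = [{"requests": {"cpu": "100m", "memory": "64Mi"}}]
print(qos_class(workspace))
```

The request values above are placeholders, not a proposal for the actual defaults.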

Additional considerations:

cleverguy25 commented 2 weeks ago

Added to epic https://github.com/pulumi/pulumi-kubernetes-operator/issues/586

EronWright commented 2 weeks ago

The baseline stats for random-yaml with a 1-minute resync interval: Image

EronWright commented 2 weeks ago

Zombie processes do seem to accumulate in the workspace pod, given a per-minute resync:

pulumi@random-yaml-workspace-0:/$ ps auxwww 
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pulumi       1  0.0  0.3 1248856 14268 ?       Ssl  16:07   0:01 /share/agent serve --workspace /share/workspace --skip-install
pulumi      46  0.0  0.0      0     0 ?        Z    16:07   0:00 [pulumi-language] <defunct>
pulumi      75  0.0  0.0      0     0 ?        Z    16:07   0:00 [pulumi-language] <defunct>
pulumi     236  0.0  0.0      0     0 ?        Z    16:07   0:00 [pulumi-language] <defunct>
pulumi     256  0.0  0.0      0     0 ?        Z    16:07   0:00 [pulumi-resource] <defunct>
pulumi     271  0.0  0.0      0     0 ?        Z    16:07   0:00 [pulumi-resource] <defunct>
pulumi     400  0.0  0.0      0     0 ?        Z    16:08   0:00 [pulumi-language] <defunct>
pulumi     415  0.0  0.0      0     0 ?        Z    16:08   0:00 [pulumi-resource] <defunct>
pulumi     431  0.0  0.0      0     0 ?        Z    16:08   0:00 [pulumi-resource] <defunct>
pulumi     563  0.0  0.0      0     0 ?        Z    16:09   0:00 [pulumi-language] <defunct>
pulumi     579  0.0  0.0      0     0 ?        Z    16:09   0:00 [pulumi-resource] <defunct>
pulumi     594  0.0  0.0      0     0 ?        Z    16:09   0:00 [pulumi-resource] <defunct>
pulumi     724  0.0  0.0      0     0 ?        Z    16:10   0:00 [pulumi-language] <defunct>
pulumi     739  0.0  0.0      0     0 ?        Z    16:10   0:00 [pulumi-resource] <defunct>
pulumi     753  0.0  0.0      0     0 ?        Z    16:10   0:00 [pulumi-resource] <defunct>
pulumi     886  0.0  0.0      0     0 ?        Z    16:11   0:00 [pulumi-language] <defunct>
pulumi     901  0.0  0.0      0     0 ?        Z    16:11   0:00 [pulumi-resource] <defunct>
pulumi     917  0.0  0.0      0     0 ?        Z    16:11   0:00 [pulumi-resource] <defunct>
pulumi    1044  0.0  0.0      0     0 ?        Z    16:12   0:00 [pulumi-language] <defunct>
pulumi    1059  0.0  0.0      0     0 ?        Z    16:12   0:00 [pulumi-resource] <defunct>
pulumi    1075  0.0  0.0      0     0 ?        Z    16:12   0:00 [pulumi-resource] <defunct>
pulumi    1205  0.0  0.0      0     0 ?        Z    16:13   0:00 [pulumi-language] <defunct>
pulumi    1220  0.0  0.0      0     0 ?        Z    16:13   0:00 [pulumi-resource] <defunct>
pulumi    1236  0.0  0.0      0     0 ?        Z    16:13   0:00 [pulumi-resource] <defunct>
pulumi    1368  0.0  0.0      0     0 ?        Z    16:14   0:00 [pulumi-language] <defunct>
pulumi    1383  0.0  0.0      0     0 ?        Z    16:14   0:00 [pulumi-resource] <defunct>
...
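
To track the accumulation over time, a small script along these lines could tally the defunct entries per command from output like the above. This is only a sketch: `count_zombies` is a made-up helper, and it assumes the 11-column `ps aux` layout shown here, with STAT in the eighth column and no spaces inside the START field.

```python
from collections import Counter


def count_zombies(ps_output: str) -> Counter:
    """Count zombie (defunct) processes per command name in `ps aux` output."""
    counts = Counter()
    for line in ps_output.strip().splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header, shell prompt, and "..." lines
        stat, command = fields[7], fields[10]
        if stat.startswith("Z"):
            # Zombie entries look like "[pulumi-language] <defunct>".
            counts[command.split()[0].strip("[]")] += 1
    return counts


sample = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pulumi       1  0.0  0.3 1248856 14268 ?       Ssl  16:07   0:01 /share/agent serve --workspace /share/workspace --skip-install
pulumi      46  0.0  0.0      0     0 ?        Z    16:07   0:00 [pulumi-language] <defunct>
pulumi     256  0.0  0.0      0     0 ?        Z    16:07   0:00 [pulumi-resource] <defunct>
pulumi     400  0.0  0.0      0     0 ?        Z    16:08   0:00 [pulumi-language] <defunct>
"""
for name, n in sorted(count_zombies(sample).items()):
    print(name, n)
```

Running it once per resync interval and comparing the totals would show whether the count keeps growing.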

justinvp commented 2 weeks ago

Likely related to https://github.com/pulumi/pulumi/issues/17361

EronWright commented 2 weeks ago

These measurements were made after the "zombie" process issue was fixed.

After another hour of periodic execution: Image

And another: Image

EronWright commented 2 weeks ago

A case of failed updates causing a lot more interactions with the workspace: Image

EronWright commented 2 weeks ago

With all fixes: Image