nokia / CPU-Pooler

A Device Plugin for Kubernetes, which exposes the CPU cores as consumable Devices to the Kubernetes scheduler.
BSD 3-Clause "New" or "Revised" License

Adding auto-provisioned CFS quotas to all non-default containers #55

Closed Levovar closed 3 years ago

Levovar commented 3 years ago

This commit solves Issue #25. When a container is using shared pool resources, the CFS quota is set to its limit value (current behaviour).

With exclusive users it is set to the total amount of all exclusive cores * 1000.

When both are requested, the overall quota is set to exclusive * 1000 + 1.2 * shared. In this hybrid scenario we leave a 20% safety margin on top of the originally requested shared resources, to avoid accidentally throttling the higher-prio exclusive threads when the lower-prio shared threads are overloaded.
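
A rough sketch of the calculation described above (illustrative only, the function name and the millicore-based units are my shorthand here, not necessarily the exact code in the commit):

func calculateCPULimit(exclusiveCores, sharedMillis int64) int64 {
    switch {
    case exclusiveCores > 0 && sharedMillis > 0:
        // hybrid: full quota for the exclusive cores, plus a 20% safety
        // margin on top of the requested shared millicores
        return exclusiveCores*1000 + sharedMillis*12/10
    case exclusiveCores > 0:
        // exclusive only: total amount of exclusive cores * 1000
        return exclusiveCores * 1000
    default:
        // shared only: current behaviour, the quota equals the shared limit
        return sharedMillis
    }
}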

TimoLindqvist commented 3 years ago

It looks like there were other indentation issues too, so gofmt could be used to get consistent formatting.

Did you have a chance to test the behaviour with exclusive cores after the CFS quota is enabled? I'm still wondering if it could have any side effects on latency or jitter. Should we have a configuration option to disable this?

Levovar commented 3 years ago

Yeah, I just made the code for now, not gonna merge before some testing. Also haven't run the code formatter yet, will do. If the addition of the limit does not seem to affect performance in the slightest, I would make it the default. If it turns out it does, I will definitely hide it behind a config flag.

Levovar commented 3 years ago

In lieu of a real DPDK workload I ran some tests with a CPU benchmarking tool called sysbench. In both cases I executed the same CentOS 7 base container on the same node with the same Pod spec, asking for 1 exclusive core. The first test was done with the mainline webhook, the second with the one from the PR.

performance without limits:

[root@pooler-benchmark /]# sysbench cpu  run
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:   951.44

General statistics:
    total time:                          10.0010s
    total number of events:              9517

Latency (ms):
         min:                                    1.05
         avg:                                    1.05
         max:                                    1.41
         95th percentile:                        1.06
         sum:                                 9998.25

performance with limits:

[root@pooler-benchmark /]# cat /sys/fs/cgroup/cpu,cpuacct/cpu.cfs_quota_us
100000

[root@pooler-benchmark /]# sysbench cpu run
sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:   951.70

General statistics:
    total time:                          10.0003s
    total number of events:              9519

Latency (ms):
         min:                                    1.05
         avg:                                    1.05
         max:                                    1.13
         95th percentile:                        1.06
         sum:                                 9997.73

Threads fairness:
    events (avg/stddev):           9519.0000/0.00
    execution time (avg/stddev):   9.9977/0.00

In both cases core utilization was 99.6-99.7%. I executed the tests on both environments multiple times; the results are always identical.

IMO there is no anecdotal or physical evidence as of now which would suggest CFS quotas can negatively affect the performance of a DPDK-type workload on newer kernels.
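
For anyone who wants to double-check that the quota never actually kicks in during such a run, the cgroup throttling counters can be read from inside the container; a small illustrative snippet (assuming the same cgroup v1 path as in the test above):

package main

import (
    "fmt"
    "os"
)

// Prints the cgroup v1 CPU throttling statistics of the container it runs in.
// A non-zero nr_throttled would mean the CFS quota actually limited the workload.
func main() {
    data, err := os.ReadFile("/sys/fs/cgroup/cpu,cpuacct/cpu.stat")
    if err != nil {
        fmt.Fprintln(os.Stderr, "could not read cpu.stat:", err)
        os.Exit(1)
    }
    // cpu.stat contains lines such as:
    //   nr_periods 100
    //   nr_throttled 0
    //   throttled_time 0
    fmt.Print(string(data))
}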

Levovar commented 3 years ago

@TimoLindqvist I also added an extra 100ms to the limits of purely exclusive containers, to avoid accidentally throttling them at peak / 100% utilization. So even if there were some artificial performance hits introduced by CFS quotas in edge cases, I think this would totally eliminate those as well.

From my perspective this should address the issue without any hiccups, WDYT?

TimoLindqvist commented 3 years ago

Based on those results there seem to be no issues. But I wasn't expecting very big effects. The max latency, which seems to be even better with the CFS quota, is quite big compared to the average latency. So I'm wondering if there is some noise in the test setup which hides the CFS quota effect (which I think would be very small if there is any). The variation in the test is small (~0.35 ms), but if the system is supposed to handle 10 Mpps or more, that kind of latency variation is quite big.

But I support adding this feature, as I really believe there is no effect. Since there is a small uncertainty in this issue, it would be nice to have an option to disable it (an argument to the webhook?). I tried to think of a workaround in case problems occur, but I couldn't find a way to disable this. Manually it should work, but that's not a very robust solution.

Levovar commented 3 years ago

Added a new config flag to control CFS quota provisioning

The flag is generic and could be expanded with more options in the future, but for now the option to restrict quota provisioning to shared containers, or to allow it for exclusives as well, serves the purpose.
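
Roughly along these lines (the flag name and the accepted values below are illustrative, not necessarily the exact ones in the commit):

package main

import (
    "flag"
    "log"
)

// The flag name and its accepted values are hypothetical, shown only to
// illustrate how the webhook behaviour could be toggled.
var cfsQuotaMode = flag.String("cfs-quota-mode", "all",
    "'shared' provisions CFS quotas only for shared-pool containers, "+
        "'all' provisions them for exclusive users as well")

// provisionQuotaForExclusive reports whether exclusive containers should also
// get an auto-provisioned CFS quota.
func provisionQuotaForExclusive() bool {
    return *cfsQuotaMode == "all"
}

func main() {
    flag.Parse()
    log.Printf("CFS quota for exclusive containers: %v", provisionQuotaForExclusive())
}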

TimoLindqvist commented 3 years ago

This looks good! IMO this can be merged.

Levovar commented 3 years ago

Tested the flagged version on a real environment, looks to be OK, so here we go.