nokia / CPU-Pooler

A Device Plugin for Kubernetes, which exposes the CPU cores as consumable Devices to the Kubernetes scheduler.
BSD 3-Clause "New" or "Revised" License

Support for cgroupv2 #77

Open · gmarkey opened this issue 1 year ago

gmarkey commented 1 year ago

Hoping this project isn't abandoned, so here goes :)

**Is your feature request related to a problem? Please describe.** Currently, only cgroup v1 appears to be supported by CPU-Pooler. Most operating systems are moving to cgroup v2, which appears to break CPU-Pooler functionality. Additionally, after the deprecation of the Intel CPU Manager for Kubernetes, CPU-Pooler now appears to be the canonical way of achieving effective core isolation for low-latency applications.
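For reference, detecting which cgroup version a node runs can be done by checking the filesystem magic of the unified mount; a minimal sketch in Go (my own illustration, not CPU-Pooler code; `/sys/fs/cgroup` is the conventional mount point):

```go
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// cgroupVersion reports whether /sys/fs/cgroup is a cgroup v2 (unified)
// mount by checking the filesystem magic number.
func cgroupVersion() (int, error) {
	var st unix.Statfs_t
	if err := unix.Statfs("/sys/fs/cgroup", &st); err != nil {
		return 0, err
	}
	if st.Type == unix.CGROUP2_SUPER_MAGIC {
		return 2, nil
	}
	return 1, nil
}

func main() {
	v, err := cgroupVersion()
	if err != nil {
		panic(err)
	}
	fmt.Printf("detected cgroup v%d\n", v)
}
```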

**Describe the solution you'd like** CPU-Pooler should be able to:

**Describe alternatives you've considered** The Intel CPU Manager for Kubernetes, which is now deprecated. The CPU scheduling built into Kubernetes is insufficient for certain workloads.

**Additional context** The `tasks` path in cgroup v2 has changed; for example:
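A sketch of the difference in Go, with hypothetical kubepods directory names (the file names themselves are the kernel's documented interfaces):

```go
package cgroup

import "path/filepath"

// tasksFile returns the file listing a cgroup's threads. The directory
// names in the comments are hypothetical kubepods examples.
func tasksFile(cgroupDir string, v2 bool) string {
	if v2 {
		// v2 unified hierarchy, e.g.
		// /sys/fs/cgroup/kubepods.slice/.../cgroup.threads
		return filepath.Join(cgroupDir, "cgroup.threads")
	}
	// v1 per-controller hierarchy, e.g.
	// /sys/fs/cgroup/cpuset/kubepods/.../tasks
	return filepath.Join(cgroupDir, "tasks")
}
```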

The exclusive cpuset path in cgroup v2 has also changed; for example:
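Cgroup v1 exposes a per-cgroup `cpuset.cpu_exclusive` flag, while v2 models exclusivity through `cpuset.cpus.partition`. A hedged sketch (again my own illustration; whether these map cleanly onto CPU-Pooler's exclusive pools is exactly what this issue is about):

```go
package cgroup

import (
	"os"
	"path/filepath"
)

// markExclusive requests CPU exclusivity for a cgroup. The directory
// layouts in the comments are hypothetical; the file names and values
// are the kernel's documented cpuset interfaces.
func markExclusive(cgroupDir string, v2 bool) error {
	if v2 {
		// v2: make the cgroup a partition root, e.g. under
		// /sys/fs/cgroup/kubepods.slice/...
		return os.WriteFile(filepath.Join(cgroupDir, "cpuset.cpus.partition"), []byte("root"), 0o644)
	}
	// v1: legacy flag file under /sys/fs/cgroup/cpuset/...
	return os.WriteFile(filepath.Join(cgroupDir, "cpuset.cpu_exclusive"), []byte("1"), 0o644)
}
```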

Levovar commented 1 year ago

It isn't necessarily abandoned, but it also isn't being actively and consistently developed :) I'm happy to comment on, review, and merge PRs, but I don't necessarily find the time to implement improvements myself.

Cgroup v2 is definitely an issue, yes. But have you checked out the upstream CPU Manager recently? There are a couple of features Red Hat brought in over the last few releases (exclusivity options, system-reserved cpusets, etc.) which might make it a viable alternative for your use case; see the sketch below.
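Those features are surfaced through the kubelet configuration; a hypothetical fragment for illustration (values are examples, not recommendations):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
cpuManagerPolicyOptions:
  full-pcpus-only: "true"   # exclusivity option: no sharing of SMT siblings
reservedSystemCPUs: "0,1"   # system-reserved cpuset kept away from pods
```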

gmarkey commented 1 year ago

Thanks @Levovar; I'll take a look at the recent improvements (the CPU Manager and upstream Topology Manager were definitely insufficient for my needs previously).

I think the "killer" feature of CPU-Pooler was passing the exclusive/shared pools as environment variables for consumption by applications. In fact, I wrote tooling to mimic this functionality for our applications outside of Kubernetes, as it is so helpful for hybrid workloads. Maybe this is something I can emulate in the container init by inspecting the cgroup membership.
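Something along these lines, assuming a pure cgroup v2 host; the env-var name is made up for illustration:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// allowedCPUs derives the container's CPU list from its cgroup (v2)
// membership. Assumes a pure cgroup v2 host, where /proc/self/cgroup
// contains a single "0::<path>" line.
func allowedCPUs() (string, error) {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		return "", err
	}
	path := strings.TrimPrefix(strings.TrimSpace(string(data)), "0::")
	cpus, err := os.ReadFile(filepath.Join("/sys/fs/cgroup", path, "cpuset.cpus.effective"))
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(cpus)), nil
}

func main() {
	cpus, err := allowedCPUs()
	if err != nil {
		panic(err)
	}
	// In a real container init this would be exported before exec'ing
	// the application; the variable name here is hypothetical.
	os.Setenv("EXCLUSIVE_CPUS", cpus)
	fmt.Println("CPU list from cgroup membership:", cpus)
}
```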