intel / kubernetes-power-manager


Cores are not removed from PowerWorkload after deleting Pod that was using them. #19

Closed: kamillipka closed this issue 1 year ago

kamillipka commented 2 years ago

Hi, we are currently running tests to check whether the power-manager is compatible with Red Hat OpenShift. Our testing method is to run three stress-ng pods in parallel, each using a different PowerProfile (performance, balance-performance and shared); a sketch of one of these pods is included below for reference.

Testing environment: Red Hat OpenShift 4.11.0-rc.1 with cgroups v2.
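
For reference, a rough sketch of one of the test pods (pod name, image and core counts are illustrative, not our exact manifest). As I understand the operator's usage pattern, the exclusive cores are requested through the power.intel.com/<profile> extended resource together with Guaranteed QoS (integer cpu requests equal to limits):

apiVersion: v1
kind: Pod
metadata:
  name: stress-ng-performance              # illustrative name
  namespace: intel-power
spec:
  containers:
  - name: stress-ng
    image: quay.io/example/stress-ng:latest   # illustrative image
    command: ["stress-ng", "--cpu", "2", "--timeout", "600s"]
    resources:
      requests:
        cpu: "2"
        memory: 200Mi
        power.intel.com/performance: "2"    # extended resource for the performance profile
      limits:
        cpu: "2"
        memory: 200Mi
        power.intel.com/performance: "2"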

During testing I found that cpuIds in the PowerWorkload are not removed after deleting the Pod that was using them. List of all our PowerWorkloads:

apiVersion: v1
items:
- apiVersion: power.intel.com/v1
  kind: PowerWorkload
  metadata:
    creationTimestamp: "2022-09-09T10:52:48Z"
    generation: 72
    name: balance-performance-worker2
    namespace: intel-power
    resourceVersion: "175349940"
    uid: 9df2d926-62ee-4005-876e-6e84a7dc56c2
  spec:
    name: balance-performance-worker2
    powerProfile: balance-performance
    workloadNodes:
      name: worker2
- apiVersion: power.intel.com/v1
  kind: PowerWorkload
  metadata:
    creationTimestamp: "2022-09-05T13:08:40Z"
    generation: 1
    name: balance-power-worker2
    namespace: intel-power
    resourceVersion: "146184712"
    uid: 5ac1ecf4-4b05-425f-9d00-c6112f8f249f
  spec:
    name: balance-power-worker2
    powerProfile: balance-power
    workloadNodes:
      name: worker2
- apiVersion: power.intel.com/v1
  kind: PowerWorkload
  metadata:
    creationTimestamp: "2022-09-09T12:54:05Z"
    generation: 41
    name: performance-worker2
    namespace: intel-power
    resourceVersion: "175349944"
    uid: 2d89c590-3a1b-4561-b2f4-aaa6db4769e6
  spec:
    name: performance-worker2
    powerProfile: performance
    workloadNodes:
      cpuIds:
      - 1
      - 2
      - 3
      - 4
      - 5
      - 6
      - 7
      - 8
      - 9
      - 10
      - 11
      - 12
      name: worker2
- apiVersion: power.intel.com/v1
  kind: PowerWorkload
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"power.intel.com/v1","kind":"PowerWorkload","metadata":{"annotations":{},"creationTimestamp":"2022-08-16T14:53:53Z","generation":1,"name":"shared-worker2-workload","namespace":"intel-power","resourceVersion":"92564361","uid":"bc04cceb-ea00-45f8-9866-7cf98f471c2f"},"spec":{"allCores":true,"name":"shared-worker2-workload","powerNodeSelector":{"kubernetes.io/hostname":"worker2"},"powerProfile":"shared","reservedCPUs":[0,1]}}
    creationTimestamp: "2022-09-05T13:10:16Z"
    generation: 1
    name: shared-worker2-workload
    namespace: intel-power
    resourceVersion: "146187718"
    uid: ea9e5607-908d-4e8c-b159-bcd747479aea
  spec:
    allCores: true
    name: shared-worker2-workload
    powerNodeSelector:
      kubernetes.io/hostname: worker2
    powerProfile: shared
    reservedCPUs:
    - 0
    - 1
kind: List
metadata:
  resourceVersion: ""
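
The rough sequence used to observe this (pod name here is illustrative): delete one of the stress-ng pods and re-read its PowerWorkload, e.g.

kubectl delete pod stress-ng-performance -n intel-power
kubectl get powerworkloads performance-worker2 -n intel-power -o yaml

The cpuIds 1-12 listed above for performance-worker2 stay in the spec even though no pod is using them any more.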

Cores on the tested node are also stuck in the EPP configured for the PowerWorkload, which in this example is performance.

[core@worker2 ~]$ cat /sys/devices/system/cpu/cpu{0..40}/cpufreq/energy_performance_preference
power
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
power
patricia-cahill commented 1 year ago

Apologies for the late reply, an oversight on our behalf. Is this still an issue? If so, can you send on more details? Also, maybe try the latest release v2.1.0 and see if you are getting the same result!
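
If it helps, a generic way to confirm which operator version is actually deployed (assuming the default intel-power namespace; deployment names can differ depending on how it was installed):

kubectl get deployments -n intel-power -o jsonpath='{.items[*].spec.template.spec.containers[*].image}'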

patricia-cahill commented 1 year ago

This issue has been addressed and resolved in the latest release.