Levovar closed this 4 years ago
I can probably only test it next week, because I'm on holiday from Wednesday to Monday.
@TimoLindqvist out of WIP because it works! :) On a 2-socket system with the PF on the first NUMA node (node 0):

```
[root@compute-1 cloudadmin]# cat /sys/class/net/ens2f1/device/numa_node
0
```
and with CPUs distributed between the nodes:

```
[cloudadmin@compute-1 ~]$ lscpu -p=cpu,node
0,0
1,0
2,0
3,0
4,0
5,0
6,0
7,0
8,0
9,0
10,0
11,0
12,0
13,0
14,1
15,1
16,1
17,1
18,1
19,1
20,1
21,1
22,1
23,1
24,1
25,1
26,1
27,1
28,0
29,0
30,0
31,0
32,0
33,0
34,0
35,0
36,0
37,0
38,0
39,0
40,0
41,0
42,1
43,1
44,1
45,1
46,1
47,1
48,1
49,1
50,1
51,1
52,1
53,1
54,1
55,1
```
I instantiated the following Pod a couple of times:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sriov-pod
  labels:
    env: test
  annotations:
    danm.k8s.io/interfaces: |
      [
        {"clusterNetwork":"default", "ip":"dynamic"},
        {"clusterNetwork":"sriov", "ip":"none"}
      ]
spec:
  containers:
```
And the 5 exclusive CPU cores always come from the first NUMA node:

```
[cloudadmin@compute-1 ~]$ docker exec -ti b53fe83a4a5c sh
/ # cat /proc/1/status | grep -i cpus_allowed
Cpus_allowed:       000140,00001c00
Cpus_allowed_list:  10-12,38,40
```
We need Kubernetes 1.17 for this to work (I tested with release candidate 1), but from our side we are NUMA capable :)
README updated, the PR is final from my side.
Looks basically ok.
Should we have some unit tests? I mean, testing different topologies with real hardware is a bit problematic, but we could capture the topology info as lscpu output and create unit tests for those (something like the sketch below)?
Could we support earlier k8s versions (at least 1.16) together with the latest (1.17) that supports Topology Manager? With older versions we of course don't get the NUMA alignment functionality.
@TimoLindqvist I updated the README, pls re-check!
UT: at some point we are planning to, but I don't have capacity for it now, and Tamás is still doing product work.
K8s compatibility: no, it unfortunately cannot work with 1.16, because it needs the enhancement I asked for from the community on top of the alpha :) https://github.com/kubernetes/kubernetes/pull/83492. Based on the release notes I checked, it is only part of 1.17.
Ok, this is ready to be merged then. I'm just thinking about support for older k8s versions in case bug fixes or some other features are needed but 1.17 is not yet an option. A separate branch then?
I mean, Pooler itself works with all K8s versions; only your exclusive CPU pools won't be topology aligned. I don't see a reason for branching, the implementation IMO is backward compatible.
To clarify: I had a 4-node setup, and I only updated the worker to K8s 1.17 :) My other 3 nodes were on 1.16. The DP works perfectly on all nodes! It's just that prior to 1.17, Topology Manager won't even invoke topology alignment, because Pods asking for CPU-Pooler managed resources never belong to the Guaranteed QoS class, as they don't ask for default and exclusive CPU resources at the same time.
Solves https://github.com/nokia/CPU-Pooler/issues/24.
And here we are, finally coming full circle. The whole reason for "exploiting" the DPAPI from the very beginning now bears fruit: with a mere ~100 lines of code we introduce native CPU socket alignment capability to CPU-Pooler, simply by implementing the Topology Manager related DPAPI changes introduced with K8s 1.16.
We only report the socket ID for devices belonging to the exclusive pool. The socket information is parsed from the output of the `lscpu -p="cpu,node" -J` command.