ovn-org / ovn-kubernetes

A robust Kubernetes networking platform
https://ovn-kubernetes.io/
Apache License 2.0
767 stars 333 forks source link

ovspinning: Set affinity of each thread #4470

Open zeeke opened 3 days ago

zeeke commented 3 days ago

OVS CPU pinning ensures the CPU consumed by the processes ovsdb-server and ovs-vswitchd are the same as the OVNkube container. ovs-vswitchd process has threads and calling SchedSetaffinity() on the main PID only does not guarantee that the already spawned threads inherit that affinity. The following is an example of such a situation:

sh-4.4# taskset -apc $(pgrep ovs-vswitchd)
pid 4059's current affinity list: 0-2,6,9,15-66,69,70,73,79-127
pid 4335's current affinity list: 0-2,32,33,64-66,96,97
pid 2715051's current affinity list: 0-2,5,6,15-66,69,70,73,79-127
pid 2715052's current affinity list: 0-2,5,6,15-66,69,70,73,79-127
pid 2715053's current affinity list: 0-2,5,6,15-66,69,70,73,79-127
...

This changes makes the CPU align routine loop over all the threads of the given PID and invoke SchedSetaffinity on all of them.

To test these changes, a simple sleep 10 process is not enough, as it doesn't spawn any thread. A go version of the sleep is therefore provided for testing purpose, in the form of a simple main.

What this PR does and why is it needed

Which issue(s) this PR fixes

Special notes for reviewers

How to verify it

Details to documentation updates

Description for the changelog

Does this PR introduce a user-facing change?

cc @MarSik, @ricky-rav

coveralls commented 3 days ago

Coverage Status

Changes unknown when pulling 6135c21635dbe3c9aae19962629922140ed36c10 on zeeke:OCPBUGS-35347 into on ovn-org:master.

MarSik commented 3 days ago

Just one concurrency related comment, but otherwise the logic looks good to me. I would add the bug id to the title, but I see OVN-k is not following that pattern.

zeeke commented 2 days ago

Just one concurrency related comment, but otherwise the logic looks good to me. I would add the bug id to the title, but I see OVN-k is not following that pattern.

Yes, I'll take care of linking the issue when this fix lands downstream

coveralls commented 2 days ago

Coverage Status

Changes unknown when pulling 91400d27b018937e3231c6aa6eb70f6caef07b45 on zeeke:OCPBUGS-35347 into on ovn-org:master.

ricky-rav commented 18 hours ago

/lgtm Thanks, @zeeke!