irqbalance/irqbalance
The irqbalance source tree: http://irqbalance.github.io/irqbalance/

allow policyscript to include/exclude CPUs #289

Open · phemmer opened this issue 8 months ago

phemmer commented 8 months ago

Currently, irqbalance has the environment variables IRQBALANCE_BANNED_CPUS (and IRQBALANCE_BANNED_CPULIST), which allow excluding CPUs from all IRQ assignment. It would be nice if we could be more granular and apply this only to specific IRQs.
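
For reference, the existing global control looks like this (a minimal sketch; the 64-127 range and the foreground invocation are just for illustration):

# Ban CPUs 64-127 from placement of every IRQ irqbalance manages.
# IRQBALANCE_BANNED_CPUS takes the equivalent hex bitmask instead of a cpulist.
IRQBALANCE_BANNED_CPULIST=64-127 irqbalance --foreground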

I currently have an issue with a driver when its IRQs get assigned to CPUs >= 64. While this is a problem with the driver, I wanted to work around it temporarily by excluding CPUs >= 64 from being used for those IRQs.

Regarding whether this should be a whitelist or a blacklist: matching the IRQBALANCE_BANNED_CPUS functionality would mean a blacklist. However, in my case not all of my systems have the same CPU count, so with a blacklist, to avoid having to count the CPUs and construct the appropriate mask, I have to provide a mask big enough to cover all the different systems' hardware. This does work, since irqbalance seems to ignore the oversized mask bits, but a whitelist would be more succinct. A whitelist would also be more in line with the existing /proc/irq/$irq/smp_affinity interface provided by the kernel.
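
For comparison, a quick sketch of that kernel interface (IRQ 42 is a placeholder; writes require root):

# smp_affinity is a whitelist: a hex bitmask of the CPUs the IRQ may run on.
# smp_affinity_list is the same setting in cpulist form.
cat /proc/irq/42/smp_affinity_list
echo 0-63 > /proc/irq/42/smp_affinity_list   # restrict IRQ 42 to CPUs 0-63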

So ultimately the request is to add support for an instruction, emitted by the policy script, that whitelists or blacklists (whichever is decided) CPUs for the specific IRQ the script was invoked for.
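
For illustration only, the emitted instruction might look something like this; banned_cpulist, the IRQ numbers, and the 64-127 range are all made up, and the key does not exist in irqbalance today:

#!/bin/sh
# Hypothetical sketch: banned_cpulist is the *proposed* per-IRQ instruction,
# not an existing irqbalance key. $2 is the IRQ number the script was run for.
case "$2" in
    105|106|107) echo "banned_cpulist=64-127" ;;   # keep these IRQs off CPUs >= 64
esac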

liuchao173 commented 6 months ago

If CPUs >= 64 are in another NUMA node, you can use the numa_node setting in your policy script to steer the driver's IRQs to a specific NUMA node. Or you can use ban=true in your policy script and manually set the IRQs' smp_affinity once.
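
For example, a minimal policy script along those lines (the IRQ numbers are placeholders; irqbalance passes the sysfs device path and the IRQ number as the script's arguments):

#!/bin/sh
# Policy script sketch: $1 = sysfs device path, $2 = IRQ number.
case "$2" in
    105|106|107) echo "numa_node=0" ;;   # steer these IRQs toward node 0's CPUs
    120)         echo "ban=true"   ;;    # or leave this one for manual pinning
esac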

nhorman commented 6 months ago

I'm not sure I see the difficulty in computing a runtime blacklist. You should be able to do that with two lines in your policy script:

# One bit per CPU, highest CPU first, so CPU 0 lands in the least significant bit (1 = CPU below the 24-CPU boundary).
BINMASK=$(for i in $(awk '/^processor/ {print $3}' /proc/cpuinfo | sort -rn); do if [ "$i" -ge 24 ]; then echo -n 0; else echo -n 1; fi; done)
# Interpret as base 2 and print as hex (shell arithmetic is 64-bit, so >64 CPUs would need chunking).
printf "%x\n" "$((2#$BINMASK))"

This is adjusted for illustration, of course, but it should work for any boundary you want.

Additionally, as @liuchao173 noted, you can also just return ban=true and set the affinity manually for the IRQ in question.

phemmer commented 6 months ago

If CPUs >= 64 are in another NUMA node, you can use the numa_node setting in your policy script to steer the driver's IRQs to a specific NUMA node.

This is not the case.

Or you can use ban=true in your policy script and manually set the IRQs' smp_affinity once.

On one system I have 336 interrupts that would have to be banned, and thus manually set. That means strategically placing these interrupts so that they are evenly distributed; doing this manually is extremely difficult, and avoiding it is the whole point of using irqbalance.
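
For scale, the manual one-shot placement would look something like this sketch (eth0 and the 64-CPU cutoff are assumptions; run as root):

# Round-robin the driver's IRQs across CPUs 0-63, once at boot.
n=0
for irq in $(awk -F: '/eth0/ {gsub(/ /, "", $1); print $1}' /proc/interrupts); do
    echo "$((n % 64))" > "/proc/irq/$irq/smp_affinity_list"
    n=$((n + 1))
done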

I'm not sure I see the difficulty in computing a runtime blacklist. You should be able to do that with two lines in your policy script:

I'm not following your example script. It outputs a bare number, not a key=value pair, and the irqbalance man page mentions no support for this. Is this some undocumented feature?

nhorman commented 6 months ago

It's not the entire script; I only meant to demonstrate that building a custom mask at run time, based on the number of CPUs, shouldn't be particularly difficult, regardless of whether we go with whitelists or blacklists. Although, looking at it, I could have sworn I added a key to the policy script that allowed for a per-irq banning mask, but I don't see it now. Let me try to dig that up and apply it.

phemmer commented 1 month ago

Was this implemented? The issue says "closed this as completed", but I don't see any commits adding the functionality, and the documentation doesn't show anything.

nhorman commented 1 month ago

No. I had assumed, from your lack of response over the last few months, that my suggestion on creating a custom mask worked for you, or that, if you wanted the functionality, you would submit a PR for it.

phemmer commented 1 month ago

I'm really confused. You seemed to acknowledge that the needed functionality is missing:

I could have sworn I added a key to the policy script that allowed for a per-irq banning mask, but I don't see it now.

And that you probably had the code around somewhere:

Let me try to dig that up and apply it.

nhorman commented 1 month ago

Oh, sorry, you're right, I didn't update you on that. Apologies. I looked, and it turns out I never implemented this, or if I did, it's been lost to time. That said, I'm not in a position to (re)create that work at the moment. Are you interested in writing a PR?

phemmer commented 1 month ago

Maybe. C isn't my strongest language, but I can usually get by. However, we are in the middle of upgrading both the systems that have the hardware causing our issue, and to kernel 6.1, which has some major changes to XDP and seems to have solved the issue. Also, I'm about to head out on vacation. So, to be completely honest, while it's possible I may tackle it, it's probably not likely. If you want to close this, that's fine. I was just confused because, had it been addressed, I definitely would have used the functionality instead of the hack we currently have in place.

nhorman commented 1 month ago

Yeah, I was cleaning up issues and totally missed my last comment on this one. I'll leave it open; perhaps I'll get to it in a bit.