linrunner / TLP

TLP - Optimize Linux Laptop Battery Life
https://linrunner.de/tlp
GNU General Public License v2.0

Option to set IRQ affinity #736

Closed vient closed 4 hours ago

vient commented 3 months ago

Is your feature request related to a problem? Please describe.

I have been watching my system's battery life for some time. Adding nohz_full and irq_nocbs changed things a little, but what really improved battery life was moving IRQs to E-cores.

I have a laptop with a Core Ultra 155H, which features two super-efficient (LPE) cores in addition to the usual P- and E-cores on new Intel CPUs. These two are CPUs 20 and 21, so I ran for f in /proc/irq/*/smp_affinity_list; do echo 20-21 > "$f"; done to move as many IRQs as possible to the LPE cores. As a result, battery life seemingly improved by 20+%. I can't back it up with statistics yet, but what I saw over several hours was pretty exciting.
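A slightly more defensive version of that one-liner could look like the sketch below. It is only an illustration: move_irqs is a hypothetical helper name, and the optional second argument exists so the function can be tried against a scratch directory instead of /proc/irq. Writes that fail are counted rather than treated as errors, since some IRQs decline to move.

```shell
# Hypothetical helper (not part of TLP): try to pin every IRQ to a CPU list.
# $1 = CPU list such as "20-21"
# $2 = IRQ directory, defaults to /proc/irq (needs root there)
move_irqs() {
    cpulist="$1"
    irq_dir="${2:-/proc/irq}"
    moved=0
    failed=0
    for f in "$irq_dir"/*/smp_affinity_list; do
        [ -e "$f" ] || continue
        # A failed write is expected for IRQs that cannot be moved
        if { echo "$cpulist" > "$f"; } 2>/dev/null; then
            moved=$((moved + 1))
        else
            failed=$((failed + 1))
        fi
    done
    echo "moved=$moved failed=$failed"
}
```

Calling move_irqs 20-21 as root would then report how many IRQs accepted the new affinity and how many refused it.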

Describe the solution you'd like

My use case: moving interrupts to energy-efficient cores on battery mode.

In its simplest form, I see this as one new config variable, IRQ_AFFINITY_LIST_BAT, which in my case would be 20-21. When tlp detects battery mode, it moves all IRQs to these cores if the variable is defined. When AC is plugged in, the default affinity is restored (need to think about how to do this; can you just write 0 there?).

Should IRQ_AFFINITY_LIST_AC also be available? Personally, I don't have a use case for it if there is a simple method to automatically move IRQs back to all CPUs.

Is some kind of IRQ blacklist needed? My system works fine after trying to move all interrupts. Sure, some of them just refused to move, which I guess means a blacklist is not needed.
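On the question of writing 0: smp_affinity_list takes a CPU list, so writing 0 would most likely pin an IRQ to CPU 0 rather than reset it. The sketch below shows one possible alternative, under the assumption that "default" means every online CPU as listed in /sys/devices/system/cpu/online. The function name irq_affinity_target is hypothetical, and IRQ_AFFINITY_LIST_BAT is only the variable proposed here, not an existing TLP option.

```shell
# Sketch: print the CPU list that should be written for a given power source.
# $1 = "bat" or "ac"
# $2 = file holding the online CPU list (defaults to the sysfs path)
irq_affinity_target() {
    mode="$1"
    online_file="${2:-/sys/devices/system/cpu/online}"
    case "$mode" in
        # Empty output means "variable not set, leave IRQs alone"
        bat) printf '%s\n' "${IRQ_AFFINITY_LIST_BAT:-}" ;;
        # Assumption: restoring the default means "all online CPUs", e.g. 0-21
        ac)  cat "$online_file" ;;
        *)   return 1 ;;
    esac
}
```

A caller would write the printed list to each smp_affinity_list file and skip the step entirely when the output is empty.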

Describe alternatives you've considered

The only automatic alternative I could think of is moving IRQs at boot with a small systemd oneshot service. It would not be able to move IRQs back on AC, though.

linrunner commented 3 months ago

Hi,

first of all, I am of the opinion that the Intel developers should provide reasonable, energy-saving defaults in the kernel. But of course that doesn't help in the short term.

Since I don't own any hardware with P- and E-Cores, I couldn't test it myself, which makes the development quite tedious. However, I would be open to a well-tested pull request from your side.

I think it would be best if the code could determine the E cores automatically so that the user doesn't have to do it themselves. Then you would only need a distinction Y/N.

Incidentally, I do consider both an _ON_BAT and an _ON_AC parameter to be necessary. There will always be users who want the higher performance for AC.

ps. We also need a way for tlp-stat -p to display the status in a concise form.

vient commented 3 months ago

> There will always be users who want the higher performance for AC.

The default affinity "all" pretty much does that, but if we add an IRQ affinity setting for battery, it makes sense to add one for AC as well, for people like audio engineers or gamers who may want to move interrupts away from some isolated cores.

> I think it would be best if the code could determine the E cores automatically so that the user doesn't have to do it themselves.

True. How do you see it: a few special values in addition to normal affinity lists as they are described in the kernel docs? Like "all", "p-cores", "e-cores" and "lpe-cores", which tlp would replace with "0-21", "0-11", "12-19" and "20-21" respectively on my machine.

"5,14-15,lpe-cores", for example, would then be passed as "5,14-15,20-21". What do you think?

Hope I won't need to delve into Intel Thread Director for this one, it's not even in the mainline kernel yet.
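The alias idea above could be sketched roughly as follows. The mapping is hard-coded to the 155H values mentioned in this comment purely for illustration; real code would have to detect it, and expand_cpu_aliases is a hypothetical name.

```shell
# Sketch: replace alias names in a comma-separated CPU list.
# The body runs in a subshell so the IFS and globbing changes stay local.
expand_cpu_aliases() (
    spec="$1"
    out=""
    IFS=','
    set -f
    for item in $spec; do
        case "$item" in
            # Hypothetical values for the 155H discussed in this thread
            all)       item="0-21" ;;
            p-cores)   item="0-11" ;;
            e-cores)   item="12-19" ;;
            lpe-cores) item="20-21" ;;
        esac
        # Plain numbers and ranges pass through unchanged
        out="${out:+$out,}$item"
    done
    printf '%s\n' "$out"
)
```

With this mapping, expand_cpu_aliases "5,14-15,lpe-cores" prints 5,14-15,20-21, which is the form smp_affinity_list accepts.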

Also, just for the lols, my 155H CPU actually has more like 4 classes of cores: P-cores 0-1 (0-3 logical) have a higher turbo frequency than the rest. I guess we will figure out what to do with this fact along the way.

vient commented 2 months ago

Started to implement it here: https://github.com/linrunner/TLP/compare/main...vient:TLP:add-irq-affinity-option?expand=1. Currently it supports basic affinity lists (without any preprocessing, so no CPU classes yet) and include/exclude lists. It seems to work fine on my machine (I've set IRQ_AFFINITY_LIST_ON_BAT=20-21 and IRQ_AFFINITY_EXCLUDE="204 225").

I would appreciate it if you took a quick look and confirmed that I'm not doing anything terribly wrong there 🙂
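For illustration only, an exclude check for a space-separated list like the one above might look like this. It is not taken from the linked branch, and irq_is_excluded is a hypothetical name.

```shell
# Sketch: succeed if IRQ number $1 appears in IRQ_AFFINITY_EXCLUDE.
# Padding both sides with spaces keeps "20" from matching "204".
irq_is_excluded() {
    case " ${IRQ_AFFINITY_EXCLUDE:-} " in
        *" $1 "*) return 0 ;;
    esac
    return 1
}
```

In a loop over /proc/irq/*/smp_affinity_list, the IRQ number is the name of the parent directory, so the loop could skip any file for which irq_is_excluded succeeds.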


Now, supporting aliases like "low-power" is a completely different piece of work. My current plan is to find where lscpu -e gets its MAXMHZ info and split cores into tiers based on that. For example, my system shows:

$ lscpu -e
CPU NODE SOCKET CORE ONLINE    MAXMHZ   MINMHZ       MHZ
  0    0      0    0    yes 4800.0000 400.0000 1989.3700
  1    0      0    0    yes 4800.0000 400.0000  400.0000
  2    0      0    1    yes 4800.0000 400.0000  400.0000
  3    0      0    1    yes 4800.0000 400.0000  400.0000
  4    0      0    2    yes 4600.0000 400.0000  400.0000
  5    0      0    2    yes 4600.0000 400.0000  400.0000
  6    0      0    3    yes 4600.0000 400.0000  400.0000
  7    0      0    3    yes 4600.0000 400.0000  400.0000
  8    0      0    4    yes 4600.0000 400.0000  400.0000
  9    0      0    4    yes 4600.0000 400.0000  400.0000
 10    0      0    5    yes 4600.0000 400.0000  400.0000
 11    0      0    5    yes 4600.0000 400.0000  400.0000
 12    0      0    6    yes 3800.0000 400.0000  400.0000
 13    0      0    7    yes 3800.0000 400.0000  400.0000
 14    0      0    8    yes 3800.0000 400.0000  400.0000
 15    0      0    9    yes 3800.0000 400.0000  400.0000
 16    0      0   10    yes 3800.0000 400.0000  400.0000
 17    0      0   11    yes 3800.0000 400.0000  400.0000
 18    0      0   12    yes 3800.0000 400.0000  400.0000
 19    0      0   13    yes 3800.0000 400.0000  400.0000
 20    0      0   14    yes 2500.0000 400.0000  400.0000
 21    0      0   15    yes 2500.0000 400.0000  400.0000

which should result in 4 tiers: tier0=0-3, tier1=4-11, tier2=12-19, tier3=20-21. I think tierMin is also needed, as an alias for tier3 in my case.

I have not seen a proper way to identify P, E and LPE cores yet; it may be possible by reading some MSRs.
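The tier split described here could be sketched with sort and awk, assuming the input is one "CPU MAXMHZ" pair per line. The name cpu_tiers is hypothetical, and the sketch only groups by maximum frequency; it does not know anything about P, E or LPE cores as such.

```shell
# Sketch: group CPUs into tiers by maximum frequency, highest first.
# stdin: one "<cpu> <max_freq>" pair per line
# stdout: lines like "tier0=0-3"
cpu_tiers() {
    sort -k2,2nr -k1,1n | awk '
        # Append the current run of consecutive CPUs to the tier list
        function end_range() {
            if (start == "") return
            r = (start == prev) ? start : start "-" prev
            list = (list == "") ? r : list "," r
            start = ""
        }
        # Print the finished tier, if there is one
        function emit_tier() {
            end_range()
            if (tier >= 0) printf "tier%d=%s\n", tier, list
            list = ""
        }
        BEGIN { tier = -1; start = ""; freq = "" }
        {
            if ($2 != freq) { emit_tier(); tier++; freq = $2 }
            else if ($1 != prev + 1) end_range()
            if (start == "") start = $1
            prev = $1
        }
        END { emit_tier() }
    '
}
```

On a real system the input could come from something like lscpu -e=CPU,MAXMHZ | tail -n +2 | cpu_tiers. For the table above, that should yield the four tiers listed, with the last tier serving as tierMin.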

linrunner commented 2 months ago

> True. How do you see it: a few special values in addition to normal affinity lists as they are described in the kernel docs? Like "all", "p-cores", "e-cores" and "lpe-cores", which tlp would replace with "0-21", "0-11", "12-19" and "20-21" respectively on my machine.

Fine if your code does the translation.

> I have not seen a proper way to identify P, E and LPE cores yet; it may be possible by reading some MSRs.

No additional external tools that are not already present in every Linux installation. lscpu from util-linux is fine, I guess. Apart from that, just what a shell script can do.

Please add your code to 10-tlp-func-cpu. I suggest you create a branch apart from main in your fork.

vient commented 2 months ago

For reference, here is how Intel engineers differentiate P- and E-cores in OpenVINO: https://github.com/openvinotoolkit/openvino/blob/releases/2024/0/src/inference/src/os/lin/lin_system_conf.cpp#L693

linrunner commented 12 hours ago

@vient Any progress to report?

vient commented 4 hours ago

No, lost interest