FeralInteractive / gamemode

Optimise Linux system performance on demand
BSD 3-Clause "New" or "Revised" License

Automatic cpu pinning pins programs to only 4 cores on Intel 13900 #498

Open wereii opened 2 months ago

wereii commented 2 months ago

As the title says, on my machine with an Intel 13900KF processor, automatic core pinning (pin_cores=yes, or commented out, i.e. the default) picks CPUs 8-11, which is just 2 cores with 4 threads in total.

# lscpu -e
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ       MHZ
  0    0      0    0 0:0:0:0          yes 5500.0000 800.0000 1100.0551
  1    0      0    0 0:0:0:0          yes 5500.0000 800.0000  800.0000
  2    0      0    1 4:4:1:0          yes 5500.0000 800.0000 1020.2340
  3    0      0    1 4:4:1:0          yes 5500.0000 800.0000 1023.4670
  4    0      0    2 8:8:2:0          yes 5500.0000 800.0000 1100.0000
  5    0      0    2 8:8:2:0          yes 5500.0000 800.0000  800.0000
  6    0      0    3 12:12:3:0        yes 5500.0000 800.0000 1100.0031
  7    0      0    3 12:12:3:0        yes 5500.0000 800.0000  800.0000
  8    0      0    4 16:16:4:0        yes 5800.0000 800.0000  976.1010
  9    0      0    4 16:16:4:0        yes 5800.0000 800.0000  799.0430
 10    0      0    5 20:20:5:0        yes 5800.0000 800.0000 1404.8610
 11    0      0    5 20:20:5:0        yes 5800.0000 800.0000 1502.6730
 12    0      0    6 24:24:6:0        yes 5500.0000 800.0000  801.1090
 13    0      0    6 24:24:6:0        yes 5500.0000 800.0000  800.0000
 14    0      0    7 28:28:7:0        yes 5500.0000 800.0000 1040.9830
 15    0      0    7 28:28:7:0        yes 5500.0000 800.0000 1099.2321
 16    0      0    8 32:32:8:0        yes 4300.0000 800.0000  800.0090
 17    0      0    9 33:33:8:0        yes 4300.0000 800.0000 1991.7040
 18    0      0   10 34:34:8:0        yes 4300.0000 800.0000  860.5860
 19    0      0   11 35:35:8:0        yes 4300.0000 800.0000  800.0000
 20    0      0   12 36:36:9:0        yes 4300.0000 800.0000  800.0350
 21    0      0   13 37:37:9:0        yes 4300.0000 800.0000  800.0000
 22    0      0   14 38:38:9:0        yes 4300.0000 800.0000  800.0000
 23    0      0   15 39:39:9:0        yes 4300.0000 800.0000  800.0000
 24    0      0   16 40:40:10:0       yes 4300.0000 800.0000  800.0000
 25    0      0   17 41:41:10:0       yes 4300.0000 800.0000  800.0000
 26    0      0   18 42:42:10:0       yes 4300.0000 800.0000  800.0000
 27    0      0   19 43:43:10:0       yes 4300.0000 800.0000  800.0000
 28    0      0   20 44:44:11:0       yes 4300.0000 800.0000  800.0000
 29    0      0   21 45:45:11:0       yes 4300.0000 800.0000  800.0000
 30    0      0   22 46:46:11:0       yes 4300.0000 800.0000  800.0000
 31    0      0   23 47:47:11:0       yes 4300.0000 800.0000  800.0000

A quick look through the code and the issues tells me this is currently by design, at least going by the "picking cores with most maxfreq" approach. However, because of the rather atypical max frequency spread even among the P-cores on this processor, it pins games to CPUs 8-11 (cores 4 and 5).
In my case, I noticed very low FPS in Helldivers 2: ~70 FPS with pinning instead of ~140 with no pinning.

I can already see that there is a check that won't try pinning fewer than 4 cores. I guess it's not really feasible to make the automatic pinning algorithm universal across all the possible P/E core combinations, but one idea would be to also log a warning if autopinning selects fewer than X% (let's say 10%) of the total core count?
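
The warning idea could be sketched roughly as below. This is not gamemode code: the function names, the message text, and the choice of what counts as a "core" are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not part of gamemode): true when `pinned` is less
 * than `percent` percent of `total`. Uses integer math to avoid floats. */
static bool pinned_share_too_small(unsigned pinned, unsigned total, unsigned percent)
{
	if (total == 0)
		return false; /* nothing to compare against */
	return (unsigned long long)pinned * 100 < (unsigned long long)total * percent;
}

/* Hypothetical wrapper: prints a warning to stderr and reports whether it did. */
static bool warn_if_pinned_share_small(unsigned pinned, unsigned total, unsigned percent)
{
	if (!pinned_share_too_small(pinned, total, percent))
		return false;
	fprintf(stderr,
	        "warning: core pinning selected only %u of %u cores (below %u%%)\n",
	        pinned, total, percent);
	return true;
}
```

Note that the unit matters on this machine: counting physical cores, 2 of 24 is about 8.3% and a 10% threshold would warn, but counting logical CPUs, 4 of 32 is 12.5% and the same threshold would stay silent.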

HenrikHolst commented 1 month ago

The main question is why only some cores report a max of 5.8 GHz while the other P cores report a max of 5.5 GHz. Are only some cores boostable on the 13900?

edit: OK, so in the current code we use 5% as the safety margin for boost, and that is too small here. 10% works fine, but of course the question is whether that is enough for future CPUs or whether we should do this some other way.

Anyway, a quick fix here is to change line 128 in daemon/gamemode-cpu.c from unsigned long long cutoff = (freq * 5) / 100; to unsigned long long cutoff = (freq * 10) / 100;
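
To see why 5% is too small for the frequencies in the lscpu output above, here is a minimal sketch of the arithmetic. It assumes a CPU is kept when its max frequency is within the cutoff of the highest max frequency; the actual comparison in gamemode-cpu.c may differ, and the function and array names are made up for this example.

```c
#include <stddef.h>

/* Max frequencies in MHz per logical CPU, copied from the lscpu -e output:
 * CPUs 0-7 and 12-15 at 5500, CPUs 8-11 at 5800, CPUs 16-31 at 4300. */
static const unsigned long long lscpu_max_mhz[32] = {
	5500, 5500, 5500, 5500, 5500, 5500, 5500, 5500,
	5800, 5800, 5800, 5800,
	5500, 5500, 5500, 5500,
	4300, 4300, 4300, 4300, 4300, 4300, 4300, 4300,
	4300, 4300, 4300, 4300, 4300, 4300, 4300, 4300,
};

/* Count CPUs whose max frequency lies within `margin_percent` of the highest.
 * ASSUMPTION: "within" means maxfreq + cutoff >= highest. */
static size_t count_within_margin(const unsigned long long *max_freq, size_t n,
                                  unsigned margin_percent)
{
	unsigned long long highest = 0;
	for (size_t i = 0; i < n; i++)
		if (max_freq[i] > highest)
			highest = max_freq[i];

	/* Same shape as the cutoff line quoted above, with a variable margin. */
	unsigned long long cutoff = (highest * margin_percent) / 100;

	size_t kept = 0;
	for (size_t i = 0; i < n; i++)
		if (max_freq[i] + cutoff >= highest)
			kept++;
	return kept;
}
```

With a 5% margin the cutoff is 290 MHz, so the 5500 MHz CPUs miss the 5800 MHz peak by 10 MHz and only 4 CPUs remain. With 10% the cutoff is 580 MHz, which keeps all 16 P-core threads and still excludes the 4300 MHz E-cores.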

wereii commented 1 month ago

Thanks for looking into this.

I can't verify whether the max frequencies listed for this CPU are correct, as not even Intel seems to have published these details anywhere, though I think some posts on the Arch forum did mention that there is variation even between the P-cores. So I am assuming my lscpu output is correct.

HenrikHolst commented 1 month ago

Oh, I have no doubt that it is correct. I'm only puzzled why Intel didn't add info about which cores are P and which are E, instead of leaving us to use the frequency to try to determine which is which.

wereii commented 1 month ago

For what it's worth, in this case all the P-cores are also the only ones with multiple threads, but there indeed does not seem to be a direct indicator for the distinction, at least not in the kernel.

Here is also how inxi shows it:

# inxi -C
CPU:
  Info: 24-core (8-mt/16-st) model: 13th Gen Intel Core i9-13900KF bits: 64
    type: MST AMCP cache: L2: 32 MiB
  Speed (MHz): avg: 1100 min/max: 800/5500:5800:4300
# ...

HenrikHolst commented 1 month ago

Yeah, the P cores have SMT on the 12900, 13900 and 14900, but on the new 245 and 285 they have dropped SMT, so there the P cores have 1 thread just like the E cores.
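
For reference, the thread-count observation could be checked per CPU by reading the kernel's thread_siblings_list topology file, roughly as sketched below. The function names are hypothetical and this is not how gamemode detects cores. As noted above, the heuristic only distinguishes P from E cores on parts where the P cores still have SMT.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the CPUs named in a kernel cpulist string such as "8-9" or "0,16".
 * Returns 0 for empty or malformed input. */
static unsigned count_cpulist(const char *s)
{
	unsigned count = 0;
	while (*s && *s != '\n') {
		char *end;
		unsigned long lo = strtoul(s, &end, 10);
		if (end == s)
			return 0; /* no number where one was expected */
		unsigned long hi = lo;
		s = end;
		if (*s == '-') { /* range "lo-hi" */
			s++;
			hi = strtoul(s, &end, 10);
			if (end == s || hi < lo)
				return 0;
			s = end;
		}
		count += (unsigned)(hi - lo + 1);
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return 0; /* unexpected character */
	}
	return count;
}

/* Hypothetical helper: number of hardware threads sharing a core with `cpu`,
 * or -1 if the sysfs file cannot be read. */
static int cpu_thread_count(int cpu)
{
	char path[128], buf[256];
	snprintf(path, sizeof path,
	         "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list", cpu);
	FILE *f = fopen(path, "r");
	if (!f)
		return -1;
	char *ok = fgets(buf, sizeof buf, f);
	fclose(f);
	return ok ? (int)count_cpulist(buf) : -1;
}
```

On the machine in this issue, one would expect cpu_thread_count to give 2 for CPUs 0-15 and 1 for CPUs 16-31, matching the CORE column of the lscpu output.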

HenrikHolst commented 4 weeks ago

Otherwise, a quicker fix is to simply change pin_cores in gamemode.ini from "yes" to, say, "0-15".