FeralInteractive / gamemode

Optimise Linux system performance on demand
BSD 3-Clause "New" or "Revised" License

Alder Lake / big.LITTLE optimisation #352

Open Dal78 opened 2 years ago

Dal78 commented 2 years ago

Is there any chance of some big.LITTLE-style core steering for Alder Lake?

GameMode processes could get the P cores, along with some designated kernel elements that are key to performance, and the rest could be shunted to the E cores.

This may help optimise per-core boost, making higher clocks more likely in lightly threaded games.

Currently, nearly every game defaults to all cores.

Thoughts?

mdiluz commented 2 years ago

I can see this being a good idea if the kernel doesn't do its job properly (or it's decided user space programs should figure things out themselves). That's kind of the same issue that triggered me to make this thing in the first place!

Calinou commented 2 years ago

Judging by the Phoronix P-State governor benchmark on an i9-12900K, using the performance governor with a recent enough kernel should already do the trick. There may be slight gains from manual core parking, but it will likely require per-game tweaking to be effective.
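
To check which governor each CPU is actually using, the standard cpufreq sysfs files can be read directly. The following is a minimal sketch, assuming a Linux system that exposes `scaling_governor` under `/sys/devices/system/cpu`; the function names `read_governors` and `cpus_not_using` are made up for illustration and are not part of GameMode.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_id: governor} for every CPU exposing cpufreq in sysfs."""
    governors = {}
    # "cpu[0-9]*" skips sibling directories such as "cpufreq" and "cpuidle".
    for gov_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        cpu_id = int(gov_file.parent.parent.name[3:])  # "cpu12" -> 12
        governors[cpu_id] = gov_file.read_text().strip()
    return governors


def cpus_not_using(governors, wanted="performance"):
    """List the CPU ids whose governor differs from the wanted one."""
    return sorted(cpu for cpu, gov in governors.items() if gov != wanted)


if __name__ == "__main__":
    govs = read_governors()
    print("governors in use:", sorted(set(govs.values())))
    print("CPUs not on 'performance':", cpus_not_using(govs))
```

An empty second list means every CPU that exposes cpufreq is already on the performance governor.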

afayaz-feral commented 11 months ago

#416 should do this now, using frequency heuristics to distinguish P cores from E cores. It is merged in the 1.8 release; give it a try and let us know.
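
As a rough illustration of what a frequency heuristic can look like, the sketch below groups CPUs by the `cpuinfo_max_freq` value in sysfs and treats the group with the highest maximum as the fast cores. This is an assumption-laden sketch, not GameMode's actual implementation: the function names are hypothetical, and the "highest value wins" rule is a simplification that would misclassify chips where a few favoured P cores boost higher than the other P cores.

```python
from collections import defaultdict
from pathlib import Path


def read_max_freqs(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_id: cpuinfo_max_freq in kHz} read from sysfs."""
    freqs = {}
    for f in Path(cpu_root).glob("cpu[0-9]*/cpufreq/cpuinfo_max_freq"):
        freqs[int(f.parent.parent.name[3:])] = int(f.read_text())
    return freqs


def split_by_max_freq(freqs):
    """Split CPUs into (fast, slow) lists by maximum frequency.

    Returns (all_cpus, []) when every CPU reports the same maximum,
    i.e. the frequency data gives no hint of a hybrid layout.
    """
    by_freq = defaultdict(list)
    for cpu, khz in freqs.items():
        by_freq[khz].append(cpu)
    if len(by_freq) <= 1:
        return sorted(freqs), []
    top = max(by_freq)
    fast = sorted(by_freq[top])
    slow = sorted(cpu for khz, cpus in by_freq.items() if khz != top for cpu in cpus)
    return fast, slow


if __name__ == "__main__":
    fast, slow = split_by_max_freq(read_max_freqs())
    print("fast CPUs:", fast)
    print("slow CPUs:", slow)
```

If the second list comes back empty, the sysfs frequencies are uniform and this kind of heuristic has nothing to work with.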

Mikaka27 commented 8 months ago

It doesn't work for me on a 13600K. This is what I get:

Feb 27 19:44:07 stacjonarny gamemoded[68856]: cpu L3 cache was uniform, this is not a x3D with multiple chiplets
Feb 27 19:44:07 stacjonarny gamemoded[68856]: cpu frequency was uniform, this is not a big.LITTLE type of system
Feb 27 19:44:07 stacjonarny gamemoded[68856]: I can find no reason to perform core pinning on this system!
~ >>> lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  20
  On-line CPU(s) list:   0-19
Vendor ID:               GenuineIntel
  Model name:            13th Gen Intel(R) Core(TM) i5-13600K
    CPU family:          6
    Model:               183
    Thread(s) per core:  2
    Core(s) per socket:  14
    Socket(s):           1
    Stepping:            1
    CPU(s) scaling MHz:  59%
    CPU max MHz:         5100,0000
    CPU min MHz:         800,0000
    BogoMIPS:            6991,00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl
                         vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
                         rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect user_shstk avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req hfi vnmi umip pku ospke waitpkg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear
                         serialize pconfig arch_lbr ibt flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   544 KiB (14 instances)
  L1i:                   704 KiB (14 instances)
  L2:                    20 MiB (8 instances)
  L3:                    24 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-19
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected
~ >>> lscpu --all --extended
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ       MHZ
  0    0      0    0 0:0:0:0          yes 5100,0000 800,0000 1086,9570
  1    0      0    0 0:0:0:0          yes 5100,0000 800,0000 1097,8669
  2    0      0    1 4:4:1:0          yes 5100,0000 800,0000  947,1510
  3    0      0    1 4:4:1:0          yes 5100,0000 800,0000  800,0000
  4    0      0    2 8:8:2:0          yes 5100,0000 800,0000  800,0000
  5    0      0    2 8:8:2:0          yes 5100,0000 800,0000  800,0000
  6    0      0    3 12:12:3:0        yes 5100,0000 800,0000  940,7350
  7    0      0    3 12:12:3:0        yes 5100,0000 800,0000  800,1910
  8    0      0    4 16:16:4:0        yes 5100,0000 800,0000 1100,0031
  9    0      0    4 16:16:4:0        yes 5100,0000 800,0000 1099,2960
 10    0      0    5 20:20:5:0        yes 5100,0000 800,0000  800,0000
 11    0      0    5 20:20:5:0        yes 5100,0000 800,0000  800,6120
 12    0      0    6 24:24:6:0        yes 3900,0000 800,0000  799,2260
 13    0      0    7 25:25:6:0        yes 3900,0000 800,0000  799,9380
 14    0      0    8 26:26:6:0        yes 3900,0000 800,0000  800,1120
 15    0      0    9 27:27:6:0        yes 3900,0000 800,0000  800,0000
 16    0      0   10 28:28:7:0        yes 3900,0000 800,0000  800,0000
 17    0      0   11 29:29:7:0        yes 3900,0000 800,0000  799,9980
 18    0      0   12 30:30:7:0        yes 3900,0000 800,0000  799,8760
 19    0      0   13 31:31:7:0        yes 3900,0000 800,0000  800,0000
~ >>> cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq
5100000
5100000
5100000
5100000
5100000
5100000
5100000
5100000
5100000
5100000
5100000
5100000
3900000
3900000
3900000
3900000
3900000
3900000
3900000
3900000
~ >>> gamemoded --version
gamemode version: v1.8.1

The OS is current Manjaro. Let me know what else to add here; I'm not sure what is happening.

Sorry for the edit spam. In my case, setting pin_cores manually works fine. I've set it as:

[cpu]
pin_cores=0-11
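
In the `lscpu --all --extended` output above, CPUs 0-11 are the twelve hardware threads of the six P cores (MAXMHZ 5100) and CPUs 12-19 are the eight E cores (MAXMHZ 3900), so `0-11` selects exactly the P-core threads. The sketch below shows how a range like this expands into CPU ids and collapses back again. It assumes the common Linux "cpu list" form of comma-separated ids and ranges; `parse_cpu_list` and `format_cpu_list` are illustrative helpers, not GameMode's own parser, and the exact syntax GameMode accepts may differ.

```python
import os


def parse_cpu_list(spec):
    """Expand a CPU list such as "0-3,8,10-11" into a sorted list of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-", 1))
            if lo > hi:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def format_cpu_list(cpus):
    """Collapse CPU ids into compact range form, e.g. [0, 1, 2, 5] -> "0-2,5"."""
    runs = []
    for cpu in sorted(set(cpus)):
        if runs and cpu == runs[-1][1] + 1:
            runs[-1][1] = cpu  # extend the current run
        else:
            runs.append([cpu, cpu])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


if __name__ == "__main__":
    p_threads = parse_cpu_list("0-11")
    print(len(p_threads), "CPUs:", format_cpu_list(p_threads))
    # os.sched_getaffinity is Linux-only; it reports the CPUs this process may run on.
    if hasattr(os, "sched_getaffinity"):
        print("current affinity:", format_cpu_list(os.sched_getaffinity(0)))
```

Run from inside a game process (or against its PID instead of `0`), the affinity line gives a quick way to confirm that the pinning actually took effect.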