hamadmarri / cacule-cpu-scheduler

The CacULE CPU scheduler is based on an interactivity score mechanism. The interactivity score is inspired by the ULE scheduler (the FreeBSD scheduler).

Public Chat #21

Closed hamadmarri closed 3 years ago

hamadmarri commented 3 years ago

Hi Everyone

Consider this as an open thread to discuss and talk about Cachy/CacULE

I will post some patches for testing too.

We could talk about how to enhance the scheduler.

Thank you.

raykzhao commented 3 years ago

Hi @hamadmarri

CacULE-rdb has been working well on my machines so far.

Two minor things:

  1. It would be better to also make cacule_max_lifetime tunable via sysctl.
  2. It would be better to update the documentation to mention the functions that probably do not work correctly with rdb. So far I have found that the schedutil CPU governor keeps my CPU at the highest frequency most of the time with rdb. From the source code, I suspect that other functions relying on the load balancing statistics are also affected.

hamadmarri commented 3 years ago

Hi @raykzhao

> It is better to also make the cacule_max_lifetime tunable via sysctl.

I will add the sysctl code for max_lifetime soon, thanks for the reminder :+1: .

Regarding the second point, I believe there are many features that do not work with RDB, such as cpu clamp and others. I am not sure exactly which features are affected, but I'll try my best to mention the features that might be affected by RDB in the documentation.

Thank you

SSalekin commented 3 years ago

I wanted to share my geekbench results using CacULE (compared to generic). Thanks @hamadmarri for giving us this gem.

Generic (Ubuntu 20.04.2)

[Image: ubuntu]

After installing Xanmod CacULE

[Image: Screenshot from 2021-02-08 01-13-11]

hamadmarri commented 3 years ago

Hi @SSalekin

Thank you for sharing these results with us. Most of the performance gain comes from the XanMod patches themselves, thanks to XanMod. The CacULE scheduler helps with the responsiveness and latency of the system.

Thank you

mjeveritt commented 3 years ago

Is there any chance this can be backported to 5.4 LTS kernels, or does it depend on features only available in 5.9+ ? I note from upstream that 5.10 will likely go EOL fairly soon, and I believe 5.8/5.9 are also dead-in-the-water now?!

hamadmarri commented 3 years ago

Hi @mjeveritt

It doesn't depend on v5.9. It can be ported to 5.4. I will add a 5.4 patch if I have time.

Thank you

phush0 commented 3 years ago

> Is there any chance this can be backported to 5.4 LTS kernels, or does it depend on features only available in 5.9+ ? I note from upstream that 5.10 will likely go EOL fairly soon, and I believe 5.8/5.9 are also dead-in-the-water now?!

5.8 and 5.9 are EOL. 5.10 will be supported till 2022

phush0 commented 3 years ago

> Hi @hamadmarri
>
> CacULE-rdb has been working well on my machines so far.
>
> Two minor things:
>
>   1. It would be better to also make cacule_max_lifetime tunable via sysctl.
>   2. It would be better to update the documentation to mention the functions that probably do not work correctly with rdb. So far I have found that the schedutil CPU governor keeps my CPU at the highest frequency most of the time with rdb. From the source code, I suspect that other functions relying on the load balancing statistics are also affected.

schedutil behaves the same way with BMQ/PDS, so this is not a problem with CacULE alone.

phush0 commented 3 years ago

I am testing CacULE on my old 4-core laptop with an 8706G CPU, and what I have found so far is that the system looks very smooth and responsive even while compiling a large project such as the kernel. What I don't like is that during gaming, from time to time, one of the cores gets occupied with kernel load and there is heavy stutter while that lasts. It is strange because the CPU is not loaded to 100% but only to about 30%; then suddenly all load disappears from the other cores (they are near 0%) and one core is pegged at almost 100% with kernel load alone. I have linux-tkg compiled with voluntary preemption and the cacule-rdb patch.

hamadmarri commented 3 years ago

Hi @phush0

I wouldn't recommend RDB when there is background load (kernel compilation) plus gaming. RDB is still an experimental load balancer. CacULE (without RDB) is well suited and well tested on XanMod. I am not sure how the TKG patches work with CacULE. I suggest that you try CacULE without RDB first. If the problem still happens, then try the xanmod-cacule version (without RDB). I found that RDB performs better on the mainline kernel; you could give it a try.

Thank you

xuanruiqi commented 3 years ago

Hi Hamad! I wonder if you are already working on CacULE for 5.11? Thanks!

hamadmarri commented 3 years ago

Hi @xuanruiqi

For the time being, I am not working on CacULE. I have been looking for a job for about 7 months, with no luck so far.

xuanruiqi commented 3 years ago

I see! Since I'm a heavy user of Cacule, I'll perhaps try to port this to 5.11. Hopefully it won't be too hard.

raykzhao commented 3 years ago

Hi @hamadmarri @xuanruiqi

I'm currently using CacULE-rdb with the 5.11 kernel. It seems that after merging the upstream commits 3aef1551e942860a3881087171ef0cd45f6ebda7 and dc824eb898534cd8e34582874dae3bb7cf2fa008 in `select_task_rq_fair`, the CacULE-rdb 5.10 patch can be applied to 5.11.

xuanruiqi commented 3 years ago

@raykzhao If you could produce a patch and create a PR that would be awesome!

mjeveritt commented 3 years ago

> Hi @mjeveritt
>
> It doesn't depend on v5.9. It can be ported to 5.4. I will add a 5.4 patch if I have time.
>
> Thank you

I have done a trial merge of patches to 5.4 LTS series (using kernel.org git repo, branch 5.4.y). I will aim to PR this over the weekend.

phush0 commented 3 years ago

> Hi @phush0
>
> I wouldn't recommend RDB when there is background load (kernel compilation) plus gaming. RDB is still an experimental load balancer. CacULE (without RDB) is well suited and well tested on XanMod. I am not sure how the TKG patches work with CacULE. I suggest that you try CacULE without RDB first. If the problem still happens, then try the xanmod-cacule version (without RDB). I found that RDB performs better on the mainline kernel; you could give it a try.
>
> Thank you

Removing RDB fixed my problem. Thank you very much, smoothest system ever.

hamadmarri commented 3 years ago

> I have done a trial merge of patches to 5.4 LTS series (using kernel.org git repo, branch 5.4.y). I will aim to PR this over the weekend.

Cacule v5.4: https://github.com/hamadmarri/cacule-cpu-scheduler/tree/master/patches/CacULE/v5.4

Cacule v5.3 + (v5.3 with opensuse Leap 15.2 kernel patches): https://github.com/hamadmarri/cacule-cpu-scheduler/tree/master/patches/CacULE/v5.3

mjeveritt commented 3 years ago

Thanks - will give this a spin! :smiley: :+1:

xalt7x commented 3 years ago

Hi @hamadmarri I have a question about the CFS autogroup feature. Does it still need to be disabled for a non-RDB build? I ask because the latest patches point specifically at CACULE_RDB as the option incompatible with CGROUP_SCHED & SCHED_AUTOGROUP. Even Xanmod has both options enabled at the moment. Also, the 5.10 folder is confusing. You have a few patches there. Reading this topic and looking at the dates, I figured out that the latest one is "cacule5.10-rdb.patch", but we need to disable CONFIG_RDB. If so, I guess it would be better to delete "cacule5.10-r2.patch", disable CACULE_RDB by default in "cacule5.10-rdb.patch", and rename it to "cacule-5.10.patch" (so it matches the naming of the 5.3 and 5.4 patches). Thanks!

hamadmarri commented 3 years ago

Hi @Alt37

OK, I will fix the naming issue and disable rdb by default.

CacULE without rdb has full feature support. It works with autogroup and cgroup_sched. My personal preference is to disable both as an optimization.

Thank you
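The incompatibility discussed above (CACULE_RDB versus CGROUP_SCHED and SCHED_AUTOGROUP) can be checked in a kernel `.config` before building. This is only a minimal sketch: it assumes the option shows up as `CONFIG_CACULE_RDB` in the config file, and the function name `check_rdb_config` is made up for illustration and is not part of the patches.

```shell
#!/bin/sh
# check_rdb_config CONFIG_FILE
# Hypothetical helper: prints one warning for each option that is enabled
# together with CONFIG_CACULE_RDB (symbol name assumed from the thread).
# Returns 0 when no conflict is found, 1 otherwise.
check_rdb_config() {
    cfg=$1
    conflicts=0
    # Only look for conflicts when RDB itself is built in.
    if grep -q '^CONFIG_CACULE_RDB=y$' "$cfg"; then
        for opt in CONFIG_CGROUP_SCHED CONFIG_SCHED_AUTOGROUP; do
            if grep -q "^${opt}=y\$" "$cfg"; then
                echo "warning: ${opt}=y is set together with CONFIG_CACULE_RDB=y"
                conflicts=1
            fi
        done
    fi
    return "$conflicts"
}

# Example (path is an assumption): check_rdb_config /usr/src/linux/.config
```

For a non-RDB build the function prints nothing and returns 0, which matches the statement that CacULE without rdb works with both options.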

hamadmarri commented 3 years ago

@raykzhao @xuanruiqi

https://github.com/hamadmarri/cacule-cpu-scheduler/tree/master/patches/CacULE/v5.11

hf29h8sh321 commented 3 years ago

CacULE (rdb disabled) v5.10 corrupts sound under heavy system load. I will try with rdb enabled.

Update: CacULE v5.11 with rdb is significantly smoother.

B3HOID commented 3 years ago

Hello, I have a weird issue with the CacULE 5.10rdb 5.11 port for the Xanmod kernel.

Playing Overwatch, I notice that there is a lot of stuttering and the FPS is low overall (30-40), but only with kernel.sched_interactivity_factor set to any number that isn't 0. With it set to 0, the game stops stuttering and the FPS goes back to normal.

I wonder if the ULE interactivity score mechanism design has anything to do with this because it seems that having it set at a particular value besides 0 just hurts my performance.
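For context, a ULE-style interactivity score compares how long a task has been running with how long it has been sleeping. The sketch below follows the scoring scheme in FreeBSD's ULE scheduler (`sched_ule.c`), rewritten as a shell function. It is not CacULE's code, the function and variable names are made up for illustration, and how `kernel.sched_interactivity_factor` enters CacULE's own calculation is not shown here.

```shell
#!/bin/sh
# ULE-style interactivity score (illustration only, not CacULE's code).
# Scores range from 0 (mostly sleeping, i.e. interactive) to 100
# (mostly running, i.e. CPU bound).
INTERACT_MAX=100
INTERACT_HALF=$((INTERACT_MAX / 2))

# interact_score RUNTIME SLEEPTIME
# Both arguments are non-negative integers in the same time unit.
interact_score() {
    run=$1
    slp=$2
    if [ "$run" -gt "$slp" ]; then
        # Mostly running: score lands in the upper half (50..100).
        div=$((run / INTERACT_HALF))
        if [ "$div" -lt 1 ]; then div=1; fi
        echo $((INTERACT_HALF + (INTERACT_HALF - slp / div)))
    elif [ "$slp" -gt "$run" ]; then
        # Mostly sleeping: score lands in the lower half (0..50).
        div=$((slp / INTERACT_HALF))
        if [ "$div" -lt 1 ]; then div=1; fi
        echo $((run / div))
    elif [ "$run" -gt 0 ]; then
        # Equal run and sleep time.
        echo "$INTERACT_HALF"
    else
        # No history yet.
        echo 0
    fi
}
```

With these definitions, `interact_score 100 900` prints 5 (a task that mostly sleeps), while `interact_score 900 100` prints 95 (a task that mostly runs, such as a shader compiler).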

hamadmarri commented 3 years ago

Hi @B3HOID

Could you please confirm that there is no background load while playing, such as shader compilation?

B3HOID commented 3 years ago

Actually, this lag occurs when the shaders are compiling. But, I mean, it doesn't happen with normal Xanmod that uses CFS, or with MuQSS/PDS (some other CPU schedulers I have tried).

hamadmarri commented 3 years ago

Please check this discussion: https://forum.xanmod.org/thread-2-post-7420.html#pid7420

hamadmarri commented 3 years ago

Is harsh_mode enabled?

hamadmarri commented 3 years ago

New CacULE logo, thanks to Alexandre.

[Image: cacule_logo_2021]

The logo idea makes sense if you think in Russian.

phush0 commented 3 years ago

I get a hard crash if I compile the kernel with full preemption:

мар 05 12:38:10 kernel: general protection fault: 0000 [#1] PREEMPT SMP NOPTI
мар 05 12:38:10 kernel: CPU: 13 PID: 0 Comm: swapper/13 Tainted: P S   U     OE     5.10.20-131-tkg-cfs #1
мар 05 12:38:10 kernel: Hardware name: Razer Blade 15 Studio Edition (Early 2020) - RZ09-033/CH551, BIOS 1.06 09/16/2020
мар 05 12:38:10 kernel: RIP: 0010:usb_hcd_map_urb_for_dma+0xc9/0x500
мар 05 12:38:10 kernel: Code: f6 8b 95 80 00 00 00 f6 c4 02 41 0f 95 c6 41 ff c6 85 d2 75 15 45 31 ed 48 83 c4 08 5b 5d 41 5c 44 89 e8 41 5d 41 5e 41 5f c3 <a8> 04 75 e7 49 83 bc 24 58 02 00 00 00 0f 84 13 01 00 00 41 89 d7
мар 05 12:38:10 kernel: RSP: 0018:ffffada9004c4de0 EFLAGS: 00010202
мар 05 12:38:10 kernel: RAX: 0000000000000200 RBX: ffff9d5320b14940 RCX: ffff9d5303be1ca0
мар 05 12:38:10 kernel: RDX: 0000000000000002 RSI: ffff9d5328857d40 RDI: ffff9d530946e000
мар 05 12:38:10 kernel: mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 0: 9400004000040150
мар 05 12:38:10 kernel: mce: [Hardware Error]: TSC b5e53af43ca ADDR 1ffff90f68647

There is no problem if I compile with voluntary preemption.

hamadmarri commented 3 years ago

> 5.10.20-131-tkg-cfs

Hi @phush0

It seems that you are using the tkg patches. Full preemption works fine on the vanilla kernel/xanmod + cacule, with or without rdb.

> RIP: 0010:usb_hcd_map_urb_for_dma+0xc9/0x500

I don't think that the issue is related to the scheduler.

EDIT: Could you please try full preemption with RDB disabled?

Thank you

phush0 commented 3 years ago

I am not using RDB at all. The fact is that without full preemption, three days later, I have had no crashes with the same peripherals attached. The TKG patches are the zen patches, nothing more.

hamadmarri commented 3 years ago

Note that xanmod uses kernel.sched_harsh_mode_enabled = 1 by default. If you encounter sound or other interruptions while the system is under load, it is because of harsh mode; you need to disable it with:

sudo sysctl kernel.sched_harsh_mode_enabled=0

Harsh mode is a good default if you are not compiling stuff. It enhances responsiveness and sometimes increases performance. But you need to pay attention to it if you experience any interruptions under heavy system load.
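Before toggling, it can help to check whether the running kernel exposes the setting at all. A small sketch, relying only on the standard mapping of `kernel.sched_harsh_mode_enabled` to a file under `/proc/sys`; the helper name `harsh_mode_status` is hypothetical:

```shell
#!/bin/sh
# harsh_mode_status [PATH]
# Hypothetical helper: prints "enabled", "disabled", or "unavailable".
# PATH defaults to the procfs file behind kernel.sched_harsh_mode_enabled;
# kernels without this sysctl do not have the file.
harsh_mode_status() {
    f=${1:-/proc/sys/kernel/sched_harsh_mode_enabled}
    if [ ! -r "$f" ]; then
        echo "unavailable"
        return 0
    fi
    if [ "$(cat "$f")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Example: only call sysctl when harsh mode is currently on.
#   if [ "$(harsh_mode_status)" = "enabled" ]; then
#       sudo sysctl kernel.sched_harsh_mode_enabled=0
#   fi
```

A value set with `sysctl` this way lasts only until the next reboot. On distributions that read `/etc/sysctl.d/`, putting `kernel.sched_harsh_mode_enabled = 0` in a file there is the usual way to make it persistent.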

phush0 commented 3 years ago

There is a strange effect with CacULE. When I use a virtual machine, startup is extremely slow: one CPU is bound to kworker-rcu and stays there for anywhere from 2-3 to 5 minutes. I tested with a clean build with no other patches and the effect is the same. If I use BMQ, QEMU is better than a real PC: Windows starts in under 5 seconds. Although CacULE is extremely smooth, I had to leave it behind because of this problem. The effect is the same with 5.10 and 5.11. I had the fair scheduler and autogroup enabled in the build, but autogroup is disabled by default, as commented above.

hamadmarri commented 3 years ago

Was harsh_mode enabled?

hamadmarri commented 3 years ago

Closed because the page became large and hard to load, since there is no pagination.

xuanruiqi commented 3 years ago

FYI, there's the new GitHub Discussions feature. Maybe you could try it out?

hamadmarri commented 3 years ago

https://github.com/hamadmarri/cacule-cpu-scheduler/discussions/23

Thank you