hamadmarri / TT-CPU-Scheduler

Task Type (TT) is an alternative CPU scheduler for Linux.

Global runqueue #12

Open hamadmarri opened 2 years ago

hamadmarri commented 2 years ago

Branch: https://github.com/hamadmarri/linux-baby/tree/tt-grq

Welcome back to GRQ again. I still have some hope of implementing a global runqueue on top of the current CFS. Apparently it is not an easy task. I have reimplemented the old GRQ with slight modifications. Don't use commit 385bff302e395a88306fba3823c21d231c8c31f2, since it freezes like the old GRQ implementation on Cachy/CacULE.

However, 11962a4e7cc20dd5fa9539d88c7a884b93995301 is an attempt to fix the freezing issue. I am testing it right now, but as usual, whenever I believe it is fixed it surprises me with a freeze at a random time/day! So far, no freezes.

If you want to test, make sure that you don't have any important files on your computer, since the GRQ freezes can damage the filesystem. Also, don't use GRQ while you are doing important tasks.

Please let me know if you get any freezes with commit 11962a4e7cc20dd5fa9539d88c7a884b93995301 or later, and whether it has better performance/latency.

Thank you

raykzhao commented 2 years ago

Hi @hamadmarri,

I am testing with the latest commit (as of the time of writing). So far I have found:

  1. NO_HZ_FULL does not work. With CONFIG_NO_HZ_FULL=y, I get a kernel fault at a very early stage of booting, regardless of whether any CPU is listed after nohz_full. I cannot get the kernel log since this happens before the logger starts running.
  2. I get the following warning in the kernel log when using NO_HZ_IDLE. It does not affect the system, though.
    [    0.061781] ------------[ cut here ]------------
    [    0.061783] rq->clock_update_flags < RQCF_ACT_SKIP
    [    0.061784] WARNING: CPU: 2 PID: 0 at kernel/sched/sched.h:1486 pick_next_task_fair+0x5de/0x640
    [    0.061790] Modules linked in:
    [    0.061792] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.15.5-tt-grq #4
    [    0.061794] Hardware name: Acer Nitro AN515-51/Freed_KLS, BIOS V1.13 12/26/2017
    [    0.061795] RIP: 0010:pick_next_task_fair+0x5de/0x640
    [    0.061798] Code: a2 9c 00 0f 0b e9 5b fb ff ff 80 3d 44 07 46 01 00 0f 85 ee fd ff ff 48 c7 c7 58 99 4e 94 c6 05 30 07 46 01 01 e8 85 a2 9c 00 <0f> 0b 49 8b 85 80 02 00 00 e9 cd fd ff ff 80 3d 16 07 46 01 00 0f
    [    0.061800] RSP: 0000:ffffb9bd000fbe48 EFLAGS: 00010046
    [    0.061802] RAX: 0000000000000000 RBX: ffff94b8aed26400 RCX: 0000000000000000
    [    0.061803] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    [    0.061804] RBP: ffff94b5409f2ba8 R08: 0000000000000000 R09: 0000000000000000
    [    0.061805] R10: 0000000000000000 R11: 0000000000000000 R12: ffff94b8aed26480
    [    0.061806] R13: ffff94b5409f2b80 R14: ffff94b8aec26400 R15: ffff94b8aed26d68
    [    0.061807] FS:  0000000000000000(0000) GS:ffff94b8aed00000(0000) knlGS:0000000000000000
    [    0.061808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    0.061809] CR2: 0000000000000000 CR3: 00000003a8610001 CR4: 00000000003706e0
    [    0.061811] Call Trace:
    [    0.061812]  <TASK>
    [    0.061813]  __schedule+0x1a4/0x800
    [    0.061818]  schedule_idle+0x21/0x40
    [    0.061820]  do_idle+0x152/0x280
    [    0.061822]  cpu_startup_entry+0x14/0x40
    [    0.061824]  secondary_startup_64_no_verify+0xb0/0xbb
    [    0.061826]  </TASK>
    [    0.061828] ---[ end trace dda0c4188e035fd6 ]---

hamadmarri commented 2 years ago

Hi @raykzhao

Sorry, I forgot to mention that it only works with HZ_PERIODIC. I am not sure yet how I would make it work with NO_HZ_FULL. For nohz idle, though, I can try waking up an idle CPU if there are tasks in the global runqueue. I am surprised that it works fine with nohz idle! It seems not much needs to be done to make it work properly.

Thank you for testing :+1:

hamadmarri commented 2 years ago

Not stable :-1: I got freezes. I am trying it on an RT kernel, which shows the freezes right after boot. Also, for RT I need to use a raw spinlock instead of a spinlock.

hamadmarri commented 2 years ago

I am going to make the global rq point to the rq of cpu0, following the convention used in MuQSS. The separate lock is very lightweight, since it is only held in very critical sections, but it is unstable, and the scheduler depends on the stats of each rq almost everywhere.

This adjustment (if it works correctly) will reduce performance a bit, but afterwards I will look for and remove some migrations that are useless from the global rq perspective, which can reduce lock contention.
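
For readers following along, here is a minimal userspace sketch of the "global rq points to the rq of cpu0" idea. It is only a model, not kernel code: the `RunQueue` class, `NR_CPUS`, `enqueue`, and `pick_next` are hypothetical names, and the FIFO pick order is a placeholder for whatever policy the real scheduler applies. The point it illustrates is that the global runqueue reuses CPU 0's runqueue and its lock and stats instead of introducing a separate structure with its own lock.

```python
import threading
from collections import deque


class RunQueue:
    """Toy per-CPU runqueue with its own lock and stats (hypothetical)."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.lock = threading.Lock()
        self.tasks = deque()
        self.nr_running = 0


NR_CPUS = 4  # assumed CPU count for the model
rqs = [RunQueue(cpu) for cpu in range(NR_CPUS)]

# The global runqueue is just an alias for CPU 0's runqueue,
# so its lock and stats are the ones CPU 0 already maintains.
grq = rqs[0]


def enqueue(task):
    # Every CPU enqueues into the shared queue under CPU 0's lock.
    with grq.lock:
        grq.tasks.append(task)
        grq.nr_running += 1


def pick_next(cpu):
    # Any CPU picks from the shared queue; FIFO is a placeholder policy.
    with grq.lock:
        if not grq.tasks:
            return None
        grq.nr_running -= 1
        return grq.tasks.popleft()
```

In this model every CPU contends on a single lock, which is consistent with the expected small performance cost mentioned above and with the interest in removing unnecessary migrations.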

hamadmarri commented 2 years ago

Another idea, which is stable: Candidate Balancer (CB). Each rq proposes a candidate task, the one with the highest HRRN. It is somewhat similar to RDB, but it is mixed with the normal basic TT balancer (which is based on the number of tasks). CPUs will try to pull a candidate task (if it has a higher HRRN); if there is no candidate, idle CPUs will just pull a task from the CPU that has the maximum number of tasks, as the usual TT balancer does. The TT balancer acts as a fallback balancer in case no candidates are provided. The candidate cannot be a CPU-bound task or a kthread.
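
The selection logic described above can be sketched roughly as follows. This is a userspace model under stated assumptions, not the patch itself: `Task`, `RunQueue`, `pull_task`, and the fields `wait_ns` and `run_ns` are hypothetical names; the HRRN here is the textbook ratio (wait + service) / service, which may differ from how TT computes it; and the rule that the fallback never takes a runqueue's only task is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    name: str
    wait_ns: int           # time spent waiting to run (hypothetical field)
    run_ns: int            # CPU time consumed so far (hypothetical field)
    cpu_bound: bool = False
    kthread: bool = False

    def hrrn(self) -> float:
        # Textbook HRRN: (wait + service) / service; higher means more overdue.
        return (self.wait_ns + self.run_ns) / max(self.run_ns, 1)


@dataclass
class RunQueue:
    cpu: int
    tasks: List[Task] = field(default_factory=list)

    def candidate(self) -> Optional[Task]:
        # CPU-bound tasks and kthreads are never proposed as candidates.
        eligible = [t for t in self.tasks if not (t.cpu_bound or t.kthread)]
        return max(eligible, key=Task.hrrn, default=None)


def pull_task(dst: RunQueue, rqs: List[RunQueue]) -> Optional[Task]:
    others = [rq for rq in rqs if rq is not dst]
    local = max(dst.tasks, key=Task.hrrn, default=None)

    # 1) Find the best candidate proposed by any other runqueue.
    best_rq, best = None, None
    for rq in others:
        c = rq.candidate()
        if c is not None and (best is None or c.hrrn() > best.hrrn()):
            best_rq, best = rq, c

    # Pull it only if it beats the best local task (or dst is idle).
    if best is not None and (local is None or best.hrrn() > local.hrrn()):
        best_rq.tasks.remove(best)
        dst.tasks.append(best)
        return best

    # 2) Fallback for idle CPUs: pull from the runqueue with the most tasks.
    if not dst.tasks and others:
        busiest = max(others, key=lambda rq: len(rq.tasks))
        if len(busiest.tasks) > 1:  # assumption: never take the only task
            task = busiest.tasks.pop()
            dst.tasks.append(task)
            return task
    return None
```

Calling `pull_task(idle_rq, all_rqs)` first prefers the highest-HRRN candidate from another CPU, and only falls back to the count-based pull when no runqueue proposes a candidate.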

I am testing it right now on an RT kernel. Both throughput and latency are better.

Here are two patches:

RT: tt-rt-cb.patch.zip

Non-RT: tt-cb.patch.zip

They should be applied on top of TT.

Thank you

RiverOnVenus commented 2 years ago

Hi @hamadmarri, I patched tt-rt-cb on top of tt-5.15-r2. When I compile, I get the following:

kernel/sched/bs.c:1019:16: error: use of undeclared identifier 'sd'
        schedstat_inc(sd->ttwu_move_affine);

hamadmarri commented 2 years ago

Hi @RiverOnVenus

I reverted the commits here https://github.com/hamadmarri/linux-baby/commits/tt-rt-candidate

You can just comment the line out; the wake-affine weight showed worse results in my case.

hamadmarri commented 2 years ago

Update cb-1.3:

https://github.com/hamadmarri/linux-baby/commit/1520c89923257aaeb27072dd5308203e065342e3

tt-cb-1.3.patch.zip

tt-rt-cb-1.3.patch.zip

hamadmarri commented 2 years ago

@raykzhao @RiverOnVenus

This is a guaranteed stable GRQ implementation.

tt-grq.patch.zip

Works on top of TT and TT-rt

I used the first CPU as the global rq. To check:

sudo dmesg | grep -i "Global runqueue is on cpu"
[    0.057842] Global runqueue is on cpu 0

Please test. I am going to make both the candidate balancer and GRQ optional for TT v0.3.

So far I got the highest stress-ng results with GRQ:

stress.txt

Thank you

hamadmarri commented 2 years ago

https://github.com/hamadmarri/TT-CPU-Scheduler/tree/master/patches/next/5.15

hamadmarri commented 2 years ago

GRQ update:

https://github.com/hamadmarri/linux-baby/commit/fa59b5e1727a0e7cae492eb3066e03cf1e3ad952

https://github.com/hamadmarri/linux-baby/commit/3f2c54e8f979d98bc91c1f6b055b4d0e4cf22da4

https://github.com/hamadmarri/TT-CPU-Scheduler/releases/tag/0.3.1

Now it is better with both nohz full/idle

I also fixed some bugs for non-idle cpu balancing with GRQ.

Thank you

RiverOnVenus commented 2 years ago

GRQ in v0.3.1 is clearly better than v0.3 in my case, both in terms of latency and throughput.

hamadmarri commented 2 years ago

GRQ updated to v0.3.2:

https://github.com/hamadmarri/linux-baby/commit/aaa9164d4c3fb3402ee462274e55465666148db9

Bench with python responsive script: noload.txt

withload.txt

LethalManBoob commented 1 year ago

It seems the GRQ scheduler is very bad for performance. Enabling it halves my fps. https://github.com/hamadmarri/TT-CPU-Scheduler/issues/20