zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

kernel/sched: Optimize handling for suspend(_current) #81737

Open andyross opened 16 hours ago

andyross commented 16 hours ago

k_thread_suspend() is a general call intended to halt any thread running in any state on any CPU, so it's not particularly lean. And it's not really intended for use as a general-purpose application API, as asynchronous halt/stop/kill APIs are inherently race-prone (e.g. you can suspend a thread holding a resource, there's no locking around the "decide to suspend the thread" step that can be atomically released with the suspend, etc.). Semaphores and condition variables exist for a reason, folks!

But nonetheless, there's a notable benchmark in the tree now that uses exactly this API to mean "stop the current thread", and as it's a competitive thing we don't want to have to be "fixing" it for Zephyr.

So do a tiny, targeted bit of benchmark chasing: detect the case where we're suspending _current from thread code on a non-SMP system and implement the minimal set of operations as an early exit. This isn't too big, has no impact on SMP builds, and should make us look better to the cool kids at school.

andyross commented 16 hours ago

As suggested in #81684, this is a shorter and more targeted attempt at "optimizing" a case that IMHO we shouldn't be optimizing at all. In theory[1] k_thread_suspend(), for the specific case exercised in the thread_metric preemptive benchmark, can't possibly get any faster than this, as it represents the minimal amount of work for the operation.

And it's short and doesn't impact kernel design anywhere else, allowing us to evolve the design of things like sleep/suspend slowly and along cleaner aesthetic paths (and not out of benchmark chasing).

My twister run isn't done yet but AFAICT this hasn't broken anything.

[1] I don't have a proper rig to measure it and I ain't gonna.

andyross commented 16 hours ago

(I do see a bunch of typos in the commit message and comment, but will wait for the CI results before re-pushing)

andyross commented 15 hours ago

So, I know I said I wasn't going to run real tests, but I couldn't resist. The test integration is super clean and obvious, and I have a frdm_k64f on my desk anyway.

Without this patch I get:

*** Booting Zephyr OS build v4.0.0-604-gf235ddb2cf26 ***
**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 30
Time Period Total:  6692286

**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 60
Time Period Total:  6691546

**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 90
Time Period Total:  6691402

**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 120
Time Period Total:  6691545

And with it:

*** Booting Zephyr OS build v4.0.0-605-g81199ac320e4 ***
**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 30
Time Period Total:  8257905

**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 60
Time Period Total:  8256990

**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 90
Time Period Total:  8256811

**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 120
Time Period Total:  8256989

So context switches are about 23% faster[1], which IMHO is not so shabby for a 7-line patch with minimal impact.

FWIW: the result reporting of this benchmark is as weird as its API choice. The number out is the number of round trips through a "stack" of threads at different priorities, where each thread wakes the next-higher one until the top of the stack is reached, and then they all go back to sleep on the way down. Since there are five threads, that's four context switches up to the top thread and four back down, so each round trip is eight context switches.

And the number reported is the number of "eight-switch-round-trips" per time interval, which appears to be 30 seconds. So N/3.75 gives (modulo slop from measurement and rig execution) "context switches per second", or 3.75e9/N for "ns per context switch".

[1] Where "context switch" means "self-k_thread_suspend-time" and not the actual overhead of a Zephyr context switch. Again, dumb tests are dumb.

andyross commented 14 hours ago

Tests passed, so repush with a more recent rebase and typo fixing.