Add core::Timer implementation based on timerfd

gavv commented 4 years ago

Intro

core::Timer is a simple class that allows to wait for deadline expiration in one thread and to update the deadline from other thread(s), concurrently.

https://github.com/roc-project/roc/blob/develop/src/modules/roc_core/timer.h https://github.com/roc-project/roc/blob/develop/src/modules/roc_core/timer.cpp

Currently we have a generic implementation based on a seqlock and a semaphore with timed-wait operation. It works well, but has a disadvantage: changing deadline causes thread wake up. The waiting thread has to wake up just to see that the deadline is changed and to go to sleep with the new deadline.

It is possible to avoid this unnecessary wake up by implementing core::Timer using timerfd. This implementation will be Linux-specific, of course.

core::Timer is used in ctl::TaskQueue. This optimizations will allow to avoid unnecessary wake ups of the control thread when a control task is scheduled or cancelled.

Steps

Add new target directory target_notimers. See #246 for details on adding a new target directory. Enable it in SConstruct for non-Linux platforms.
Move current core::Timer implementation into roc_core/target_notimers.
Add a new core::Timer implementation based on timerfd and place it into roc_core/target_linux.
Ensure that all tests are passing. Also check roc_ctl benchmarks.
Not necessary, but would be nice to add a (meaningful) benchmark for core::Timer and compare results for semaphore- and timerfd-based implementations. Example benchmarks: 1, 2.

Requirements

These requirements are already met in semaphore-based core::Timer, and we should preserve them in the new implementation:

try_set_deadline() should be thread-safe. On concurrent calls, all calls except one are allowed to fail, and at least one should guaranteedly succeed.
try_set_deadline() should be lock-free, except the timerfd syscall, which probably can't be called "lock-free", but hopefully is something close to it. try_set_deadline() should not block the caller and should work well when called concurrently.
try_set_deadline() should try to avoid the timerfd syscall if wait_deadline() is not currently waiting.

moon9 commented 4 years ago

Hello,

I want to help solve this issue.

gavv commented 4 years ago

Great, you're welcome.

moon9 commented 4 years ago

Make pull request with changes #394

Followed the requirements as far as possible. But

try_set_deadline() should try to avoid the timerfd syscall if wait_deadline() is not currently waiting.

Seems impossible.

gavv commented 4 years ago

Seems impossible.

Could you elaborate? Have you looked how this is achieved with the current Timer implementation? (see the case when nextwakeup is zero).

moon9 commented 4 years ago

In next case: Before call wait_deadline(), have try_set_deadline(lower_value_of_time) with ignoring syscall. After that call wait_deadline(), but timestamp < deadline and it wait until other call. It mean we drop 1 deadline, which can be important and block to long time, if next deadline will set not soon.

Test not pass with that logic.

moon9 commented 4 years ago

Complexity logic can help with block on read call, but it may overhead perfomance more than syscall.

gavv commented 4 years ago

Before call wait_deadline(), have try_set_deadline(lower_value_of_time) with ignoring syscall. After that call wait_deadline(), but timestamp < deadline and it wait until other call. It mean we drop 1 deadline, which can be important and block to long time, if next deadline will set not soon.

Sorry, I didn't get it. Could you rephrase it and / or provide more specific example?

Here is how current Timer works, essentially.

try_set_deadline:

set deadline_ seqlock to new value
read nextwakeup seqlock
if nextwakeup is zero, just return
otherwise, post semaphore

wait_deadline:

set nextwakeup seqlock to non-zero
read and check deadline_ seqlock
sleep on semaphore if necessary
set nextwakeup to zero and return

(This is a simplification, the actual code is a bit complicated, but this simplified algorithm should be enough for our discussion.)

This algorithm provides the following important invariant:

if nextwakeup is zero, it is guaranteed that waitdeadline() is not sleeping and will read deadline (and take it into account) before going to sleep

This makes the above implementation of try_setdeadline() safe. It first updates deadline (with full memory fence) and only then reads nextwakeup (also with a full fence). If it sees zero nextwakeup, it can be sure that wait_deadline() is not sleeping it and WILL see the updated deadline before trying to sleep, so the updated deadline will be correctly handled even without posting the semaphore.

This allows us to avoid unnecessary semaphore syscall when try_set_deadline() is called when nextwakeup() is not running. Why bother? Because syscalls are expensive; a minimal syscall is something around 300ns, while a couple of atomic stores and loads is something around 20-30ns on modern Intel.

This way we can make the try_set_deadline() 10 times faster in the loaded use case, i.e. when TaskQueue users often enqueue new tasks and thus call try_set_deadline(), and TaskQueue processing thread is awake and rarely calls wait_deadline() because it's processing pending tasks.

(An interesting detail: actually, the same optimization is already present in some semaphore implementations; e.g. it's not necessary on glibc. However, relying on this is not portable, e.g. there is no such optimization in mach kernel semaphores which we use on macOS).

Are there any reasons why we can't use a similar approach for timerfd implementation? When try_set_deadline() sees that wait_deadline() is not running, it could omit the syscall. When wait_deadline() sees that the deadline is in future and it needs to sleep, it could update timerfd expiration by itself.

This way, if TaskQueue thread is awake (the "loaded" use case), the timerfd syscall will be either avoided it all or at least moved from try_set_deadline() to wait_deadline(). The latter is also good because tasks are enqueued and thus try_set_deadline() is called from real-time threads (e.g. audio and network), while wait_deadline() is called from TaskQueue thread which is non-real-time and is intended for low-priority work.

moon9 commented 4 years ago

Ok, I understand now how it must work. Make first reference and it works. So make new delta on review.

gavv commented 4 years ago

Unassigning so that someone could take a look at it and maybe find better solution. See discussions in the linked PR.

dshil commented 1 year ago

I've unassigned from my self, since working on other tasks now.

roc-streaming / roc-toolkit