Per-module event queue - Githubissues

patrick96 commented 3 years ago

This is part of a larger effort to rework the main loop (#1778)

The action handler in a module runs in the caller thread (the controller thread) and not in the module thread. This requires us to be very careful about concurrent access and synchronization. Some modules solve this by creating a new connection to their adapter whenever an action is called (i3, mpd). For most other modules it is not clear whether the action code is actually save, it just seems to work (in almost all cases the action code will be called while the module sleeps).

I think we should do three things:

Create an event queue inside each module. When the controller wants to deliver an action, it can put that action into the event queue and wake the module (if it is sleeping). After waking up the module works through all received actions (in the module thread). We can either use the lock-free queue we are already using in the controller or just have a lock on the queue.
Clearly define the interface of a module and which functions may be called from other threads and which must be called from the module thread.
Ensure that all modules adhere to that interface and don't cause concurrency issues.

The general idea is to put as much as possible into the module threads and let the controller communicate with the module through only a narrow interface. If possible, using only atomic primitive variables or small locks around certain variables (e.g. the module output).

patrick96 commented 3 years ago

I don't think doing this is actually possible.

The problem is waking up the module after something was placed in the queue. This is fairly easy if the module sleeps on the m_sleephandler condition variable. But for modules that idle in a different way (for example polling a socket), this is not possible. There we must wait for the module to finishing idling.

Because of that, having an event loop doesn't work for all modules.

To make reasoning about concurrency easier and to make modules less complex, we should strive to guarantee the following:

The module implementations should never have to acquire locks unless they do their own concurrency separate from the controller-module concurrency.
Some functions may have restrictions on what they can access inside the class

Ideally, the module implementation should not be aware of concurrency, mutual exclusion should be guaranteed by the base modules. However, this may not be that easy.

Using the same lock for update and input is required for proper synchronization. At the moment, get_output should also use the same lock because modules access local state there. #1752 may help here because it would allow us to do all of get_output the same way for all modules, so we can get granular with locks.

But either way, we don't want the controller to stall when update is waiting for data (for example from the network).

One way to do this is to split update into two functions:

A collector function where the module collects the raw data
The actual update function that uses that raw data to set the module state and configure the labels.

We don't need any synchronization between these two functions because they run sequentially in the module thread.

Whatever code is responsible for collecting the data should be able to run concurrently with the get_output and build function (i.e. use different locks). The first approach that comes to mind would be to have each module define a data structure that can contain all raw data for that module. The type for this data structure could for example be passed as a template parameter to the superclass to make copying of that structure generic.

This leaves one single obstacle: The collector function may need to access local state that can also be accessed by action handlers.

For example, it needs to access the adaptor which the action handler also needs (e.g. pulseaudio module). Here is where the notion that modules should not use locks starts to break down. I don't see a way to get around this without non-trivial reasoning about concurrency in the module implementations right now.

We could use this collector function only for modules where we know certain calls can take some time and then reason about concurrency on a case-by-case basis. This is less error prone but could allow for functions to slip through the cracks.

EDIT:

Modules need to provide a way to get woken up anyway because when shutting down, the controller has to stop the modules and the module then need to first get woken up before the thread can be joined.

We need to make this more explicit.

For many modules waking up just means notifying a condition variable. For others it might be more difficult. Modules that using poll or pselect could introduce an extra file descriptor to wait on, when waking up that file descriptor is written to to wake up the polling call.

Not all modules fall into these categories, but for every module we need to find a way to wake it up when idling.

This also means the event queue approach is feasible again.

patrick96 commented 2 years ago

Alright, I feel like I'm changing my mind about this every few months or so.

Since we now have a proper eventloop with libuv, we could try to integrate modules into that main eventloop without having to run a separate thread for them. This can probably be done module-by-module or at least all timer modules at once and then all event modules at once.

Even if we don't do that immediately, modules should at least do the following:

In the worker thread, it waits for some kind of update (timer, socket, etc.)
When such an update occurs, it collects the data and stores them in the instance and notifies the controller of the update
The controller calls contents() on the module
The module uses the data stored by the worker thread to determine its format and token values and replaces the tokens in the labels.

This way, the only state shared between the worker thread and the main thread is that data object which is much easier to protect. This also makes it easier to extract the update and data querying logic into a separate class (for example as script_runner already does) and only run that class in a separate thread to make the separation of data between the threads more explicit.

EDIT:

Modules that do any kind of polling in their thread would also get more efficient because we can do polling without a timeout. Currently most modules implement polling with a timeout because we need to periodically check if the bar is shutting down. If the module's polling is integrated into the libuv eventloop, we get this for free.

polybar / polybar

Per-module event queue #2337