KarsMulder / evsieve

A utility for mapping events from Linux event devices.
GNU General Public License v2.0

Question: is realtime scheduling priority needed/advised on underpowered hosts? #41

Open callegar opened 5 months ago

callegar commented 5 months ago

Experimenting with evsieve on a 2-in-1 with a rather weak processor (Celeron N4120). I'm getting the impression that the keyboard gets a bit laggy when Thunderbird is using 100% of the CPU. I'm not totally sure, because the keyboard is poor and tends to miss strokes anyway, but the impression is there. Is chrt expected to be needed/advisable for evsieve?

KarsMulder commented 5 months ago

I don't know enough about scheduling to give a real answer to this question.

That said, actually processing the events shouldn't take that much CPU time even on an underpowered system. Evsieve spends most of its time waiting on the epoll_wait system call, which asks the kernel's scheduler to deschedule evsieve until one of the input devices has events available. When one of the input devices has events available again, the kernel should schedule the evsieve process to run again, but there is a small amount of time between the moment the event device receives events and the moment that evsieve gets its slice of the CPU time again. This is the biggest source of latency on my not-underpowered system.
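
For illustration, this is roughly what that waiting pattern looks like when written with python-evdev and the standard-library selectors module. This is not evsieve's actual code, and the device path is just an example:

import selectors
import evdev

# Example path; substitute any event device that you are allowed to read.
device = evdev.InputDevice("/dev/input/event0")

selector = selectors.DefaultSelector()
selector.register(device, selectors.EVENT_READ)

while True:
    # select() blocks inside the kernel until the device has events available;
    # the process is descheduled and uses no CPU time while it waits.
    for key, _ in selector.select():
        for event in key.fileobj.read():
            print(event)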

I suppose that when you have a processor whose cores are all busy, the time between events becoming available on the input device and evsieve being assigned its slice of CPU time may become even bigger. When free cores are available, the scheduler just needs to wake the evsieve process; when all cores are in use by Thunderbird, the scheduler also needs to wait until one of Thunderbird's threads gets descheduled.

The term "realtime scheduling" is usually associated with something different. According to the kernel docs:

Real-time scheduling is all about determinism, a group has to be able to rely on the amount of bandwidth (eg. CPU time) being constant. In order to schedule multiple groups of real-time tasks, each group must be assigned a fixed portion of the CPU time available. Without a minimum guarantee a real-time group can obviously fall short. A fuzzy upper limit is of no use since it cannot be relied upon. Which leaves us with just the single fixed portion.

As you can see, this form of "realtime scheduling" is intended to achieve something different from what evsieve actually needs: evsieve wants low latency, not guaranteed bandwidth. Evsieve doesn't need a fixed amount of CPU time. Having 5% of the CPU's bandwidth allocated is of no use if no input events are available anyway. It just wants its threads to be scheduled immediately after events become available on its input devices.

I suppose that the following properties of a scheduler could improve evsieve's responsiveness:

1. when a core becomes free, evsieve's threads get picked to run before other threads that are also waiting for CPU time;
2. when events become available for evsieve while all cores are busy, a running process gets interrupted right away instead of evsieve having to wait for that process's time slice to end.

I think the realtime scheduler with a really small sched_rt_period_us could accomplish the second point by interrupting other processes really often, but I guess that would seriously degrade the performance of your system even when evsieve sits idle. Realtime scheduling with a large period (like the standard period of one second) would do approximately nothing to improve evsieve's performance.

Maybe there are some other scheduler settings that accomplish either or both of the above in a more efficient way.
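
In case you do want to experiment with what chrt does anyway: something like chrt -f -p 20 <pid> puts an existing process under the SCHED_FIFO realtime policy. For reference, here is the same thing written out with Python's standard library. This is just a sketch: the pid and the priority value are placeholders, and it needs root (or CAP_SYS_NICE):

import os

EVSIEVE_PID = 12345    # placeholder: the pid of a running evsieve process
FIFO_PRIORITY = 20     # arbitrary example; SCHED_FIFO priorities range from 1 to 99

# Equivalent of `chrt -f -p 20 12345`: switch the process to the SCHED_FIFO policy.
os.sched_setscheduler(EVSIEVE_PID, os.SCHED_FIFO, os.sched_param(FIFO_PRIORITY))

Whether that actually helps on your system is exactly the open question above; the benchmark script in the next comment can be used to compare.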

KarsMulder commented 5 months ago

In case you want to measure evsieve's performance yourself, here's a script that I use to benchmark a simple evsieve invocation that maps each key to the next letter of the alphabet. In addition to evsieve, it requires python-evdev to be installed.

It measures performance by creating a virtual input device, letting evsieve map that input device, and then comparing the timestamps that the kernel assigned to the events on the virtual input device with the timestamps of the corresponding events on the virtual output device.

Since two processes need to read the virtual input device (evsieve to map the events, and this script itself to record their timestamps), the virtual input device cannot be grabbed, so the benchmark does cause some key presses to be observed by the rest of your system.

#!/usr/bin/env python3

import asyncio
import evdev
import evdev.ecodes as e
import os
import subprocess as sp
import time

# Put the path to your evsieve executable here.
EVSIEVE_PATH = "evsieve"

# Make sure that the output device path is not already occupied.
OUTPUT_DEVICE_PATH = "/dev/input/by-id/benchmark-keyboard"
if os.path.exists(OUTPUT_DEVICE_PATH):
    print(f"Error: cannot execute benchmark: output device path {OUTPUT_DEVICE_PATH} already exists")
    exit(1)
# If a previous run left behind a dangling symlink, clean it up.
if os.path.islink(OUTPUT_DEVICE_PATH):
    os.unlink(OUTPUT_DEVICE_PATH)

ALPHABET = list("abcdefghijklmnopqrstuvwxyz")
NUM_KEYS_TO_SEND = 100
TIME_BETWEEN_KEYS = 0.1

# Create a device that we will send events into.
def create_input_device():
    capabilities = {
        e.EV_KEY: [
            e.ecodes["KEY_" + key.upper()]
            for key in ALPHABET
        ]
    }
    return evdev.UInput(capabilities, name="virtual-keyboard")

def spawn_evsieve(input_device):
    # Build the command we shall use to open evsieve.
    # Map every key to the next one in the ALPHABET, e.g. a->b, m->n, z->a
    evsieve_cmd = [
        EVSIEVE_PATH,
        "--input", input_device.device, "persist=exit",
    ]
    for i in range(len(ALPHABET)):
        evsieve_cmd += ["--map", f"key:{ALPHABET[i]}", f"key:{ALPHABET[(i+1)%len(ALPHABET)]}"]

    evsieve_cmd += ["--output", f"create-link={OUTPUT_DEVICE_PATH}"]

    # Start evsieve.
    evsieve_process = sp.Popen(evsieve_cmd)

    # Wait until evsieve is ready.
    while not os.path.exists(OUTPUT_DEVICE_PATH):
        time.sleep(0.1)

    return evsieve_process

# Sends events to the input device, then closes the input device when done.
async def send_events_then_close(device):
    for event_index in range(NUM_KEYS_TO_SEND):
        keycode = e.ecodes[f"KEY_{ALPHABET[event_index%len(ALPHABET)].upper()}"]

        device.write(e.EV_KEY, keycode, 1)
        device.syn()
        await asyncio.sleep(TIME_BETWEEN_KEYS / 2)

        device.write(e.EV_KEY, keycode, 0)
        device.syn()
        await asyncio.sleep(TIME_BETWEEN_KEYS / 2)

    # Give the other tasks some time to finish reading events before we exit.
    await asyncio.sleep(1.0)
    device.close()

# Record the timestamps of the key events that we observe on the given event device.
async def read_events(device, list_to_write):
    # Read events until the device disappears.
    try:
        async for event in device.async_read_loop():
            if event.type == e.EV_KEY:
                list_to_write.append(event.timestamp())
    except OSError:
        pass

# Tell the user what the average difference between the input and output events is.
def present_report(timestamps_out, timestamps_loop):
    total_delta = 0
    count = 0
    for time_in, time_out in zip(timestamps_loop, timestamps_out):
        total_delta += (time_out - time_in)
        count += 1

    MICROSECONDS_PER_SECOND = 1000000
    print("")
    print(f"Average delta of {total_delta/count * MICROSECONDS_PER_SECOND} microseconds per event over {count} events.")

async def main():
    input_device = create_input_device()
    evsieve_process = spawn_evsieve(input_device)

    # Open a second copy of our input device so we can read events from it as if we were any other process.
    input_device_loop = evdev.InputDevice(input_device.device)

    # Open the device that evsieve created.
    output_device = evdev.InputDevice(OUTPUT_DEVICE_PATH)
    output_device.grab()

    # Start three tasks: one task writes events to the input device, a second task reads those events back and records
    # which timestamps the kernel assigned to them, and a third task checks which timestamps the kernel assigned to the
    # events that evsieve wrote to its output device.
    timestamps_out = []
    timestamps_loop = []

    await asyncio.gather(
        send_events_then_close(input_device),
        read_events(output_device, timestamps_out),
        read_events(input_device_loop, timestamps_loop)
    )
    present_report(timestamps_out, timestamps_loop)

asyncio.run(main())
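
Since the script has to create uinput devices and a symlink under /dev/input/by-id (both directly and via the evsieve process it spawns), it will typically need to be run as root.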