[RFC] Configuration file

I think it'd be great if instead of passing every value on CLI, the user could specify a config file.

One immediate tangible benefit is we can then have a static evsieve.service file (for those who use systemd) which packagers can install, and the user edits the configuration file in /etc instead of the service file. The whole configuration process would be somewhat more "Linux-idiomatic" this way.

I took a brief look at the current parser code and it seems fairly simple to reuse it. And I am willing to try implement this feature. However I have some questions regarding how to best move forward, hence this issue.

Question 1: config file format

It appears to me that the simplest and most consistent way to do this is to have the exact same format in the configuration file as in the command line. E.g.

/etc/evsieve/main.conf

# Maybe allow comments too
--input /dev/input/by-id/keyboard grab
--hook key:leftctrl key:h exec-shell="echo Hello, world!"
--output create-link=/dev/input/by-id/virtual-keyboard

evsieve --config /etc/evsieve/main.conf

I would tokenize this with a simple lexer and feed it directly to arguments::parser::implement. Are there any obvious foot guns I'm missing with this approach? Is there perhaps already a good lexer crate that I can use?

Question 2: parsing the config file

There's the question of "how much --config usage should we allow?". From least complex to most, we can:

Make passing a config file mutually exclusive with other CLI arguments.
Allow mixing config files and other CLI arguments, but don't allow nested --config declaration in config files.
Full recursive loading of nested --config options.

The first option is simplest but least extensible. The benefit is it precludes any possibility of unexpected parsing order issues.
The third option is most extensible, perhaps even allowing composition-like config organisation. So for example, you can have your keyboard configs in one file and mouse configs in another, and have main.conf load both.
And the second option is somewhere inbetween.

The question is of course what's the correct tradeoff between simplicity and feature.

Question 3: implementation details

Two options here:

Add an additional variant Config to arguments::parser::Argument, and let arguments::parser::implement handle it.
Make the --config flag a special "macro-like" option, in the sense that all loading (recursively if necessary) in arguments::parser::parse.

Personally I actually like the second option more, since it makes a clean separation of parsing (CLI and files) and implementing. But I would like to hear others' opinions first.

This is indeed a good idea.

I was going to write up some guidelines regarding its implementation, but then came to the conclusion that it would actually take less time to just implement it than to write out an implementation manual. Sorry in case you were looking forward to working on this.

So a rudimentary implementation is now in the master branch. You need to compile evsieve with the config feature enabled to use the new argument:

cargo build --release --features config

(I am using these feature gates to prevent features that are not yet ready for stabilization from delaying an 1.4 release.)

Question 1

I think that letting the (default) configuration file contain essentially arguments for the evsieve program may indeed be the best option. Any other option would essentially require designing a second API.

Like you suggest, I've implemented --config PATH such that the file at PATH gets read and the --config argument gets replaced by the arguments in that file:

For example, if you put the following files on disk:

# In file /etc/evsieve/main
--config /etc/evsieve/input
--config /etc/evsieve/mappings
--config /etc/evsieve/output

# In file /etc/evsieve/input
--input /dev/input/by-id/keyboard

# In file /etc/evsieve/mappings
--map key:a key:b yield
--map key:b key:a yield

# In file /etc/evsieve/output
--output create-link=/dev/input/by-id/virtual-keyboard

Then invoking evsieve as evsieve --config /etc/evsieve/main is equivalent to evoking evsieve as

evsieve --input /dev/input/by-id/keyboard \
        --map key:a key:b yield \
        --map key:b key:a yield \
        --output create-link=/dev/input/by-id/virtual-keyboard

Maybe in the future we may want to support some configuration formats used by third-party programs as well. We could use --config format=... for those cases.

In a proper implementation, the configuration files would get lexed the same way the shell lexes them, i.e. parsing --hook key:a exec-shell="echo Hello, world" as three tokens: --hook, key:a, exec-shell=echo Hello, world. However, I haven't gotten around to implementing a shell lexer yet, which is nontrivial partially because not even all shells agree on how arguments should be lexed. (E.g. what does echo FOO#BAR print? I've seen different parsers disagree on whether the #BAR part is a comment, or part of a single FOO#BAR token.)

No proper shell lexer has been implemented yet, so for now putting clauses like exec-shell="echo Hello, world" in the configuration file won't work.

So far I've been trying to minimize the amount of dependencies I use, because evsieve is a program running as root and I don't want near-npm levels of untrusted dependencies in it. On one hand I don't want to include another dependency for a feature most people won't use, on the other hand, maybe writing such a lexer by hand is just too error-prone...

(I've been wondering for a long time whether I should get a bit looser with the dependencies. It is seriously slowing down some developments.)

Question 2

Option three seems like the only sensible option, because I don't like edge cases. It's not that difficult to implement either. It may make a hypothetical future feature of being able to update configuration files without restarting evsieve a bit more convoluted to implement, but allowing recursive configuration files makes it for example possible to separate the file that defines the input devices from the file that defines the mappings, so I suppose it is worth including support for it.

An issue potentially related to the shell lexer question above is: do we want to allow glob expansion inside the configuration files as well? That would make it possible to do something like --config /etc/evsieve.d/*? If we made that possible, in what order should the files in the glob expansion show up? (Sorted by codepoints, I suppose?)

While we're at it, do we expand environment variables in the configuration files, like --map $SOURCE_KEY $TARGET_KEY?

Some thoughts on reloading configuration files

If we're adding support for configuration files, then I suppose it would be nice to be able to be able to update those configuration files without having to restart evsieve as well. However, the codebase of evsieve doesn't currently have the infrastructure to change the applicable mappings after initialization, so that would be another big feature after configuration scripts.

There two levels of possible support for reloading configuration files:

Reloading a configuration file changes the mappings that are applicable at runtime, but resets the internal state (e.g. all --toggles get put back to their default state.)
Reloading a configuration file changes the mappings that are applicable at runtime, and tries to retain the internal state as much as possible.

As you might guess, the second option is more complicated to implement than the first. Not impossible, but its going to require some refactoring.

As for reloading configuration files, this can be done in two ways as well:

Evsieve reloads it configuration files when it receives a SIGHUP (note: evsieve currently quits upon receiving SIGHUP, so that would technically be a breaking change.)
Evsieve automatically reloads the configuration files when it notices the file has been modified.

Option two seems a bit more user-friendly. And also a bit more convoluted to implement.

If we go for option (2), do we want to enable configuration file reloading by default, or only if the user passes a special flag for it? If we want to make it default, we may want to wait until it is implemented before stabilizing the --config argument to avoid another backwards compatibility problem similar to requiring --input persist=reopen for backwards compatibility.

Some more thoughts

Since evsieve is probably running as root, and being able to edit evsieve's configuration files trivially allows you to escalate to root by using --hook exec-shell=my_evil_progam, should evsieve make sure that its configuration files are owned by either root or [the user evsieve is running as] and are not writable by anyone else? Similar to how OpenSSH refuses to work if it thinks the permissions of ~/.ssh aren't secure enough?

(In addition to checking the file permissions, we of course also neet to check the permission/ownership/sticky-bits on all parent directories of the configuration files, of course.)

If so, should evsieve also try to to check the file ACLs? (This is complicated, and an argument could be made that if the user is using ACLs, then they better know what they are doing so there is no need to babysit them.)

Some point to avoid a future breaking change: currently, evsieve --config PATH_TO_EMPTY_FILE immediately exits because it has nothing to do. If we want automatic configuration file reloading by default, then the presence of any configuration file should inhibit automatic exit.

I was going to write up some guidelines regarding its implementation, but then came to the conclusion that it would actually take less time to just implement it than to write out an implementation manual. Sorry in case you were looking forward to working on this.

No worries. Thank you actually, for spending time to implement a feature I want. (And of course, thanks for this great project in the first place.)

So far I've been trying to minimize the amount of dependencies I use, because evsieve is a program running as root and I don't want near-npm levels of untrusted dependencies in it. On one hand I don't want to include another dependency for a feature most people won't use, on the other hand, maybe writing such a lexer by hand is just too error-prone...

(I've been wondering for a long time whether I should get a bit looser with the dependencies. It is seriously slowing down some developments.)

Yeah I was surprised when cargo build pulled like 4 (or something) dependencies in total. Personally I'm quite a bit more liberal when it comes to dependencies, but I'm mostly writing application-level code so it's much different. Ultimately it's your project and your preference.

As of whether using an existing lexer is a good idea, honestly I'm kind of torn too. On one hand it's probably more reliable and supports many features that the user might expect (e.g. quoted strings, inline comments, escape sequences, etc.); on the other hand the feature-richness is maybe not actually desirable. It could be argued that we only want to support a very restrictive set of grammar. I guess it depends on whether the existing lexer allows feature selection (only enabling the minimum set that we need): if so I'm in favour; if not I do not have a strong opinion.

do we want to allow glob expansion inside the configuration files as well? That would make it possible to do something like --config /etc/evsieve.d/*? If we made that possible, in what order should the files in the glob expansion show up? (Sorted by codepoints, I suppose?)

That feels like overengineering to me, at least for the moment. You'd also have to consider whether glob matching on directories should be allowed (/etc/evsieve.d/*/*). And what about incomplete path segments (`/etc/evsieve/.conf)? Anyways, in my experience glob matching is always just a pain to work with. There are also potential security concerns. I don't like this idea.

While we're at it, do we expand environment variables in the configuration files, like --map $SOURCE_KEY $TARGET_KEY?

The main benefit of this that I can see is built-in templating support. However I think very few people are going to find a use for it though, especially considering Systemd already has unit templating in the form of "@ units" just in case.

This also ties into the wider lexer issue. There are so many design decisions to make regarding syntax and so many foot guns to dodge. Unless supported by a well-tested lexer crate, I think the benefits are greatly outweighed by the risks.

There two levels of possible support for reloading configuration files:

Reloading a configuration file changes the mappings that are applicable at runtime, but resets the internal state (e.g. all --toggles get put back to their default state.)

Reloading a configuration file changes the mappings that are applicable at runtime, and tries to retain the internal state as much as possible.

My use of evsieve is rather simple (just mappings), so I don't have an opinion either way.

As for reloading configuration files, this can be done in two ways as well:

Evsieve reloads it configuration files when it receives a SIGHUP (note: evsieve currently quits upon receiving SIGHUP, so that would technically be a breaking change.)

Evsieve automatically reloads the configuration files when it notices the file has been modified.

These two options aren't mutually exclusive no? I like the first option because it's simple. The second one though, I think if it is to be implemented, it should be opt-in with a flag. Less likelihood of surprising the user.

P.S. Speaking of flags, I really like your implementation using MetaArgument. It's really neat and extensible.

should evsieve make sure that its configuration files are owned by either root or [the user evsieve is running as] and are not writable by anyone else? Similar to how OpenSSH refuses to work if it thinks the permissions of ~/.ssh aren't secure enough?

(In addition to checking the file permissions, we of course also neet to check the permission/ownership/sticky-bits on all parent directories of the configuration files, of course.)

LGTM. I'd try to be lazy and find a crate for this though, although trying is the first step towards failure.

If so, should evsieve also try to to check the file ACLs? (This is complicated, and an argument could be made that if the user is using ACLs, then they better know what they are doing so there is no need to babysit them.)

I know nothing about ACLs, so I'll leave this up to you.

I ended up implementing a basic shell lexer myself.

I've come to the conclusion that allowing wildcards such as --input /dev/input/event* in configuration files is a bad idea because that would either mean that the configuration file can be interpreted differently if it is read at different times (which would be "fun" with configuration file reloading), or would require an even more complicated system to monitor when the list /dev/input/event* changes and update accordingly.

Furthermore, automatic reload of configuration files is indeed a bad idea after some more thought.

As for manual reload...

Currently evsieve exits upon receiving a SIGHUP signal. Traditionally this signal is sent when the connected terminal is closed and terminates the process by default. Nowadays some daemons have repurposed that signal to mean a request to reload the configuration files.

In evsieve I have intentionally taken the traditional interpretation of SIGHUP because it is actually useful in some cases: if you were to type some badly designed evsieve command that grabs your keyboard and leaves you unable to stop it with Ctrl+C (e.g. you grab your keyboard but forgot a corresponding --output argument), then you may still be able to use your mouse to close the terminal emulator, which sends a SIGHUP that causes evsieve to exit. I could also imagine some annoyance being caused by users closing the terminal emulator that was running evsieve only to find it is still running.

If evsieve were to stop interpreting SIGHUP as an exit signal, then it may become more difficult to close evsieve in some situations, to the annoyance of the user.

It would be possible for the behaviour to become "evsieve interprets SIGHUP as an exit signal, unless a --config argument is present, in which case it becomes a request to reload configuration files", but I don't like how the mere presence of a --config argument magically alters some seemingly unrelated behaviour.

I could make the --config argument accept a reloadable flag, where the presence of that flag makes evsieve stop treating SIGHUP as an exit signal and makes SIGHUP instead cause all configuration files tagged as reloadable to be reloaded. The disadvantage for this approach is the same as why I dislike the persist=reopen clause so much: most users want their configuration files to be reloadable since there is little reason for them not to be reloadable, but now they need to supply one additional flag for something that should be standard behaviour.

A third alternative would be using some other mechanism to inform evsieve to reload their configuration files?

It would be possible for the behaviour to become "evsieve interprets SIGHUP as an exit signal, unless a --config argument is present, in which case it becomes a request to reload configuration files", but I don't like how the mere presence of a --config argument magically alters some seemingly unrelated behaviour.

I agree.

I could make the --config argument accept a reloadable flag, where the presence of that flag makes evsieve stop treating SIGHUP as an exit signal and makes SIGHUP instead cause all configuration files tagged as reloadable to be reloaded. The disadvantage for this approach is the same as why I dislike the persist=reopen clause so much: most users want their configuration files to be reloadable since there is little reason for them not to be reloadable, but now they need to supply one additional flag for something that should be standard behaviour.

This makes sense. Also it may be undesirable to tightly couple the reload behaviour to the --config argument.

A third alternative would be using some other mechanism to inform evsieve to reload their configuration files?

There are signals SIGUSR1 and SIGUSR2 (user-defined signals 1&2) that we're free to do whatever with. See man 7 signal. And there are precedents for using them this way too, for example in /usr/lib/systemd/system/glustereventsd.service of glusterfs (Arch):

[Service]
Environment=PYTHONPATH=/usr/lib/python3.10/site-packages:$PYTHONPATH
Type=simple
ExecStart=/usr/bin/glustereventsd --pid-file /var/run/glustereventsd.pid
ExecReload=/bin/kill -SIGUSR2 $MAINPID
KillMode=control-group
PIDFile=/var/run/glustereventsd.pid

It doesn't break compatibility which is great too.

will this be implemented?

Configuration files by themselves are already implemented, but not stabilized yet (i.e. I might change the format of configuration files without warning). You can use them if you compile evsieve with the config feature enabled:

cargo build --features config

Reloading configuration files is something that has not been implemented yet. I do not have any specific timeline about when I was planning to work on that. (Okay, I did kinda forget that I wanted to stabilize configuration files by version 1.5. Thanks for reminding me.)

Reloading configuration files is something that has not been implemented yet

I'm fine with this feature not existing immediately (or even ever). All I care about is config files being implemented, so I don't need to create wrapper scripts just to pass options to evsieve.

KarsMulder / evsieve