NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

prometheus-alertmanager: files owned by alertmanager:alertmanager are reset to root:root #259435

Closed giorgiga closed 1 year ago

giorgiga commented 1 year ago

Describe the bug

Files configured to be owned by the alertmanager user sometimes get their owner reset to root after nixos-rebuild switch or reboot.

Unfortunately this does not seem to be systematic, except (in my tests) when first enabling the alertmanager service.

Steps To Reproduce

Steps to reproduce the behavior:

  1. With alertmanager enabled, create a file owned by the user alertmanager:
services.prometheus.alertmanager = {
  enable = true;
  configuration = { receivers = [{ name = "x"; }]; route.receiver = "x"; };
};
environment.etc."alertmanager.test" = {
  user = "alertmanager"; group = "alertmanager"; mode = "0400"; text = "x";
};
  2. Run nixos-rebuild switch
  3. Check ls -l /etc/alertmanager.test

Expected behavior

$ ls -l /etc/alertmanager.test
-r-------- 1 alertmanager alertmanager 1 Oct  6 19:03 /etc/alertmanager.test

instead of

$ ls -l /etc/alertmanager.test
-r-------- 1 root root 1 Oct  6 19:03 /etc/alertmanager.test

Notify maintainers

@benley @fpletz @globin @Frostman

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

# nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.1.55, NixOS, 23.11 (Tapir), 23.11.20231006.fea7f3f`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.17.0`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

peterhoeg commented 1 year ago

There are 2 things going on here:

  1. First of all, alertmanager runs with DynamicUser, which means the alertmanager user only exists while the alertmanager service is running.
  2. Second, there isn't much point in placing stuff directly in /etc for alertmanager. When you use the module, it takes the configuration you pass to it, writes it to /tmp in order to perform substitutions, and then launches alertmanager with the output as the config file.

What exactly is the problem you are trying to solve here by placing stuff in /etc?

giorgiga commented 1 year ago

> What exactly is the problem you are trying to solve here by placing stuff in /etc?

What I actually want to do is to give alertmanager access to an agenix secret that contains the SMTP password. I used the /etc file only because, had I posted about agenix, one could think it's an agenix issue (I surely did at first).

peterhoeg commented 1 year ago

The issue as reported (file ownership is wrong) can be closed as that is expected behaviour, right?

> What I actually want to do is to give alertmanager access to an agenix secret that contains the smtp password - the /etc file was just because if I posted about agenix one could think it’s an agenix issue (I surely did at first).

The ugly way to do this:

  1. Use LoadCredential on the systemd unit to bring in the agenix secret (that part is not ugly).
  2. Define your authentication config with placeholders: `smtp_auth_password = @.***_AUTH_PASSWORD@";`
  3. Run an ExecStartPre script that replaces the placeholders with the secrets that LoadCredential makes available.
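
As a rough illustration of the three steps above, here is a minimal sketch on a made-up unit. Everything named in it is an assumption for illustration only: the `example` service, the credential name `smtp_password`, the secret path `/run/agenix/smtp-password`, the `@PASSWORD@` placeholder and the `example-daemon` binary. It does not show how to hook this into the alertmanager module itself.

```nix
{ pkgs, ... }:
let
  # Step 2: a config template that carries a placeholder instead of the secret.
  # The placeholder name @PASSWORD@ is hypothetical.
  template = pkgs.writeText "example-config.yml.in" ''
    smtp_auth_password: "@PASSWORD@"
  '';

  # Step 3: copy the template to a writable location, then replace the
  # placeholder with the contents of the credential file.
  substituteSecrets = pkgs.writeShellScript "substitute-secrets" ''
    set -eu
    ${pkgs.coreutils}/bin/install -m 0600 ${template} "$RUNTIME_DIRECTORY/config.yml"
    ${pkgs.replace-secret}/bin/replace-secret '@PASSWORD@' \
      "$CREDENTIALS_DIRECTORY/smtp_password" \
      "$RUNTIME_DIRECTORY/config.yml"
  '';
in
{
  # Hypothetical unit, not the unit defined by the alertmanager module.
  systemd.services.example = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      DynamicUser = true;
      RuntimeDirectory = "example";
      # Step 1: systemd reads the secret and exposes it to the service under
      # $CREDENTIALS_DIRECTORY. The source path is an assumed agenix location.
      LoadCredential = [ "smtp_password:/run/agenix/smtp-password" ];
      ExecStartPre = "${substituteSecrets}";
      # Placeholder command; %t expands to the runtime directory root (/run).
      ExecStart = "/path/to/example-daemon --config %t/example/config.yml";
    };
  };
}
```

Because systemd itself reads the secret and hands it to the service as a credential, the secret file never has to be owned by the dynamic user.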

giorgiga commented 1 year ago

> The issue as reported (file ownership is wrong) can be closed as that is expected behaviour, right?

I still think most users would expect to be able to grant file permissions to the user alertmanager runs as, so at the very least they should be warned in the documentation (IDK where... maybe services.prometheus.alertmanager.enable?).

As for whether to implement some kind of support for this use case (which I assume is not that uncommon?) or classify this as invalid/wontfix (*)... that's your choice, obviously :)

(*) unfortunately I don't think I know enough nix to be of help yet, so volunteering to provide a PR is not an option

peterhoeg commented 1 year ago

The issue with file ownership has nothing to do with alertmanager or NixOS. This applies to any service run by systemd with `DynamicUser = true`. The user simply exists only while the service is running.

> As for whether to implement some kind of support for this use case (which I assume is not that uncommon?) or classify this as invalid/wontfix (*)… that’s your choice, obviously :)

If by “this use case” you mean `DynamicUser = true`, then we could throw a warning if ownership is set to a non-existent user when `users.mutableUsers = false`. With `mutableUsers = true`, we obviously cannot know what users exist.
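
A rough sketch of what such a warning could look like as a standalone module follows. It is not an existing nixpkgs check. It assumes that `environment.etc.<name>.user` holds either a user name or a `+`-prefixed numeric id, and it only looks at `environment.etc`, not at other places where ownership can be set.

```nix
{ config, lib, ... }:
let
  # Names of all users declared in the configuration.
  declaredUsers = lib.mapAttrsToList (_: u: u.name) config.users.users;

  # An owner beginning with "+" is taken to be a numeric id,
  # so only symbolic user names are checked.
  hasUnknownOwner = _: entry:
    !(lib.hasPrefix "+" entry.user) && !(lib.elem entry.user declaredUsers);

  offending =
    lib.attrNames (lib.filterAttrs hasUnknownOwner config.environment.etc);
in
{
  # Only meaningful when the set of users is fully known at evaluation time.
  warnings = lib.optional
    (!config.users.mutableUsers && offending != [ ])
    "environment.etc entries owned by users not declared in users.users: ${lib.concatStringsSep ", " offending}";
}
```

With the configuration from the original report and `users.mutableUsers = false`, this would flag `alertmanager.test`, since a DynamicUser service does not declare its user in `users.users`.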

If you are referring to an easy way to substitute secrets in files, then there might be some prior art in NixOS. I just don’t know what exactly. The workaround I proposed is just that - a workaround.
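
One candidate for such prior art may be an environment file combined with the substitution step the module already performs on the generated config. The sketch below assumes the module exposes an `environmentFile` option and interpolates `$VARIABLE` references from it; please verify both against the module's option list before relying on this. The secret name `alertmanager-env`, the `.age` file path and the variable name `SMTP_PASSWORD` are made up for the example.

```nix
{ config, ... }:
{
  # Assumed agenix secret. The decrypted file is expected to contain a line
  # of the form: SMTP_PASSWORD=...
  age.secrets.alertmanager-env.file = ./secrets/alertmanager-env.age;

  services.prometheus.alertmanager = {
    enable = true;
    # Assumption: this option exists and its variables are substituted
    # into the config file before alertmanager starts.
    environmentFile = config.age.secrets.alertmanager-env.path;
    configuration = {
      global.smtp_auth_password = "$SMTP_PASSWORD";
      receivers = [{ name = "x"; }];
      route.receiver = "x";
    };
  };
}
```

If this works as assumed, the secret can stay owned by root, which sidesteps the ownership problem with the dynamic user.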