NixOS / nixpkgs

Nix Packages collection & NixOS

Unfirewalled Elixir services potentially vulnerable to RCE due to surprising shell scripts #130244

Open nh2 opened 3 years ago

nh2 commented 3 years ago

I suspect I have found a potential security issue with Elixir-based services that run on NixOS with networking.firewall.enable = false. The firewall is on by default, so the impact may be low. I also have only a superficial understanding of Elixir and Erlang, so I'd like others to check whether the below is true.

Background

Elixir runs on the Erlang VM. Erlang was made for distributed computing, so by default it listens on all network interfaces for control messages between Erlang nodes. These messages are capable of remote code execution; they need to be protected by a password ("cookie") as described on https://erlang.org/doc/reference_manual/distributed.html#security.

(The fact that it listens on all interfaces is a bad default for security; users of multi-node Erlang should configure it explicitly and configure security accordingly. NixOS sets the default to localhost for most other networked services, see https://github.com/NixOS/nixpkgs/issues/100192#issuecomment-706633913.)

The Elixir Mix build tool is used (generally, and in NixOS) to build Elixir software. It has a function to generate a random COOKIE (https://hexdocs.pm/mix/Mix.Tasks.Release.html#module-options):

:cookie - a string representing the Erlang Distribution cookie. If this option is not set, a random cookie is written to the releases/COOKIE file when the first release is assembled.

In nixpkgs builds, releases/COOKIE does not exist: the build deletes it, because a password that is reproducibly generated and publicly known is ineffective:

https://github.com/NixOS/nixpkgs/blob/fe653f953d3a1f09ef6888f0ecc3490b2f2a0216/pkgs/development/beam-modules/mix-release.nix#L94-L100

Problem

Now to the issue.

The Mix build tool generates startup shell scripts for built Elixir programs, which read the releases/COOKIE file. If it does not exist, the startup shell script is supposed to fail with an error, in code:

set -e
# ...
export RELEASE_COOKIE="${RELEASE_COOKIE:-"$(cat "$RELEASE_ROOT/releases/COOKIE")"}"

The set -e is intended to make the script fail if the file cannot be read.

Now, the problem is that 99% of shell scripts are broken. Even regular shell users do not generally understand shell, because the language has a myriad of exceptions and poor defaults (fail-silently, keep-going-past-critical-errors).

set -e is ineffective here because of the 9th exception of when set -e does not work as expected:

The export builtin masks the failure: the exit status of the whole line is that of export itself (success), so the exit code of the command substitution is swallowed.

The code

set -e
RELEASE_COOKIE="${RELEASE_COOKIE:-"$(cat "$RELEASE_ROOT/releases/COOKIE")"}"
export RELEASE_COOKIE

would have worked, but alas, doing export MYVAR="$(...)" on a single line, as around 99% of shell users write, is wrong for the intended semantics.

As a result, cat .../COOKIE will fail as the file does not exist, the $(...) shell substitution will produce the empty string, the shell script will continue past the intended-to-be-fatal error, and thus RELEASE_COOKIE will be the empty string.

The password to the Erlang inter-node remote code execution will be the empty string.

If the machine is not firewalled, this is bad.
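The difference between the two forms can be reproduced in isolation. The following is a minimal sketch, not taken from the Mix-generated script: the path /nonexistent/releases/COOKIE and the variable name COOKIE are placeholders, and each variant runs in its own sh -c child so that an abort does not take the surrounding shell down.

```shell
#!/bin/sh
# Placeholder path that is assumed not to exist on this machine.
missing=/nonexistent/releases/COOKIE

# Variant 1: export and command substitution on one line.
# The line's exit status is that of `export` (0), so `set -e` does not fire.
combined=$(sh -c '
  set -e
  export COOKIE="$(cat "$1" 2>/dev/null)"
  echo "reached, COOKIE=[$COOKIE]"
' sh "$missing") || true

# Variant 2: plain assignment first, export afterwards.
# A bare assignment takes the exit status of its command substitution,
# so `set -e` aborts the child shell before the echo.
split=$(sh -c '
  set -e
  COOKIE="$(cat "$1" 2>/dev/null)"
  export COOKIE
  echo "reached, COOKIE=[$COOKIE]"
' sh "$missing") || true

echo "combined: ${combined:-<script aborted>}"
echo "split:    ${split:-<script aborted>}"
```

With a POSIX sh, the first variant should print the "reached" line with an empty cookie, while the second should abort before reaching it.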

CC @happysalada @dlesl @LnL7 from nixpkgs Elixir history.

happysalada commented 3 years ago

Nice catch!

We could potentially force people to set a RELEASE_COOKIE, but since we don't have a way to set "proper" secrets then a nix assertion is probably not the best here. It seems to me a good solution would be to enable upstream applications to read the release cookie from a file. Then just pass the file location through an environment variable. The plausible guys are pretty open, and if you tell them it's a security issue, I'm sure they would be up to merging it.

Regarding configuring the listening address, I'm not sure how easy that will be. From what I know, every release starts epmd, which is the thing responsible for connecting the nodes and coordinating. It runs on port 4369 (from memory). If that port is not open, then nodes won't be able to connect together (epmd is just marked as a zombie process). If you wanted, you could specify -no_epmd inside rel/vm.args.eex (https://github.com/plausible/analytics/blob/master/rel/vm.args.eex; see https://erlang.org/doc/man/erl.html for the flag) and it wouldn't start epmd, disabling the distributed aspect. That would belong upstream though.

nh2 commented 3 years ago

PR to improve plausible, an Elixir service, at #130297

nh2 commented 3 years ago

> It seems to me a good solution would be to enable upstream applications to read the release cookie from a file.

This capability already exists for all Elixir Mix apps, in the code I quoted:

RELEASE_COOKIE="${RELEASE_COOKIE:-"$(cat "$RELEASE_ROOT/releases/COOKIE")"}"

This allows passing in the cookie as an environment variable RELEASE_COOKIE, which you could set based on the contents of a file, e.g. using systemd's EnvironmentFile functionality.

I don't think that is (or should be) Plausible-specific.
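For setups without systemd, the same idea can be sketched as a small wrapper step that runs before the release's start script. This is only an illustration under assumptions: the function name load_cookie and the idea of a separately provisioned cookie file are hypothetical and not part of Mix or nixpkgs. It deliberately assigns first and exports afterwards, and it rejects a missing or empty file instead of silently continuing.

```shell
#!/bin/sh
# Hypothetical helper: read the Erlang cookie from a file into RELEASE_COOKIE.
# Returns non-zero (and exports nothing new) if the file is unreadable or empty.
load_cookie() {
  file=$1
  if [ ! -r "$file" ]; then
    echo "cookie file '$file' is missing or unreadable" >&2
    return 1
  fi
  # Assign first so that a failing cat is not masked by export.
  cookie=$(cat "$file") || return 1
  if [ -z "$cookie" ]; then
    echo "cookie file '$file' is empty" >&2
    return 1
  fi
  RELEASE_COOKIE=$cookie
  export RELEASE_COOKIE
}
```

A caller would then run something like load_cookie /path/to/cookie || exit 1 before starting the release, where the path is whatever location the secret was deployed to.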

> We could potentially force people to set a RELEASE_COOKIE

I wouldn't do that universally, because as described in the PR comment https://github.com/NixOS/nixpkgs/pull/130297#issuecomment-881066229:

> Most people will just want to run Elixir apps single-machine, instead of on a multi-node Erlang cluster, so I think making it listen to localhost, and not bothering them with Erlang cluster cookies, is likely the better default in any case.

It would be good however to force people to set their own RELEASE_COOKIE if the NixOS modules notice that Erlang is not configured to listen to 127.0.0.1 only, that is, when multi-node Erlang is actually used.

> Regarding configuring the listening address, I'm not sure how easy that will be.

It turned out to not be so difficult, see comment https://github.com/NixOS/nixpkgs/issues/130244#issuecomment-880916306 above (I only read your comment after I posted that).

happysalada commented 3 years ago

How about using RELEASE_DISTRIBUTION and defaulting to none? It seems to me the "clearer" way to disable the distributed functionality (https://hexdocs.pm/mix/Mix.Tasks.Release.html#module-environment-variables). Adding a LISTEN_IP is a little less transparent. Also, I'm not sure upstream would be interested in a LISTEN_IP configuration (I could be wrong). Since this is essentially a nix problem (because we delete the RELEASE_COOKIE), it makes sense to me to solve it here in nixpkgs.

nh2 commented 3 years ago

> Since this is essentially a nix problem (because we delete the RELEASE_COOKIE), it makes sense to me to solve it here in nixpkgs.

I just want to highlight that this isn't a nixpkgs-specific problem. Any packaging system has it (e.g. Debian, Ubuntu, binary releases, etc.), because passwords that need to be per-user cannot be packaged.

> Adding a LISTEN_IP is a little less transparent.

@happysalada There's a bit of confusion: The LISTEN_IP I'm adding is for the Plausible HTTP web server. It has nothing to do with the Erlang platform ports, and is not controlled by their settings.

(It just happens to be that that port also belongs to the beam binary when listed in e.g. netstat.)

> How about using RELEASE_DISTRIBUTION and defaulting to none? It seems to me the "clearer" way to disable the distributed functionality

That's a great idea for fully disabling it, because with that, the distribution daemons don't have any ports listening on any interfaces in the first place. However, we likely still want to provide the erlang.vmListenAddress and erlang.epmdListenAddress settings for people who do want to run distributed Erlang, but in a secure fashion (e.g. making it listen only on VPN interfaces such as those of WireGuard or tinc, or on internal network interfaces).
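The two ideas from this thread (default RELEASE_DISTRIBUTION to none, and insist on an explicit cookie once distribution is turned on) could be combined in a pre-start guard. The sketch below is hypothetical: check_distribution is not an existing Mix or nixpkgs function, and it only assumes the values name, sname and none that the Mix release documentation lists for RELEASE_DISTRIBUTION.

```shell
#!/bin/sh
# Hypothetical pre-start guard, not part of Mix or nixpkgs.
check_distribution() {
  # Default to "none" when the variable is unset or empty.
  : "${RELEASE_DISTRIBUTION:=none}"
  export RELEASE_DISTRIBUTION

  case $RELEASE_DISTRIBUTION in
    none)
      # Distribution is off, so no cookie is needed.
      return 0
      ;;
    name|sname)
      # Distribution is on: refuse to start with an empty cookie.
      if [ -z "${RELEASE_COOKIE:-}" ]; then
        echo "RELEASE_DISTRIBUTION=$RELEASE_DISTRIBUTION requires a non-empty RELEASE_COOKIE" >&2
        return 1
      fi
      ;;
    *)
      echo "unknown RELEASE_DISTRIBUTION value: $RELEASE_DISTRIBUTION" >&2
      return 1
      ;;
  esac
}
```

Whether such a check belongs in a wrapper script or in a NixOS module assertion is a design choice that this sketch leaves open.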

I've verified the effect of RELEASE_DISTRIBUTION. In the below, 8000 is the Plausible HTTP web server port:

Default:

# netstat -antp | grep -E 'beam|epmd' | grep LISTEN   
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      304017/beam.smp
tcp        0      0 0.0.0.0:44841           0.0.0.0:*               LISTEN      304017/beam.smp
tcp        0      0 0.0.0.0:4369            0.0.0.0:*               LISTEN      304056/epmd
tcp6       0      0 :::4369                 :::*                    LISTEN      304056/epmd

With inet_dist_use_interface and ERL_EPMD_ADDRESS set:

# netstat -antp | grep LISTEN | grep -E 'beam|epmd'   
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      303384/beam.smp
tcp        0      0 127.0.0.1:40777         0.0.0.0:*               LISTEN      303384/beam.smp
tcp        0      0 127.0.0.1:4369          0.0.0.0:*               LISTEN      303423/epmd
tcp6       0      0 ::1:4369                :::*                    LISTEN      303423/epmd

With RELEASE_DISTRIBUTION=none:

# netstat -antp | grep -E 'beam|epmd' | grep LISTEN 
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      304624/beam.smp
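The manual comparison of these listings can be automated with a small filter. This is a rough sketch under assumptions: exposed_listeners is a made-up helper name, it relies on the netstat -antp column layout shown above (local address in column 4, state in column 6, PID/program in column 7), and it treats only 127.0.0.1 and ::1 as loopback. The sample input is the listing from the inet_dist_use_interface case.

```shell
#!/bin/sh
# Hypothetical helper: print LISTEN sockets owned by beam or epmd that are
# not bound to a loopback address. Reads `netstat -antp` style lines on stdin.
exposed_listeners() {
  awk '$6 == "LISTEN" && $7 ~ /(beam|epmd)/ {
         addr = $4
         sub(/:[0-9]+$/, "", addr)   # strip the trailing :port
         if (addr != "127.0.0.1" && addr != "::1") print $4, $7
       }'
}

# Sample input copied from the second listing in this comment.
sample='tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      303384/beam.smp
tcp        0      0 127.0.0.1:40777         0.0.0.0:*               LISTEN      303384/beam.smp
tcp        0      0 127.0.0.1:4369          0.0.0.0:*               LISTEN      303423/epmd
tcp6       0      0 ::1:4369                :::*                    LISTEN      303423/epmd'

printf '%s\n' "$sample" | exposed_listeners
```

On a live machine the input would come from netstat -antp instead of the sample. For the sample, only the 0.0.0.0:8000 HTTP listener should remain.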

I have implemented your idea for plausible in commit https://github.com/NixOS/nixpkgs/pull/130297/commits/533a870620b32ef14400f3e07626bc19b3dcb1af.

nh2 commented 3 years ago

Subscribing @Ma27 from https://github.com/NixOS/nixpkgs/pull/130297#issuecomment-884593973