oxidecomputer / omicron

Omicron: Oxide control plane
Mozilla Public License 2.0
244 stars 38 forks source link

Commands sled-agent runs through zlogin are subject to shell expansion #3216

Closed jgallagher closed 1 year ago

jgallagher commented 1 year ago

While investigating why dendrite didn't see a SIGUSR1 when it should've in #3199, @bnaecker noticed this oddity in sled-agent's logs:

02:15:10.006Z WARN SledAgent: Failed to activate switch: Failed to issue SMF command: Failed to do 'del config/address smf property value' by running command in zone: Error running command in zone 'oxz_switch': Command [/usr/sbin/zlogin oxz_switch /usr/sbin/svccfg -s svc:/system/illumos/dendrite delpropvalue config/address *] executed and failed with status: exit status: 1  stdout:   stderr: svccfg: Syntax error.
    sled_id = 96c409e3-a150-46de-9915-25b839ba2508

On its surface this doesn't make any sense:

/usr/sbin/svccfg -s svc:/system/illumos/dendrite delpropvalue config/address *

is perfectly valid syntax, and we use the exact same wrapper to delete properties when configuring MGS. Our SMF wrapper uses std::process::Command, so we should not (ominous sounds) be subject to shell expansion.

However, we are getting shell expansion by virtue of the zlogin. Reproducing is trivial outside of sled-agent:

BRM44220001 # /usr/sbin/zlogin oxz_switch ls -l
total 2
-rw-r--r--   1 root     root           0 Dec 27 18:00 bar
-rw-r--r--   1 root     root           0 Dec 27 18:00 foo
BRM44220001 # echo '*'
*
BRM44220001 # /usr/sbin/zlogin oxz_switch echo '*'
bar foo

This also explains the syntax error we see from svccfg, if there are any files in root's home directory:

BRM44220001 # /usr/sbin/zlogin oxz_switch /usr/sbin/svccfg -s svc:/system/illumos/dendrite delpropvalue config/address '*'
svccfg: Syntax error.
jgallagher commented 1 year ago

FWIW these work as expected, but yuck:

BRM44220001 # /usr/sbin/zlogin oxz_switch echo "'*'"
*
BRM44220001 # /usr/sbin/zlogin oxz_switch echo '\*'
*
bnaecker commented 1 year ago

@citrus-it recommended fetching all values and then deleting them in turn. Definitely safer to disallow globs, though more work.

bnaecker commented 1 year ago

I'm working on this now, and despite the ugliness, it's much more expedient to provide a method that escapes the wildcard character. E.g., that runs svccfg delpropvalue "'*'" in the zone. The alternative requires a bunch of regex munging, since the output of svccfg listprop or svcprop is not very regular. It seems really error-prone to try to parse out the values.

jclulow commented 1 year ago

I suspect another thing we could do is arrange for zone_enter() to be called in the forked child using the pre-exec hook for Command so that the child process we then exec would actually be run in the context of the zone. This is basically what zlogin is doing for us anyway, and would mean there was no shell interposition.

jclulow commented 1 year ago

I have put together a brief demonstration of zone_enter() to create a child process, which shows both that it is inside the zone and that its arguments are as they were passed to Command, not mangled by a shell, etc:

https://github.com/jclulow/junk-zones

 $ pfexec ./target/debug/zones
 * in gz...
    | in zone global
    | 11932  /usr/sbin/sshd
    |   17464  /usr/sbin/sshd -R
    |     17466  /usr/sbin/sshd -R
    |       17467  -bash
    |         18447  ./target/debug/zones
    |           18448  /usr/bin/bash -c echo in zone $(zonename); ptree $$; echo; par
    |             18450  ptree 18448
    |
    | 18448:    /usr/bin/bash -c echo in zone $(zonename); ptree $$; echo; pargs $$; echo -- on
    | argv[0]: /usr/bin/bash
    | argv[1]: -c
    | argv[2]: echo in zone $(zonename); ptree $$; echo; pargs $$; echo
    | argv[3]: --
    | argv[4]: one
    | argv[5]: two three
    | argv[6]: " four "
    |
 * in zone 1...
    | in zone bmat-test
    | 2243   zsched
    |   18452  /usr/bin/bash -c echo in zone $(zonename); ptree $$; echo; pargs $$; e
    |     18454  ptree 18452
    |
    | 18452:    /usr/bin/bash -c echo in zone $(zonename); ptree $$; echo; pargs $$; echo -- on
    | argv[0]: /usr/bin/bash
    | argv[1]: -c
    | argv[2]: echo in zone $(zonename); ptree $$; echo; pargs $$; echo
    | argv[3]: --
    | argv[4]: one
    | argv[5]: two three
    | argv[6]: " four "
    |

from

    let mut cmd = Command::new("/usr/bin/bash");
    cmd.env_clear();
    cmd.arg("-c");
    cmd.arg("echo in zone $(zonename); ptree $$; echo; pargs $$; echo");
    cmd.arg("--");
    cmd.arg("one");
    cmd.arg("two three");
    cmd.arg("\" four \"");

Let me know if I can help!

bnaecker commented 1 year ago

Thanks @jclulow that's a great place to start.

citrus-it commented 1 year ago

Robert mentioned that there is an environment variable to control the path to the door that we could already use to work around this quickly:

BRM42220014 # LIBSCF_DOORPATH=/zone/oxz_switch/root/etc/svc/volatile/repository_door \
    svccfg -s dendrite listprop config/address
config/address  astring  "[fd00:1122:3344:103::2]:12224" "[::1]:12224"
citrus-it commented 1 year ago

I have an illumos change up for review that adds -z <zone> to svccfg - https://code.illumos.org/c/illumos-gate/+/2888 It should hopefully be available in stlouis soon.

jclulow commented 1 year ago

Does the code for using an alternative door path correctly avoid following a potentially malicious symlink inside the zone?

citrus-it commented 1 year ago

Does the code for using an alternative door path correctly avoid following a potentially malicious symlink inside the zone?

Nope, and that is https://www.illumos.org/issues/15727