Execution Model - Githubissues

bluejekyll commented 4 years ago

What should the execution model for supervised processes be?

Some minimal features

- All environment variables are white-list only, with some default set that is always passed on (list TBD).
- We need a common platform sandboxing technique. chroot would be a minimum, with Linux cgroups, BSD jails, macOS sandboxes, and Windows Containers being good targets.
  - Will we want to allow for any configuration of this?
  - Should we support something like AppArmor (is there an equivalent on Windows and macOS)?
- Common management strategies for auto restart, capture of stdout and stderr for logs, etc. would all be MVP requirements.
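A minimal sketch of the white-list idea using only the Rust standard library. The names `DEFAULT_ALLOWED`, `filter_env`, and `run_with_allow_list` are hypothetical, and the default list here is a placeholder since the real one is still TBD. It clears the child's environment, adds back only the listed variables, and captures stdout and stderr.

```rust
use std::collections::HashMap;
use std::io;
use std::process::{Command, Output, Stdio};

/// Placeholder default list (an assumption; the real list is TBD).
pub const DEFAULT_ALLOWED: &[&str] = &["PATH", "LANG", "TZ"];

/// Keep only the variables whose names appear in `allowed`.
pub fn filter_env<I>(vars: I, allowed: &[&str]) -> HashMap<String, String>
where
    I: IntoIterator<Item = (String, String)>,
{
    vars.into_iter()
        .filter(|(name, _)| allowed.contains(&name.as_str()))
        .collect()
}

/// Run `program` with a scrubbed environment, capturing stdout and stderr.
pub fn run_with_allow_list(
    program: &str,
    args: &[&str],
    allowed: &[&str],
) -> io::Result<Output> {
    // Skip variables that are not valid UTF-8 instead of panicking on them.
    let inherited = std::env::vars_os()
        .filter_map(|(k, v)| Some((k.into_string().ok()?, v.into_string().ok()?)));
    let env = filter_env(inherited, allowed);

    Command::new(program)
        .args(args)
        .env_clear() // start from an empty environment
        .envs(&env) // add back only the white-listed variables
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .output()
}
```

Something like `run_with_allow_list("/usr/bin/env", &[], DEFAULT_ALLOWED)` would then show the child seeing only the listed variables. Sandboxing and auto restart are not covered here.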

What is the process model we should follow? To answer this question let's see what we want to be true

1) It should not be possible for a child process to gain access to a parent process.
2) A parent process should not be able to do more than start, restart, or shut down the child (i.e. it shouldn't have any way to access the child's memory).

bluejekyll commented 4 years ago

To answer 1 and 2 I think a fork exec model where the supervisor (process monitoring the actual executable) uses some form of IPC (platform agnostic) to talk to the vermilion init process is the best way to manage this (of course, need to be careful here).

This can just be a fork of the primary process, possibly... and still a single process (for the simple case where there is only one process being monitored by vermilion).

tarcieri commented 4 years ago

Perhaps it would be worth defining a security model as a prerequisite of this

bluejekyll commented 4 years ago

Agree, though right now I think it's difficult to consider what that should be exactly as I don't know all the features we want.

I think it's clear that we want a certain amount of separation, e.g., popping a child process should never expose the parent process. No-read-down and no-write-up is basically what I'm going for. Do you have something more formal you'd like to propose?

tarcieri commented 4 years ago

POSIX heavily relies on ambient authority for access control, and this is generally reflected in init systems, which are effectively a "god mode" process running as root.

An open question is: can the process running as root be reduced to as little authority as possible? What are its responsibilities? How is access controlled?

This approach can be thought of as similar to a microkernel architecture, which reduces the responsibilities of the kernel as much as possible, and farms them out to lower authority processes. Systems like Google's Fuchsia and seL4 use an object capability (a.k.a. OCap) model to restrict that authority.

In an init system, that'd be reflected by something like the minimum viable parent process running as root, which initializes a set of least authority agents running under separate UIDs which are each conferred authority to perform particular tasks of the init system (e.g. loading the policy from configuration files and making access control decisions based on it, receiving system events and determining a course of action, systemd-style miscellaneous junk drawer functionality or perhaps things that might be actually interesting like secrets management).

Each of these lower-authority processes could be given "ocaps" to invoke some restricted subset of the functionality the root supervisor process is capable of, such as requesting a process be spawned with a particular UID/GID.
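To make the "ocaps" idea a bit more concrete, here is a rough in-process sketch. All names (`ocap`, `RootSupervisor`, `SpawnCap`, `SpawnPlan`) are hypothetical, and in a real init system the capability would more likely be a file descriptor or channel held by a separate process rather than a Rust value. The point is only that the holder of a `SpawnCap` can ask for a spawn under the one UID/GID it was granted and nothing else.

```rust
mod ocap {
    /// What the root supervisor would do on behalf of a capability holder.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct SpawnPlan {
        pub program: String,
        pub uid: u32,
        pub gid: u32,
    }

    /// Capability to request spawns under one fixed UID/GID pair.
    /// The fields are private, so code outside this module cannot forge
    /// one or change the IDs it carries.
    pub struct SpawnCap {
        uid: u32,
        gid: u32,
    }

    /// Stand-in for the minimal root process.
    pub struct RootSupervisor {
        _private: (),
    }

    impl RootSupervisor {
        pub fn new() -> Self {
            RootSupervisor { _private: () }
        }

        /// Only the root supervisor can mint capabilities.
        pub fn grant_spawn(&self, uid: u32, gid: u32) -> SpawnCap {
            SpawnCap { uid, gid }
        }

        /// The caller chooses the program; the UID/GID come from the
        /// capability, never from the caller.
        pub fn plan_spawn(&self, cap: &SpawnCap, program: &str) -> SpawnPlan {
            SpawnPlan {
                program: program.to_string(),
                uid: cap.uid,
                gid: cap.gid,
            }
        }
    }
}
```

An agent that was handed `root.grant_spawn(1000, 1000)` has no way to turn that into a request for UID 0, which is the property being described.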

ratmice commented 4 years ago

I just wanted to chime in that I am not sure how to attempt object capability style boot loading in a POSIX environment; the kernel hand-off mechanism is typically very different in an object capability system.

I'm not familiar with Fuchsia in particular, but seL4's root process would seem to have more in common with initrd (perhaps without the filesystem stuff, and adding in a process).

But this is more apparent when looking at systems similar to seL4 which have built-in persistence and checkpointing. I've linked below a document describing one such system, CapROS, and will be referring to it.

capros boot

In the Overview section, Step 3 here reflects the seL4 root process, and there is no built in mechanism for step 4. The root process would in general be in charge of step 4. By the time init runs in a POSIX system however we are already at step 4, having mounted the filesystems.

That is to say, in Linux at least (I haven't looked but don't imagine POSIX would specify internal pre-boot behavior like initrd), initrd is the closest thing resembling the seL4 root process, or what capros above calls "the IPL process", which stands for "Initial Process Loader" process.

I don't know if this helps, but I don't really have any ideas on how one would go about reconciling the differences in the kernel hand-off.

bluejekyll commented 4 years ago

Yes, I think we're on the same page. I think one of the challenges we face, and maybe we should decide this upfront, is how cross-platform we want this to be.

> an object capability (a.k.a. OCap) model to restrict that authority.

I like this, though I don't think I want to expose this as configurable options, i.e. we want to enforce the boundaries. I do think much of this derives from a no-read-down, no-write-up policy. For reference, the Biba Model. @ratmice, I share your concern about being overambitious and not being able to target certain OSes. I'd like to support all major OSes.
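For reference, the two Biba rules can be written down in a few lines. This is only a sketch of the textbook rules; the three integrity levels are hypothetical names picked for illustration, and the policy vermilion ends up with may well be stricter (for example, also denying reads upward).

```rust
/// Integrity levels from lowest to highest (hypothetical names).
/// The derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Integrity {
    Managed,
    Supervisor,
    Init,
}

/// No read down: a subject may only read objects at its own
/// integrity level or higher.
pub fn may_read(subject: Integrity, object: Integrity) -> bool {
    subject <= object
}

/// No write up: a subject may only write objects at its own
/// integrity level or lower.
pub fn may_write(subject: Integrity, object: Integrity) -> bool {
    subject >= object
}
```

Under these rules `may_write(Integrity::Managed, Integrity::Init)` is false, which matches "popping a child process should never expose the parent process", and `may_read(Integrity::Init, Integrity::Managed)` is false, which matches the parent having no access to the child's memory.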

> Each of these lower-authority processes could be given "ocaps" to invoke some restricted subset of the functionality the root supervisor process is capable of, such as requesting a process be spawned with a particular UID/GID.

I was thinking that the supervisor process would be spawned with the uid/gid of the target process, names to be chosen. I'm thinking of four entities at this point:

1) Leader (builds graph and launches supervisors, privileged vermilion to issue commands to Launcher)
2) Launcher (launches all processes, only listens to Master, also launches Master?)
3) Supervisors (execs target process, manages runtime, w/target uid/gid)
   a) Managed process (sibling to supervisor)
4) Logger (usermode, only accepts input, runs with only privileges to write to log dir)
5) IPC (not sure what this is right now, but I'm thinking we want to rely on kernel provided functions, maybe?)

I can work up a drawing if it helps our conversation. For purposes of initial MVP, I think we'd start by building the supervisor.

bluejekyll commented 4 years ago

Actually, we might be able to introduce an additional component, let's call it the Launcher, that is only responsible for launching the Supervisors and setting uid/gid. The Master would be the only thing capable of issuing that command. (adding to above comment)
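A rough, Unix-only sketch of what the Launcher's one job could look like with the standard library. The `Role` enum, `Denied`, and `build_supervisor_command` are hypothetical names, and `Role::Leader` stands for the entity called Master in this comment. The in-process role check is only a stand-in for whatever real IPC authentication ends up being used. `uid` and `gid` come from `std::os::unix::process::CommandExt`, and actually spawning with a different UID requires the Launcher to have the privilege to do so.

```rust
use std::os::unix::process::CommandExt; // Unix-only: provides `uid` and `gid`
use std::process::Command;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Leader,
    Launcher,
    Supervisor,
    Logger,
}

/// Returned when a role other than the Leader asks for a launch.
#[derive(Debug, PartialEq, Eq)]
pub struct Denied(pub Role);

/// Build (but do not spawn) the command for a supervisor that will run
/// `target` under the given uid/gid. Only the Leader may request this.
pub fn build_supervisor_command(
    sender: Role,
    supervisor_bin: &str,
    target: &str,
    uid: u32,
    gid: u32,
) -> Result<Command, Denied> {
    if sender != Role::Leader {
        return Err(Denied(sender));
    }
    let mut cmd = Command::new(supervisor_bin);
    // The child switches to these IDs before exec when the command is spawned.
    cmd.arg(target).uid(uid).gid(gid);
    Ok(cmd)
}
```

The Launcher would then call `spawn()` on the returned command. Keeping the build step separate from the spawn makes the authorization decision easy to test without root.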

tarcieri commented 4 years ago

@bluejekyll sounds great, those are exactly the sort of capability/security boundary separations I was hoping to see

bluejekyll commented 4 years ago

I've been researching what we have available in Rust at the moment. I have a goal of making this all async I/O, as I think that would make this stuff fairly lightweight, but as far as I can tell there is a lot less existing work in this area with Tokio, et al. I'm still trying to figure out the MVP tools necessary. It does look like Tokio has pipe support, though; still researching: https://doc.rust-lang.org/nightly/std/process/struct.Stdio.html#method.piped
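For comparison, this is roughly what pipe capture looks like with the blocking standard library API behind that link. It is a sketch only and not async; `forward_lines` and `run_and_log` are hypothetical names, and the `[tag] line` output format is made up for illustration.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};
use std::process::{Command, ExitStatus, Stdio};

/// Copy `source` to `sink` line by line, prefixing each line with `tag`.
/// Returns the number of lines forwarded. Fails on invalid UTF-8.
pub fn forward_lines<R: Read, W: Write>(
    source: R,
    mut sink: W,
    tag: &str,
) -> io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(source).lines() {
        writeln!(sink, "[{}] {}", tag, line?)?;
        count += 1;
    }
    Ok(count)
}

/// Spawn `program`, forward its stdout into `sink`, then wait for it to exit.
pub fn run_and_log<W: Write>(
    program: &str,
    args: &[&str],
    sink: W,
) -> io::Result<ExitStatus> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .spawn()?;
    // `stdout` is `Some` because a pipe was requested above.
    let stdout = child.stdout.take().expect("stdout was piped");
    forward_lines(stdout, sink, program)?;
    child.wait()
}
```

Because `forward_lines` takes any `Read`, the same function works on a child's stdout pipe and on an in-memory buffer, which keeps it testable without spawning anything.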

We need to consider how FDs will be passed between processes. This is a good post on an older strategy that uses sendmsg and recvmsg to pass file descriptors to processes after forking: https://sumitomohiko.wordpress.com/2015/09/24/file-descriptor-passing-with-sendmsg2-and-recvmsg2-over-unix-domain-socket/

At a minimum, I think this is necessary for passing stdout/stderr from the monitored process to a logger.

bluejekyll commented 4 years ago

Ok, early proof of concept: #12

bluejekyll commented 4 years ago

For Windows support (non-primary goal) we will need to evaluate how to pass file “handles” between processes: https://lackingrhoticity.blogspot.com/2015/05/passing-fds-handles-between-processes.html?m=1

tarcieri commented 4 years ago

My honest thought on Windows support is: don't. There's a lot of things that won't be possible with a cross Windows/POSIX abstraction.

bluejekyll commented 4 years ago

Is there anything that jumps to mind off the bat? At the moment I can see at least file_handle and file_descriptor hand-off would be roughly of equivalent capabilities.

I can imagine it will get more hairy on uid, gid, and permissions.

tarcieri commented 4 years ago

POSIX user and file permissions and things like ulimits immediately jump to mind

bluejekyll / vermilionrc

Execution Model #2
