[Examples] Add sanitize-args-then-execve example for Python

dimakuv commented 3 years ago

Description of the problem

Sometimes people don't want to hard-code command-line arguments and environment variables via loader.env_src_file and loader.argv_src_file. What they want is to allow arbitrary arguments/envvars but have a way to sanitize them.

With Graphene, we can do a simple trick to achieve this. We create a tiny "premain" program that does the following:

int main(int argc, char** argv) {
    <sanitize argv: add definitely-needed arguments and drop unknown arguments>
    <sanitize environ: add definitely-needed envvars and drop unknown envvars>
    execve("actual-program-to-run", <sanitized argv>, <sanitized environ>);
}

Then in the Graphene manifest file, the libos.entrypoint points to this tiny program.
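
A minimal sketch of the filtering half of such a premain, assuming an exact-match allowlist for arguments and a name allowlist for envvars. The allowlist contents and the names filter_vector and premain_exec are hypothetical, chosen only for illustration; adding "definitely-needed" entries is left out to keep it short.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical allowlists; a real deployment would choose its own. */
static const char *const ALLOWED_ARGS[] = { "--help", "--version", NULL };
static const char *const ALLOWED_ENV_NAMES[] = { "LANG", "TZ", NULL };

/* Returns 1 if arg exactly matches an allowlisted argument. */
static int arg_is_allowed(const char *arg) {
    for (size_t i = 0; ALLOWED_ARGS[i] != NULL; i++)
        if (strcmp(arg, ALLOWED_ARGS[i]) == 0)
            return 1;
    return 0;
}

/* Returns 1 if entry has the form NAME=value with NAME in the allowlist. */
static int env_is_allowed(const char *entry) {
    const char *eq = strchr(entry, '=');
    if (eq == NULL)
        return 0;
    size_t name_len = (size_t)(eq - entry);
    for (size_t i = 0; ALLOWED_ENV_NAMES[i] != NULL; i++)
        if (strlen(ALLOWED_ENV_NAMES[i]) == name_len &&
            strncmp(entry, ALLOWED_ENV_NAMES[i], name_len) == 0)
            return 1;
    return 0;
}

/* Compacts a NULL-terminated vector in place: entries before `start` are
 * kept as-is, later entries are kept only if `keep` accepts them.
 * Returns the new number of entries. */
static size_t filter_vector(char **vec, size_t start,
                            int (*keep)(const char *)) {
    size_t out = start;
    for (size_t in = start; vec[in] != NULL; in++)
        if (keep(vec[in]))
            vec[out++] = vec[in];
    vec[out] = NULL;
    return out;
}

/* Filters argv (keeping argv[0]) and envp, then replaces the process image.
 * Returns only if argv is empty or execve fails. */
int premain_exec(const char *program, char **argv, char **envp) {
    if (argv[0] == NULL)
        return 1; /* refuse an empty argv */
    filter_vector(argv, 1, arg_is_allowed);
    filter_vector(envp, 0, env_is_allowed);
    execve(program, argv, envp);
    return 1; /* execve returns only on error */
}
```

A premain's main() could then simply return premain_exec("actual-program-to-run", argv, environ).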

Marblerun uses this trick exactly: https://github.com/edgelesssys/marblerun/blob/master/cmd/premain-graphene/main.go

This trick is also (partially) described e.g. here:

Solution

We can add an example to, say, Python-simple where we do this "premain" sanitization. We would remove all envvars except the Python-related ones. We would also leave only Python command line options like -c and remove all others. We would also forcibly add -B (to avoid *.pyc files).

This will become a nice example we can point people to when they want to do arbitrary sanitization of args/envvars.
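
A rough sketch of the Python-specific part, assuming "Python-related" means envvars whose names start with PYTHON. The function names keep_python_env and build_python_argv are hypothetical. Note that a bare prefix match also lets through variables such as PYTHONPATH, which influence which code gets imported, so an exact-name allowlist would likely be safer.

```c
#include <stddef.h>
#include <string.h>

/* Keeps only NAME=value entries whose NAME starts with "PYTHON".
 * Compacts envp in place and returns the new number of entries. */
size_t keep_python_env(char **envp) {
    size_t out = 0;
    for (size_t in = 0; envp[in] != NULL; in++)
        if (strncmp(envp[in], "PYTHON", 6) == 0 &&
            strchr(envp[in], '=') != NULL)
            envp[out++] = envp[in];
    envp[out] = NULL;
    return out;
}

/* Builds "<interpreter> -B <rest...>" into out, which has room for
 * out_cap pointers including the terminating NULL.
 * Returns 0 on success, -1 if out is too small. */
int build_python_argv(char *interpreter, char **rest,
                      char **out, size_t out_cap) {
    static char no_bytecode[] = "-B"; /* do not write *.pyc files */
    size_t n = 0;
    while (rest[n] != NULL)
        n++;
    if (n + 3 > out_cap)
        return -1;
    out[0] = interpreter;
    out[1] = no_bytecode;
    for (size_t i = 0; i < n; i++)
        out[2 + i] = rest[i];
    out[2 + n] = NULL;
    return 0;
}
```

Here rest is expected to hold only arguments that have already passed whatever argument check the premain applies.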

mkow commented 3 years ago

In general, I believe argv and envp sanitization can only be reasonably secure if done using whitelisting. And overall I'd discourage our users from sanitizing the arguments; they should rather just provide them from a trusted source (using protected argv).

> We would also leave only Python command line options like -c

Doesn't this option allow arbitrary code execution?

dimakuv commented 3 years ago

> And overall I'd discourage our users from sanitizing the arguments, they should rather just provide them from a trusted source (using protected argv).

Sometimes there are legit reasons to use such a "sanitization" approach. Sometimes you have an app that can e.g. show some non-private info based on an argument --info, and there are no other allowed arguments. So instead of having two argv files, you can just do this trick.
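
For the --info case, one possible shape is to reject any unexpected command line outright instead of filtering it. This is only a sketch; the function name cmdline_is_permitted is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 when the command line is either just the program name
 * or the program name followed by exactly one "--info" argument.
 * Everything else is rejected. */
int cmdline_is_permitted(int argc, char **argv) {
    if (argc == 1)
        return 1;
    if (argc == 2 && argv[1] != NULL && strcmp(argv[1], "--info") == 0)
        return 1;
    return 0;
}
```

The premain would call this first and exit with an error when it returns 0, so the real app never sees an argument outside the two permitted forms.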

> We would also leave only Python command line options like -c

> Doesn't this option allow arbitrary code execution?

Oops, I meant python3 scripts/helloworld.py | scripts/fibonacci.py.

mkow commented 3 years ago

> And overall I'd discourage our users from sanitizing the arguments, they should rather just provide them from a trusted source (using protected argv).

> Sometimes there are legit reasons to use such a "sanitization" approach. Sometimes you have an app that can e.g. show some non-private info based on an argument --info, and there are no other allowed arguments. So instead of having two argv files, you can just do this trick.

Yup. But I would still be careful about suggesting this solution to users, and would only recommend it if they actually can't use other, safer approaches.

> Oops, I meant python3 scripts/helloworld.py | scripts/fibonacci.py.

This is still very hard to secure. The path may point to some allowed directory, or even worse, to /dev/stdin or similar.
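
One way to narrow this down, reading the "|" above as "either of these two scripts", is to compare the script argument against a fixed list of exact strings rather than checking a directory prefix. This is a sketch under that assumption; script_is_allowed is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* The two scripts named in the discussion above. */
static const char *const ALLOWED_SCRIPTS[] = {
    "scripts/helloworld.py",
    "scripts/fibonacci.py",
    NULL
};

/* Returns 1 only if path is byte-for-byte equal to an allowed script.
 * No prefix matching and no path normalization is attempted, so
 * "/dev/stdin", "scripts/" and "scripts/../x.py" are all rejected. */
int script_is_allowed(const char *path) {
    if (path == NULL)
        return 0;
    for (size_t i = 0; ALLOWED_SCRIPTS[i] != NULL; i++)
        if (strcmp(path, ALLOWED_SCRIPTS[i]) == 0)
            return 1;
    return 0;
}
```

This only constrains the string that is passed on. It says nothing about the contents of the file at that path, so the scripts themselves would still need to come from a trusted source.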

gramineproject / graphene

[Examples] Add sanitize-args-then-execve example for Python #2347
