python-discord / snekbox

Easy, safe evaluation of arbitrary Python code
https://pythondiscord.com
MIT License


snekbox

Python sandbox runners for executing code in isolation, aka snekbox.

Supports an in-memory read/write file system within the sandbox, allowing text or binary files to be sent and returned.

A client sends Python code to a snekbox, the snekbox executes the code, and finally the results of the execution are returned to the client.

%%{init: { 'sequence': {'mirrorActors': false, 'messageFontWeight': 300, 'actorFontFamily': '-apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif' } } }%%
sequenceDiagram

actor Client
participant Snekbox
participant NsJail
participant Python as Python Subprocess

Client ->>+ Snekbox: HTTP POST
Snekbox ->>+ NsJail: Python code
NsJail ->>+ Python: Python code
Python -->>- NsJail: Execution result
NsJail -->>- Snekbox: Execution result
Snekbox -->>- Client: JSON response

The code is executed in a Python process that is launched through NsJail, which is responsible for sandboxing the Python process.

The output returned by snekbox is truncated at around 1 MB by default, but this can be configured.

HTTP REST API

Communication with snekbox is done over an HTTP REST API. The framework for the HTTP REST API is Falcon and the WSGI server is Gunicorn. By default, the server is hosted on 0.0.0.0:8060 with two workers.

See snekapi.py and resources for API documentation.

Running snekbox

A Docker image is available in the GitHub Container Registry. A container can be started with the following command, which will also pull the image if it doesn't currently exist locally:

docker run --ipc=none --privileged -p 8060:8060 ghcr.io/python-discord/snekbox

To run it in the background, use the -d option. See the documentation on docker run for more information.

The above command will make the API accessible on the host via http://localhost:8060/. Currently, there's only one endpoint: http://localhost:8060/eval.

Python multi-version support

By default, the executable that runs within NsJail is defined by DEFAULT_EXECUTABLE_PATH at the top of nsjail.py. This can be overridden by specifying executable_path in the request body of calls to POST /eval, or by setting the executable_path kwarg when calling NsJail.python3() directly.

Any executable that exists within the container is a valid value for executable_path. The main use case of this feature is currently to specify the version of Python to use.
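
As a rough illustration of the per-request override, the sketch below builds (but does not send) a POST /eval request using only the standard library. The executable_path field comes from the text above. The "input" field name and the build_eval_request helper are assumptions made for illustration, so check snekapi.py for the actual request schema.

```python
import json
import urllib.request


def build_eval_request(code, base_url="http://localhost:8060", executable_path=None):
    """Build a POST /eval request without sending it."""
    # "input" is an assumed field name; "executable_path" is the documented override.
    payload = {"input": code}
    if executable_path is not None:
        payload["executable_path"] = executable_path
    return urllib.request.Request(
        f"{base_url}/eval",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Interpreter path as used in the pip command elsewhere in this README.
    request = build_eval_request(
        "print('hello')",
        executable_path="/snekbin/python/default/bin/python",
    )
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the result to urllib.request.urlopen() would send it, which requires a running snekbox container.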

Python versions currently available can be found in the Dockerfile by looking for build stages that match builder-py-*. These binaries are then copied into the base build stage further down.
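
Looking up the available versions can be scripted. The following is a minimal sketch that scans Dockerfile text for stage names matching builder-py-*. The list_python_build_stages helper and the sample Dockerfile text are hypothetical and do not reflect the real file's contents.

```python
import re

# Matches lines such as "FROM <image> AS builder-py-<suffix>".
STAGE_PATTERN = re.compile(
    r"^\s*FROM\s+\S+\s+AS\s+(builder-py-\S+)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def list_python_build_stages(dockerfile_text):
    """Return the names of all build stages that match builder-py-*."""
    return STAGE_PATTERN.findall(dockerfile_text)


# Made-up Dockerfile text, for illustration only.
SAMPLE_DOCKERFILE = """\
FROM example-image AS builder-nsjail
FROM example-image AS builder-py-3_12
RUN echo "build steps go here"
FROM example-image AS builder-py-3_13
FROM example-image AS base
"""

if __name__ == "__main__":
    print(list_python_build_stages(SAMPLE_DOCKERFILE))
```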

Configuration

Configuration files can be edited directly. However, this requires rebuilding the image. Alternatively, a Docker volume or bind mounts can be used to override the configuration files at their default locations.

NsJail

The main features of the default configuration are:

NsJail is configured through snekbox.cfg. It contains the exact values for the items listed above. The configuration format is defined by a protobuf file which can be referred to for documentation. The command-line options of NsJail can also serve as documentation since they closely follow the config file format.

Memory File System

On each execution, the host mounts an instance-specific tmpfs drive, which is used as a limited read-write folder for the sandboxed code. There is no access to other files or directories on the host container beyond the other read-only mounted system folders. Instance file systems are isolated; it is not possible for sandboxed code to access another instance's writeable directory.

The following options for the memory file system are configurable as options in gunicorn.conf.py

The sandboxed code execution will start with a writeable working directory of home. By default, the output folder is also home. New files, and uploaded files with a newer last modified time, will be uploaded on completion.
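
The "new or newer" rule can be pictured as a comparison of two snapshots of the output folder. This is only a sketch of the idea, not snekbox's actual implementation, and the helper names snapshot_mtimes and files_to_return are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def snapshot_mtimes(directory):
    """Map each file's relative path to its last-modified time in nanoseconds."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): path.stat().st_mtime_ns
        for path in root.rglob("*")
        if path.is_file()
    }


def files_to_return(before, after):
    """Select files that are new or have a newer last-modified time."""
    return sorted(
        name
        for name, mtime in after.items()
        if name not in before or mtime > before[name]
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        root = Path(directory)
        (root / "uploaded.txt").write_text("sent by the client")
        # Pin the modification time so the later change is unambiguous.
        os.utime(root / "uploaded.txt", ns=(1_000_000_000, 1_000_000_000))
        before = snapshot_mtimes(root)

        # Simulate what sandboxed code might do during execution.
        (root / "result.txt").write_text("created by the code")
        os.utime(root / "uploaded.txt", ns=(2_000_000_000, 2_000_000_000))

        print(files_to_return(before, snapshot_mtimes(root)))
```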

Gunicorn

Gunicorn settings can be found in gunicorn.conf.py. In the default configuration, the worker count, the bind address, and the WSGI app URI are likely the only things of any interest. Since it uses the default synchronous workers, the worker count effectively determines how many concurrent code evaluations can be performed.
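
For orientation, the three settings named above could look roughly like this in a Gunicorn configuration file. The values mirror the defaults described in this README (two workers, 0.0.0.0:8060). The exact wsgi_app string without arguments is an assumption, and this is not a copy of the shipped gunicorn.conf.py.

```python
# Illustrative sketch of a gunicorn.conf.py, not the file shipped with snekbox.

# Number of synchronous workers, i.e. how many evaluations can run concurrently.
workers = 2

# Address and port the HTTP server listens on.
bind = "0.0.0.0:8060"

# WSGI app URI; assumed here to be the SnekAPI object with no arguments.
wsgi_app = "snekbox:SnekAPI()"
```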

wsgi_app can be given arguments which are forwarded to the NsJail object. For example, wsgi_app = "snekbox:SnekAPI(max_output_size=2_000_000, read_chunk_size=20_000)".
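
To give an intuition for how an output limit and a chunk size might interact, here is a minimal sketch of chunked reading with a size cap. It is not snekbox's actual code: the read_limited function and its default chunk size are assumptions, and it counts characters rather than bytes.

```python
import io


def read_limited(stream, max_output_size=1_000_000, read_chunk_size=10_000):
    """Read text from stream in chunks, stopping once the size limit is reached."""
    chunks = []
    total = 0
    while total < max_output_size:
        chunk = stream.read(read_chunk_size)
        if not chunk:  # End of stream.
            break
        chunks.append(chunk)
        total += len(chunk)
    # The result can overshoot the limit by less than one chunk.
    return "".join(chunks)


if __name__ == "__main__":
    output = read_limited(io.StringIO("x" * 25), max_output_size=10, read_chunk_size=4)
    print(len(output))
```

Because the limit is only checked between chunks, the result lands near the limit rather than exactly on it, which is one way to read "around 1 MB".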

Environment Variables

All environment variables have defaults and are therefore not required to be set.

| Name | Description |
| --- | --- |
| SNEKBOX_DEBUG | Enable debug logging if set to a non-empty value. |
| SNEKBOX_SENTRY_DSN | Data Source Name for Sentry. Sentry is disabled if left unset. |
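
The semantics described above ("non-empty value" and "disabled if left unset") can be sketched as follows. The load_settings helper and its return shape are hypothetical; only the two variable names come from this README.

```python
import os


def load_settings(environ=None):
    """Interpret the two snekbox environment variables."""
    environ = os.environ if environ is None else environ
    return {
        # Any non-empty value enables debug logging; unset or empty disables it.
        "debug": bool(environ.get("SNEKBOX_DEBUG")),
        # None means the variable is unset, so Sentry stays disabled.
        "sentry_dsn": environ.get("SNEKBOX_SENTRY_DSN"),
    }


if __name__ == "__main__":
    print(load_settings({"SNEKBOX_DEBUG": "1"}))
```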

Third-party Packages

By default, the Python interpreter has no access to any packages besides the standard library. Even snekbox's own dependencies like Falcon and Gunicorn are not exposed.

To expose third-party Python packages during evaluation, install them to a custom user site:

docker exec snekbox /bin/sh -c \
    'PYTHONUSERBASE=/snekbox/user_base /snekbin/python/default/bin/python -m pip install --user numpy'

In the above command, snekbox is the name of the running container. The name may be different and can be checked with docker ps.

The packages will be installed to the user site within /snekbox/user_base. To persist the installed packages, a volume for the directory can be created with Docker. For an example, see docker-compose.yml.

Development Environment

See CONTRIBUTING.md.