Chainlink

chainlink is a Python module for running Docker containers in sequence.

Installation

This module is not currently on PyPI. However, you can still install it via pip with

pip install git+https://github.com/illinois-cs241/chainlink

Usage

The class Chainlink is the only object exported by this module.

Constructor

__init__(self, stages, workdir="/tmp")

The Chainlink constructor takes a list of stages to chain and a workdir into which a temporary directory will be rooted. An example initialization with all available options is annotated below:

# a single-stage specification
stages = [{
  # container entrypoint (optional, defaults to image entrypoint)
  "entrypoint": ["ip", "link", "set", "lo", "up"],
  # container hostname (optional, defaults to 'container')
  "hostname": "somehost",
  # image to run (required, may be local or available on Docker Hub)
  "image": "alpine:3.5",
  # memory cap (optional, defaults to 2GB)
  "memory": "2g",
  # set of cpus to use, e.g. 0-3 for the first four or 0,2 for the first and third (optional, defaults to all)
  "cpuset_cpus": "0-7",
  # whether to allow networking capabilities (optional, defaults to True)
  "networking": True,
  # whether to switch on privileged mode (optional, defaults to False)
  "privileged": True,
  # the number of seconds until the container is killed (optional, defaults to 30)
  "timeout": 30,
  # whether to save the logs from this stage
  "logs": False,
  # container environment additions/overrides (optional, defaults to none)
  "env": {
    "VAR1": "1"
  }
}]
# use home directory as tempdir root
workdir = "/home/user/"

from chainlink import Chainlink
chain = Chainlink(stages, workdir=workdir)

Note that all images needed to run the specified stages are pulled in parallel during construction.
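Because images are pulled during construction, it can be worth checking a stage specification before handing it to Chainlink. The following is a minimal sketch of such a check; validate_stages is a hypothetical helper, not part of chainlink, and it only covers the two rules stated above (image is required, timeout is a number of seconds).

```python
def validate_stages(stages):
    """Return a list of problems found in a stage specification.

    Hypothetical helper for illustration; chainlink does not export it.
    """
    problems = []
    for index, stage in enumerate(stages):
        # "image" is the only option documented as required.
        if not stage.get("image"):
            problems.append(f"stage {index}: missing required 'image'")
        # "timeout" is documented as a number of seconds (default 30).
        timeout = stage.get("timeout", 30)
        if not isinstance(timeout, (int, float)) or timeout <= 0:
            problems.append(f"stage {index}: 'timeout' must be a positive number")
    return problems


if __name__ == "__main__":
    candidate = [{"image": "alpine:3.5", "timeout": 30}, {"timeout": 0}]
    for problem in validate_stages(candidate):
        print(problem)
```

An empty return value means the specification passed these checks; it does not guarantee that the image exists or can be pulled.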

Run

def run(self, environ={})
async def run_async(self, environ={})

The Chainlink run function takes a base environment (environ) and runs, in sequence, each container specified by the stages passed at construction. If a stage fails, no subsequent stages are run.

Unless it makes sense to have a base environment for all containers, environ can usually be left empty and specified in the env option of each stage instead.
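The relationship between environ and a stage's env option can be pictured as a dictionary merge in which the stage's entries win. This is only an illustration of the "additions/overrides" wording above; effective_env is a hypothetical name, and the sketch makes no claim about how chainlink implements the merge internally.

```python
def effective_env(environ, stage):
    """Combine a base environment with one stage's "env" option.

    Hypothetical illustration: per-stage values add to or override the base.
    """
    merged = dict(environ)              # start from the base environment
    merged.update(stage.get("env", {}))  # stage entries add or override
    return merged


if __name__ == "__main__":
    base = {"VAR1": "0", "SHARED": "yes"}
    stage = {"image": "alpine:3.5", "env": {"VAR1": "1"}}
    print(effective_env(base, stage))
```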

The run function returns a list of objects, an example of which is annotated below:

[{
  # the data returned by inspecting the State of the stage (container)
  # immediately before it was removed (see Docker SDK for details)
  "data": { ... },
  # whether or not the stage was killed due to a timeout
  "killed": False,
  # the stdout and stderr (with timestamps) for the stage
  "logs": {
    "stdout": b"bytestring",
    "stderr": b"bytestring"
  }
}]

Note that the returned list has the same number of elements as there are stages, with each element corresponding to the stage at the same index.
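A short sketch of how the returned list might be inspected is given here. summarize is a hypothetical helper; it assumes that "data" carries the ExitCode field that Docker reports in a container's State, and it tolerates a missing "logs" entry since the source does not say what is returned when logs are not saved.

```python
def summarize(results):
    """Build one human-readable line per stage result (hypothetical helper)."""
    lines = []
    for index, result in enumerate(results):
        if result["killed"]:
            status = "killed after timeout"
        else:
            # Assumption: "data" is the Docker State object, which has ExitCode.
            status = f"exit code {result['data'].get('ExitCode')}"
        logs = result.get("logs") or {}
        stderr = logs.get("stderr", b"").decode("utf-8", errors="replace").strip()
        lines.append(f"stage {index}: {status}; stderr: {stderr!r}")
    return lines


if __name__ == "__main__":
    # Sample data shaped like the annotated example, not real output.
    sample = [
        {"data": {"ExitCode": 0}, "killed": False,
         "logs": {"stdout": b"ok\n", "stderr": b""}},
        {"data": {"ExitCode": 137}, "killed": True,
         "logs": {"stdout": b"", "stderr": b"too slow\n"}},
    ]
    print("\n".join(summarize(sample)))
```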

run_async is an async version of run.
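One way run_async could be driven from asyncio code is sketched here. FakeChain is a stand-in with the same run_async signature so that the example runs without Docker, and run_all is a hypothetical helper. Whether separate Chainlink instances can safely be awaited concurrently is an assumption that the source does not confirm.

```python
import asyncio


class FakeChain:
    """Stand-in for a Chainlink instance; only mimics the run_async signature."""

    def __init__(self, name):
        self.name = name

    async def run_async(self, environ={}):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return [{
            "data": {"ExitCode": 0},
            "killed": False,
            "logs": {"stdout": self.name.encode(), "stderr": b""},
        }]


async def run_all(chains, environ=None):
    """Await several chains at once; results keep the order of `chains`."""
    return await asyncio.gather(
        *(chain.run_async(environ or {}) for chain in chains)
    )


if __name__ == "__main__":
    all_results = asyncio.run(run_all([FakeChain("a"), FakeChain("b")]))
    print(len(all_results))
```

With the real module, FakeChain("a") would be replaced by Chainlink(stages, workdir=workdir).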

Cross-Stage Communication

A single directory is mounted at /job in each container before it is run, and contents in this /job directory are persisted across stages.

This helps facilitate cross-stage communication, which becomes particularly useful if certain stages need to pass along results.
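As a sketch of this pattern, the two-stage specification below has the first stage write a file under /job and the second stage read it back. The file name result.txt and the shell commands are illustrative choices, not something chainlink prescribes.

```python
# Stage 0 writes a result into the shared /job directory;
# stage 1 reads it back and has its logs saved.
stages = [
    {
        "image": "alpine:3.5",
        "entrypoint": ["sh", "-c", "echo 42 > /job/result.txt"],
    },
    {
        "image": "alpine:3.5",
        "entrypoint": ["cat", "/job/result.txt"],
        "logs": True,
    },
]

# Running it requires Docker and the chainlink module:
# from chainlink import Chainlink
# results = Chainlink(stages).run()
```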

Private Registry Support

If you choose to set up a private registry, you should secure it. Once it is secured, set up your credentials in the Docker daemon using

docker login [registry.example.com]

where registry.example.com is the address of your registry. Note that you may want to configure how the daemon stores your credentials for security reasons.

After this, an image such as registry.example.com/alpine:latest can be pulled and used by chainlink.

Troubleshooting

Testing

To run integration tests, run:

sudo python3 -m unittest tests/integration/*.py

Note that you should execute this command from the root of the project so that imports resolve correctly.