oxidecomputer / buildomat

a software build labour-saving device
Mozilla Public License 2.0

B U I L D O M A T


Buildomat manages the provisioning of ephemeral UNIX systems (e.g., instances in AWS EC2) on which to run software builds. It logs job output, collects build artefacts, and reports status. The system integrates with GitHub through the Checks API, to allow build jobs to be triggered by pushes and pull requests.

Components

Buildomat is made up of a variety of crates, loosely grouped into areas of related functionality:

$ cargo xtask crates
buildomat                    /bin
buildomat-agent              /agent
buildomat-bunyan             /bunyan
buildomat-client             /client
buildomat-common             /common
buildomat-database           /database
buildomat-server             /server
buildomat-types              /types

buildomat-factory-aws        /factory/aws
buildomat-factory-lab        /factory/lab

buildomat-github-common      /github/common
buildomat-github-database    /github/database
buildomat-github-dbtool      /github/dbtool
buildomat-github-ghtool      /github/ghtool
buildomat-github-server      /github/server

xtask                        /xtask

Buildomat Core

The buildomat core is responsible for authenticating users and remote services, for managing build systems, and for running jobs and collecting output.

Server (buildomat-server, in server/)

The core buildomat API server. Coordinates the creation, tracking, and destruction of workers in which to execute jobs. This component sits at the centre of the system and is used by the GitHub integration server, the client command, the agent running within each worker for control of the job, and any factories.

Client Command (buildomat, in bin/)

A client tool that uses the client library to interface with and manipulate the core server. The tool has both administrative and user-level functions, expressed in a relatively regular hierarchy of commands; e.g., buildomat job run or buildomat user ls.

$ ./target/release/buildomat
Usage: buildomat [OPTS] COMMAND [ARGS...]

Commands:
    info                get information about server and user account
    control             server control functions
    job                 job management
    user                user management

Options:
        --help          usage information
    -p, --profile PROFILE
                        authentication and server profile

ERROR: choose a command

Client Library (buildomat-client, in client/)

An HTTP client library for accessing the core buildomat server. This client is generated at build time by progenitor, an OpenAPI client generator.

The client is generated from an OpenAPI document that is checked in to the repository. That document is in turn generated by Dropshot, based on the implementation of the server. If you make changes to the API exposed by the core server, you will need to regenerate the document, client/openapi.json, using:

$ cargo xtask openapi

Agent (buildomat-agent, in agent/)

A process that is injected into an ephemeral AWS EC2 instance to allow the buildomat core server to take control and run jobs. This process receives single-use credentials at provisioning time from the core server, and connects out to receive instructions. Because the agent only makes outbound connections, it does not require a public IP or any direct inbound connectivity, which allows agents to run inside remote NAT environments.

Factories

Buildomat jobs are specified to execute within a particular target environment. Concrete instances of those target environments (known as workers) are created, managed, and destroyed by factories. Factories are long-lived server processes that connect to the core API and offer to construct workers as needed. When a worker has finished executing the job, or when requested by an operator, the factory is also responsible for freeing any resources that were in use by the worker.

AWS Factory (buildomat-factory-aws, in factory/aws/)

The AWS factory creates ephemeral AWS instances that are used to run one job and are then destroyed. The factory arranges for the agent to be installed and start automatically in each instance that is created. The factory is responsible for ensuring no stale resources are left behind, and for enforcing a cap on the concurrent use of resources at AWS. Each target provided by an AWS factory can support a different instance type (i.e., CPU and RAM capacity), a different image (AMI), and a different root disk size.

Lab Factory (buildomat-factory-lab, in factory/lab/)

The lab factory uses IPMI to exert control over a set of physical lab systems. When a worker is required, a lab system is booted from a ramdisk and the agent is started, just as it would be for an AWS instance. From that point on, operation is quite similar to AWS instances: the agent communicates directly with the core API. When tearing down a lab worker, the machine is rebooted (again via IPMI) to clear out the prior ramdisk state. Each target provided by a lab factory can boot from a different ramdisk image stored on a local server.

GitHub Integration (formerly known as Wollongong)

The GitHub-specific portion of the buildomat suite sits in front of the core buildomat service. It is responsible for receiving and processing notifications of new commits and pull requests on GitHub, starting any configured build jobs, and reporting the results so that they are visible through the GitHub user interface.

Server (buildomat-github-server, in github/server/)

This server acts as a GitHub App. It is responsible for processing incoming GitHub webhooks that notify the system about commits and pull requests in authorised repositories. In addition to relaying jobs between GitHub and the buildomat core, this service provides an additional HTML presentation of job state (e.g., detailed logs) and access to any artefacts that jobs produce. This server keeps state required to manage the interaction with GitHub, but does not store job data; requests for logs or artefacts are proxied back to the core server.

Database Tool (buildomat-github-dbtool, in github/dbtool/)

This tool can be used to inspect the database state kept by the GitHub integration as it tracks GitHub pull requests and commits. Unlike the core client tool, this program directly interacts with a local SQLite database.

$ buildomat-github-dbtool
Usage: buildomat-github-dbtool COMMAND [ARGS...]

Commands:
    delivery (del)      webhook deliveries
    repository (repo)   GitHub repositories
    check               GitHub checks

Options:
    --help              usage information

ERROR: choose a command

Of particular note, the tool is useful for inspecting and replaying received webhook events; e.g.,

$ buildomat-github-dbtool del ls
SEQ   ACK RECVTIME             EVENT          ACTION
0     1   2021-10-05T01:58:32Z ping           -
1     1   2021-10-05T02:25:33Z installation   created
2     1   2021-10-05T02:26:53Z push
3     1   2021-10-05T02:26:53Z check_suite    requested
4     1   2021-10-05T02:26:56Z check_suite    completed
5     1   2021-10-05T02:26:56Z check_run      completed
6     1   2021-10-05T02:26:56Z check_run      created
7     1   2021-10-05T02:26:56Z check_run      created
8     1   2021-10-05T02:26:57Z check_run      created
...

The buildomat-github-dbtool del unack SEQ command can be used to trigger the reprocessing of an individual webhook message.

Per-repository Configuration

Buildomat works as a GitHub App, which is generally "installed" at the level of an Organisation. Installing the App allows buildomat to receive notifications about events, such as git pushes and pull requests, from all repositories (public and private) within the organisation. In order to avoid accidents, buildomat requires that the service be explicitly configured for a repository before it will take any actions.

Per-repository configuration is achieved by creating a file in the default branch (e.g., main) of the repository in question, named .github/buildomat/config.toml. This file is written in TOML, with a handful of simple values. Supported properties in this file include:

Note that buildomat will only ever read this configuration file from the most recent commit in the default branch of the repository, not from the contents of another branch or pull request. This is of particular importance for security-sensitive properties like org_only, where the policy set by users with full write access to the repository must not be overridden by changes from potentially untrusted users. If a pull request with a malicious policy change is merged, it will then be in the default branch and active for subsequent pull requests; maintainers must carefully review pull requests that change this file.

Specifying Jobs

Once you have configured buildomat at the repository level, you can specify some number of jobs to execute automatically in response to pushes and pull requests. While per-repository configuration is read from the default branch, jobs are read from the commit under test.

Jobs are specified as bash programs with some configuration directives embedded in comments. These job files must be named .github/buildomat/jobs/*.sh. Unexpected additional files in .github/buildomat/jobs will result in an error.
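
Because an unexpected file in the jobs directory results in an error, it can be handy to check the directory locally before pushing. The following is a minimal sketch of such a check; check_jobs_dir is a hypothetical helper and not part of buildomat. It assumes that "unexpected" means anything that is not a regular file ending in .sh, which may not match the exact rules buildomat applies.

```shell
# Hypothetical pre-flight check, NOT part of buildomat.
# Assumption: any entry that is not a regular file named *.sh is "unexpected".
check_jobs_dir() {
    local dir="${1:-.github/buildomat/jobs}"
    local status=0
    local entry
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue    # the glob matched nothing
        case "$entry" in
            # A regular file ending in .sh is what we expect to find.
            *.sh) [ -f "$entry" ] && continue ;;
        esac
        echo "unexpected entry in jobs directory: $entry" >&2
        status=1
    done
    return "$status"
}
```

Run from the root of a clone, check_jobs_dir with no arguments inspects .github/buildomat/jobs and returns non-zero if it finds anything other than *.sh files.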

Job files should begin with an interpreter line, followed by TOML-formatted configuration prefixed with #: so that they will be identified as configuration by buildomat, but ignored by the shell. For example, a minimal job that would just execute uname -a:

#!/bin/bash
#:
#: name = "build"
#: variety = "basic"
#:
uname -a
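
To see what TOML a job file carries, the #: prefix can be stripped from the leading block of directive lines. This is only a sketch of the idea, not the parser buildomat itself uses; extract_frontmatter is a hypothetical helper, and stopping at the first line that is not a directive is an assumption.

```shell
# Hypothetical helper, NOT the parser used by buildomat.
# Prints the TOML frontmatter of a job file, read from the named file or
# from standard input if no file is given.
extract_frontmatter() {
    awk '
        # Skip the interpreter line at the top of the file.
        NR == 1 && /^#!/ { next }
        # Strip the "#:" prefix and one optional space, then print.
        /^#:/ { sub(/^#: ?/, ""); print; next }
        # Assumption: frontmatter ends at the first non-directive line.
        { exit }
    ' "$@"
}
```

For the minimal job above, this prints the name and variety assignments, surrounded by the blank lines that come from the bare #: separators.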

The minimum set of properties that must always appear in the TOML frontmatter is:

These properties are optional, but not variety-specific:

The rest of the configuration is variety-specific.

Variety: Basic

Each basic variety job (selected by specifying variety = "basic" in the frontmatter) takes a single bash program and runs it in an ephemeral environment. The composition of that environment, such as compute and memory capacity or the availability of specific toolchains and other software, depends on the target option.

Basic variety jobs can produce output files (see the configuration options output_rules and publish). They can also depend on the successful completion of other jobs, gaining access to any output files from the upstream job (see the dependencies option). Jobs are generally executed in parallel, unless they are waiting for a dependency or for capacity to become available.

Execution Environment

By default, an ephemeral system (generally a virtual machine) will be provisioned for each job. The system will be discarded at the end of the job, so no detritus is left behind. Once the environment is provisioned, the bash program in the job file is executed as-is.

Jobs are executed as an unprivileged user, build, with home directory /home/build. If required, this user is able to escalate to root privileges through the use of pfexec(1). Systems that do not have a native pfexec will be furnished with a compatible wrapper around a native escalation facility, to ease the construction of cross-platform jobs.

By default, the working directory for the job is based on the name of the repository; e.g., for https://github.com/oxidecomputer/buildomat, the working directory would be /work/oxidecomputer/buildomat. The system will arrange for the repository to be cloned at that location with the commit under test checked out. A simple job could directly invoke some build tool like gmake or cargo build, and the build would occur at the root of the clone. The skip_clone configuration option can disable this behaviour.
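
As an illustration of the naming rule, the mapping from repository URL to default working directory can be sketched as a small shell function. workdir_for_repo is a hypothetical helper for illustration only; buildomat sets up the directory itself, and the handling of a trailing .git suffix is an assumption.

```shell
# Hypothetical helper, NOT part of buildomat: derive the default working
# directory from a GitHub repository URL.
workdir_for_repo() {
    # Drop the scheme and host, leaving "owner/repository".
    local path="${1#https://github.com/}"
    # Assumption: tolerate an optional ".git" suffix on the URL.
    path="${path%.git}"
    printf '/work/%s\n' "$path"
}
```

For https://github.com/oxidecomputer/buildomat this prints /work/oxidecomputer/buildomat, matching the example above.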

Most targets provide toolchains from common metapackages like build-essential; e.g., gmake and gcc. If a Rust toolchain is required, one can be requested through the rust_toolchain configuration option. This will be installed using rustup.

Environment Variables

While the complete set of environment variables is generally target-specific, the common minimum for all targets includes:

Available Commands

Cross-platform shell programming can be challenging due to differences between operating systems. To make this a little easier, we ensure that each buildomat target can provide a basic suite of tools that are helpful in constructing succinct jobs:

Configuration

Configuration properties supported for basic jobs include:

Licence

Unless otherwise noted, all components are licenced under the Mozilla Public License Version 2.0.