Goal
We (especially @jameswhite and I) need an efficient, dependency-light, single-purpose tool to facilitate "branch deploys" and "branch diffs" as we continue to develop "on-system" software.
This differs from the various ways and tools to deploy (micro-)services and web apps to cloud platforms, kubernetes, etc., for which there are many CI/CD tools and processes.
"On-system?"
By "on-system" software I mean software that manages operating-system-level files on a host machine: for instance, files in /etc, or the configurations of things like LDAP servers, Nagios monitoring, and TFTP/PXE.
"Host?"
By "host" we mean a computer, which could be bare metal, a virtual machine (for instance in a VMware instantiation), a container, or a remote cloud-hosted instance. We expect the host to have a running OS and kernel, most often Linux, but its configuration may be minimal.
"source host?", "target host?"
In the timeline below we use these terms to describe how this deployment tool operates. The "source host" is the host that initiates a deployment (a CI instance or some other host); it triggers a branch deployment and/or branch diff onto a "target host". The "target host" is the host where our changes are being deployed (or proposed).
"bd?"
The working name for the command-line tool which is invoked to do branch deployments is bd. This could change depending on whims.
"branch deploys?", "branch diffs?"
[Old man voice:] Back when we worked at GitHub, we deployed software (web applications, supporting tooling, host configurations, eventually network configurations, etc.) using a specific process:
Develop changes via Pull Request on a new git branch
As changes are pushed to GitHub, CI runs to validate the changes
When changes need testing, "branch deploy" them (deploy that git branch of code either to a "branch lab" staging environment or, eventually, to the production host(s) in question)
If this deployment proves workable (via various testing and telemetry means) then the Pull Request is merged, and the deployment becomes the new production "mainline" deployment
This is often colloquially referred to as the "GitHub Flow", especially if these operations are undertaken via ChatOps.
Over time, for changes which were less "web application software" and more likely to be systems level software (OS configurations, authorization configurations, network changes), the "branch deploy" phase was often preceded by:
Report on the proposed changes, on real hosts, for this branch. This would take the form of a diff, hence "branch diff".
This process, taken together with the technique of separating the deployment and setup of a systems-level tool (LDAP, puppet, nagios, vault, or systems such as Entitlements) from the data contents of that tool -- each of which can have its own branch deployment lifecycle -- is sometimes colloquially called the "KP Flow", after Kevin Paulisse, who refined and spread this pattern during his time at GitHub.
Development constraints
The source host needs to support the language environment (ruby in this case) for the bd command-line tool.
The target host for a deployment needs a modern-ish shell (bash, zsh, dash, etc.) to run the deployment commands, git to clone a repository, and rsync to do tree diffing and deployment. It does not need to support the (ruby) language environment.
We can use whatever tools and libraries we like to test and develop, but the software that actually runs on the source host to trigger deployments should use the ruby standard library and nothing else.
Testing should be full integration testing if at all possible. That is, no unit testing; definitely no mocking/stubbing tests. We should be able to use containerization to fully sandbox real deployments during testing.
It is not uncommon for configurations for multiple hosts to be managed from a single repository, so we should support specifying a path into the repository to act as the base path for the files deployed to a specific host.
One of the prime targets for system management is the /etc path, where typically only specific subsets of files are managed while other files are left alone. This points to being able to specify inclusion paths for managed files, probably with a configuration file available in the repository.
I did something like this before, over a decade ago, but experience, platforms, tools, taste, etc., have all evolved. One of the same constraints applies: deployments should be absurdly fast. This probably means that we build a single shell command to handle the entire deployment and send it in one shot.
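As a rough illustration of the one-shot idea, here is a minimal sketch that assembles a single POSIX shell command string using only the ruby standard library (shellwords). Every name in it (build_deploy_command, its keyword arguments, the example repository URL and paths) is hypothetical and not taken from bd. It also assumes that each managed path is a directory and that mktemp, git, and rsync are available on the target host.

```ruby
require "shellwords"

# Hypothetical sketch, not bd's actual implementation: assemble one POSIX
# shell command that clones a branch into a scratch directory and rsyncs
# each managed directory onto the target path.
def build_deploy_command(repo_url:, branch:, base_path:, target_path:, managed_dirs:, dry_run: false)
  rsync = %w[rsync --archive --delete --itemize-changes]
  rsync << "--dry-run" if dry_run # a "branch diff" reports changes without applying them

  steps = [
    "set -e",                         # stop at the first failing step
    "work=$(mktemp -d)",              # scratch checkout on the target host
    %q(trap 'rm -rf "$work"' EXIT),   # always clean the checkout up
    "git clone --quiet --depth 1 --branch #{branch.shellescape} #{repo_url.shellescape} \"$work\""
  ]

  managed_dirs.each do |dir|
    # Trailing slashes make rsync sync directory contents, not the directory itself.
    source = %("$work"/#{File.join(base_path, dir).shellescape}/)
    dest   = File.join(target_path, dir).shellescape
    steps << "mkdir -p #{dest}" unless dry_run # a diff should not create anything
    steps << "#{rsync.join(' ')} #{source} #{dest}/"
  end

  steps.join("; ")
end

# Example values are made up for illustration.
command = build_deploy_command(
  repo_url:     "git@example.com:ops/configs.git",
  branch:       "fix/ldap acl",
  base_path:    "hosts/web01",
  target_path:  "/",
  managed_dirs: ["etc/nagios"]
)
puts command
```

The resulting string can be handed to ssh as a single argument (for example system("ssh", host, command)), so the whole deployment costs one round trip. Running rsync with --delete once per managed directory removes stale files only inside those directories, which leaves unmanaged files on the target untouched.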
Timeline
[x] get basic development tooling in place
[x] write a README for the CLI bd tool usage
[x] BDD test-drive the options processing for the CLI tool
[x] get container-sandboxed local CI working
[x] get container-sandboxed GitHub Actions CI working
[x] make bare git repositories to use as fixtures in deployment testing
  [x] make a place for the bare git repos (spec/repo-fixtures?)
  [x] make a place for scripts that can build those repositories (spec/repo-builders)
  [x] make a script that can read all repo-builder scripts
    [x] foreach name:
      [x] create a git repo
      [x] run the builder script against that git repo
      [x] make sure there is an empty spec/repo-fixtures/ path
      [x] clone --bare into that repo-fixtures path
[ ] test drive deployment
  [ ] use bd to deploy a known repo configuration to the target
  [ ] use bd to deploy a branch to the same target
  [ ] ... verify the contents of the target path
    [ ] every managed file from the repo should be present
    [ ] every non-managed file should still be present
    [ ] every managed file not in the repo should be absent
  [ ] do similar for diff deployments
    [ ] verify diff as a list of files added, changed, or removed
    [ ] verify diff as line-level file changes
  [ ] tests for configuration for managed files
  [ ] tests for ssh options
  [ ] tests for local deployments
  [ ] tests for varied in-repo paths
  [ ] tests for varied target paths
  [ ] tests for varied config file paths
[ ] add support for running hook scripts before and after actually deploying files
  [ ] pre-deploy hook script could run from repo
  [ ] pre-deploy hook script could run from an absolute path
  [ ] post-deploy script runs from an absolute path
[ ] update main README with this information, and usage information
[ ] create a rubygem for this and publish it, as the primary way this is used
[ ] review the libs used by bin/bd and cut any requires that would pollute the gem dependency
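The three verification rules in the timeline (managed files from the repo are present, non-managed files survive, managed files missing from the repo are absent) and the file-level diff can be stated precisely in a few lines of ruby. This is only a sketch of the expected semantics, useful as an oracle in integration tests; the names (managed?, expected_tree, file_level_diff) and the model of a tree as a Hash of relative path to contents are assumptions, not part of bd.

```ruby
# Hypothetical model: a "tree" is a Hash of relative path => file contents.
# A path counts as managed when it sits under one of the managed directories.
def managed?(path, managed_dirs)
  managed_dirs.any? { |dir| path == dir || path.start_with?("#{dir}/") }
end

# The tree we expect on the target after deploying `repo` over `target`:
# unmanaged target files are kept, managed files come only from the repo.
def expected_tree(repo:, target:, managed_dirs:)
  kept     = target.reject { |path, _| managed?(path, managed_dirs) }
  deployed = repo.select   { |path, _| managed?(path, managed_dirs) }
  kept.merge(deployed)
end

# A file-level diff between two trees: paths added, changed, or removed.
def file_level_diff(before:, after:)
  {
    added:   (after.keys - before.keys).sort,
    changed: (before.keys & after.keys).reject { |path| before[path] == after[path] }.sort,
    removed: (before.keys - after.keys).sort
  }
end

# Example data is made up for illustration.
target = {
  "etc/hostname"          => "web01\n",
  "etc/nagios/nagios.cfg" => "old\n",
  "etc/nagios/stale.cfg"  => "stale\n"
}
repo = {
  "etc/nagios/nagios.cfg" => "new\n",
  "etc/nagios/hosts.cfg"  => "hosts\n",
  "README"                => "docs\n"
}

after = expected_tree(repo: repo, target: target, managed_dirs: ["etc/nagios"])
p file_level_diff(before: target, after: after)
```

In this example etc/hostname is unmanaged and survives, etc/nagios/stale.cfg is managed but absent from the repo and is removed, and README is in the repo but outside the managed paths, so it is never deployed.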