delphix / linux-pkg

Framework to build custom packages for the Delphix Appliance
Apache License 2.0

Linux Package Framework

This framework is used for building customized third-party packages and public Delphix packages for the Ubuntu-based Delphix Appliance. It also has the functionality to automatically sync third-party packages with the upstream projects.

Table of Contents

  1. System Requirements
  2. Getting Started
  3. Project Summary
  4. Scripts
  5. Environment Variables
  6. Package Definition
  7. Adding New Packages
  8. Testing your changes
  9. Package Lists
  10. Versions and Branches
  11. Contributing
  12. Statement of Support
  13. License

System Requirements

This framework is intended to be run on an Ubuntu 20.04 system with some basic developer packages installed, such as git, and passwordless sudo enabled. Note that it will automatically install various build dependencies on the system, so as a safety precaution it is currently restricted to run only on an AWS instance, to prevent developers from accidentally running it on their personal machines. To bypass the safety check, you can run the following command before running any script:

export DISABLE_SYSTEM_CHECK=true

Getting Started

This quick tutorial shows how to build the packages managed by this framework.

Step 1. Create build VM

You need a system that meets the requirements above. For Delphix developers, you should clone the dlpx-internal-buildserver-develop group on DCoA.

Step 2. Clone this repository

Clone this repository on the build VM. You may need to supply a personal access token for authentication and authorization:

git clone https://<token>@github.com/delphix/linux-pkg.git

Step 3. Build a package

We can now build an arbitrary package. Any package in the packages directory would do. Let's pick cloud-init as an example:

./buildpkg.sh cloud-init

Build artifacts will be stored in directory packages/cloud-init/tmp/artifacts/.
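Once the build finishes, the resulting packages can be inspected in the artifacts directory. The helper below is a minimal sketch and is not part of linux-pkg: the function name list_debs is hypothetical, and it assumes the artifacts of interest are .deb files located directly in packages/<package>/tmp/artifacts/.

```shell
# Hypothetical helper (not provided by linux-pkg): print the .deb files
# produced for a package, one per line, sorted by name.
# $1: package name, $2: path to the linux-pkg checkout (defaults to ".")
list_debs() {
    local pkg="$1" root="${2:-.}"
    find "$root/packages/$pkg/tmp/artifacts" -maxdepth 1 -name '*.deb' | sort
}
```

For example, list_debs cloud-init run from the top of the linux-pkg checkout would print the packages built in Step 3.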

Project Summary

There are two main tasks that are performed by this framework: building packages and keeping each package up-to-date with its upstream project by updating the appropriate git branches.

Building packages

This task is relatively straightforward. What linux-pkg calls a "package" is really a project (usually a git project) that has a build recipe and that produces one or more debian packages and some other metadata files.

See Scripts > buildpkg.sh below.

Updating third-party packages

The idea behind this task is to reduce the amount of effort required to maintain third-party packages and keep them up-to-date. Note that this task does not apply to packages created and maintained by Delphix, but only to third-party packages that Delphix modifies. Instead of following the more conventional approach of using tarballs and patches, with all its drawbacks, we've decided to leverage the advantages offered by revision control. As such, we've adopted a well-defined branching model for each third-party package.

First of all, we have a Delphix repository on github for each third-party package that we build. Each repository has at least 2 branches: develop and upstreams/develop. The develop branch of the package is the one we build, and contains Delphix changes. The upstreams/develop branch is used to track the upstream version of the package. For packages that are not provided by Ubuntu but are available on git, the upstreams/develop branch usually just tracks the develop branch of the project. For packages that are provided by Ubuntu, the upstreams/develop branch instead tracks the source package that is maintained by Ubuntu (i.e. the branch contains the files obtained from apt-get source <source-package>). This offers the advantage of using a version of the package tuned to work with our Ubuntu distribution.

When updating a package, we first check if the upstreams/develop branch is up-to-date, by fetching the latest version of the upstream git repository or the Ubuntu source package. If changes are detected, we update the upstreams/develop branch and push the changes to GitHub.

The second step is to check if the develop branch is up-to-date with upstreams/develop. If it is already up-to-date, then we are done. If not, then we attempt merging upstreams/develop into develop.

If the merge is successful, then we push the changes to a staging branch on GitHub, called projects/auto-update/develop/merging. The intent is for a different system to fetch those changes, build them, and then launch tests.

See Scripts > sync-with-upstream.sh below.

Once the merge has been tested, Scripts > push-merge.sh is called on the original VM to push the changes to the develop branch on GitHub.

Note that the example above targets the develop branch, but the same workflow could apply to other branches.
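The branch flow described above can be reproduced in a throwaway repository. This is only a sketch of the branching model using plain git commands, not what sync-with-upstream.sh actually runs: the file names, commit messages and identity are made up for the demo, and nothing is pushed anywhere.

```shell
set -e

# Work in a scratch repository so that no real package is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# upstreams/develop tracks the pristine upstream source.
git checkout -q -b upstreams/develop
echo "upstream v1" > upstream.txt
git add upstream.txt
git commit -q -m "Import upstream v1"

# develop carries the Delphix changes on top of upstream.
git checkout -q -b develop
echo "delphix change" > delphix.txt
git add delphix.txt
git commit -q -m "Delphix change"

# Step 1: upstream publishes a new version, so upstreams/develop moves forward.
git checkout -q upstreams/develop
echo "upstream v2" > upstream.txt
git commit -q -am "Import upstream v2"

# Step 2: if develop does not contain upstreams/develop yet, stage a merge.
git checkout -q develop
if git merge-base --is-ancestor upstreams/develop develop; then
    echo "develop is already up-to-date"
else
    git checkout -q -b projects/auto-update/develop/merging
    git merge -q --no-edit upstreams/develop
    echo "merge staged on projects/auto-update/develop/merging"
fi
```

In the real workflow the staging branch is pushed to GitHub and tested by another system before develop itself is updated.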

Scripts

A set of scripts was created in this repository to allow easily building and updating packages both manually and through automation (e.g. Jenkins).

query-packages.sh

This script can be called on most unix-based systems to query metadata on the packages built by linux-pkg. This script does not install anything on the system, so it can be run anywhere without any side effects.

setup.sh

Installs dependencies for the build framework. Needs to be run once to configure the system, before any other scripts (except query-packages.sh).

buildpkg.sh

Builds a single package. Package name must match a directory under packages/.

./buildpkg.sh <package>

The build will look at packages/<package>/config.sh for instructions on where to fetch the package from and how to build it. The build will be performed in packages/<package>/tmp/, and build artifacts for this package will be stored in the artifacts sub-directory.

Note that if the build of the package depends on build artifacts from another linux-pkg package, those will be fetched from a predetermined S3 location.

checkupdates.sh

Usage:

./checkupdates.sh <package>

This checks if a package has updates in the upstream project that haven't been pulled into the upstreams/develop branch, or if the upstreams/develop branch has commits that haven't been merged into the develop branch.

If updates are available, the file <WORKDIR>/update-available will be created.

The intention of this script is to inform the caller whether an update job should be called for the given package.
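A caller can act on the status file as sketched below. This is an illustration rather than code from the framework: the function name update_available is hypothetical, and the path assumes that WORKDIR is packages/<package>/tmp/, as described in the Package WORKDIR section.

```shell
# Hypothetical helper: succeeds if checkupdates.sh left the status file behind.
# $1: package name, $2: path to the linux-pkg checkout (defaults to ".")
update_available() {
    local pkg="$1" root="${2:-.}"
    [ -e "$root/packages/$pkg/tmp/update-available" ]
}

# Example caller (commented out so that sourcing this snippet has no side effects):
#   ./checkupdates.sh cloud-init
#   if update_available cloud-init; then
#       echo "cloud-init: an update job should be scheduled"
#   fi
```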

sync-with-upstream.sh

Usage:

./sync-with-upstream.sh <package>

This script has 2 tasks:

  1. Check if the upstream project has updates that are not pulled into the upstreams/develop branch of the package, and if so then update that branch and push changes to GitHub.
  2. Merge upstreams/develop into develop and push the changes to a staging branch on GitHub, called projects/auto-update/develop/merging. Another system should use that branch to build the package, and then run the appropriate integration tests.

After testing has been completed, push-merge.sh <package> should be called on the same system to push the merge to the develop branch.

Note that the DRYRUN environment variable must be set when running this script. If DRYRUN is set to "true", then changes are not pushed to GitHub in step 1, and staged changes are pushed to projects/auto-update/develop/merging-dryrun in step 2 instead of the non-dryrun branch. The intention is that when testing changes to the logic we want to be able to run most of the logic, but without affecting the production branches.
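The effect of DRYRUN on the staging branch can be pictured with the following sketch. It is not the script's actual logic: the function name staging_branch is hypothetical, and rejecting any value other than "true" or "false" is an assumption made for the example.

```shell
# Illustrative only: pick the staging branch the way the text describes.
# Fails if DRYRUN is unset or has an unexpected value (assumed behaviour).
staging_branch() {
    case "${DRYRUN:-}" in
        true)  echo "projects/auto-update/develop/merging-dryrun" ;;
        false) echo "projects/auto-update/develop/merging" ;;
        *)     echo "DRYRUN must be set to 'true' or 'false'" >&2; return 1 ;;
    esac
}
```

A dry run of the real script would then look like DRYRUN=true ./sync-with-upstream.sh <package>.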

push-merge.sh

Usage:

./push-merge.sh <package>

This must be called on a system that has previously called sync-with-upstream.sh for the same package. It will push the merge that was previously prepared by sync-with-upstream.sh to the production develop branch, after checking that the develop branch hasn't been modified since sync-with-upstream.sh was called.

As with sync-with-upstream.sh, the DRYRUN environment variable must be set to run this script. However, the script will fail unless DRYRUN is set to "false", given that there is not much that can be tested in dry-run mode.

Environment Variables

There's a set of environment variables that can be set to modify the operation of some of the scripts defined above.

Package Definition

For each package built by this framework, there must be a file named packages/<package>/config.sh. It defines some default variables and various hooks for building the package. When buildpkg.sh is invoked for building a package, it calls load_package_config(), which sources the appropriate config.sh file and then executes the various hooks defined for the package. The bash library lib/common.sh contains various functions that can be called from the hooks or the various scripts.

Package Variables

Here is a list of variables that can be defined for a package:

Package stages and hooks

When operations are performed on a package by build or auto-update scripts, such as buildpkg.sh or sync-with-upstream.sh, those operations are usually split into high-level tasks called "stages". Some of those stages can be modified or must be defined in a package's config file, so we refer to them here as "hooks". Hooks that have a default definition are stored in the default-package-config.sh file.

Other "stages" are not meant to be modified. Although they aren't functionally different from regular function calls, we want to give them more visibility in the build process because they are deemed important high-level tasks, so they are called via the stage() helper function.
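To illustrate the idea of giving a function call more visibility, here is a minimal stand-in for such a wrapper. It is not the stage() implementation from lib/common.sh; the names demo_stage and store_info_demo, and the banner format, are invented for this sketch.

```shell
# Illustrative stand-in for a stage wrapper; NOT the lib/common.sh implementation.
# Prints a banner, runs the given function, and reports a failure.
demo_stage() {
    local name="$1"
    echo "==> stage: $name"
    if ! "$@"; then
        echo "==> stage failed: $name" >&2
        return 1
    fi
}

# Hypothetical stage body used for the demo.
store_info_demo() { echo "recording build metadata"; }

demo_stage store_info_demo
```

Wrapping every high-level task this way makes the task boundaries easy to spot in a long build log.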

Fetch (hook)

The fetch() hook is optional, as a default is provided and should be used. It is called when fetching the source code of the package to build or to update. The repository is cloned into <WORKDIR>/repo and checked out as branch repo-HEAD. If we are performing a package update, then we also fetch the upstreams/develop branch into upstream-HEAD. The default should only be overridden when not fetching the package source from git.

Prepare (hook)

The prepare() hook is optional. It is called before calling the build hook and normally installs the build dependencies for the package.

Build (hook)

The build() hook is mandatory. It is responsible for building the package and storing the build products into packages/<package>/tmp/artifacts/.

Update Upstream (hook)

The update_upstream() hook should only be defined for third party packages that can be auto-updated. It is responsible for fetching the latest upstream source code on top of branch upstream-HEAD of our fetched repository in <WORKDIR>/repo. Note that any changes should be rebased on top of the upstreams/develop branch. If changes are detected, file <WORKDIR>/upstream-updated should be created.

Merge With Upstream (hook)

The merge_with_upstream() hook is called after the update_upstream() hook when a package is updated via sync-with-upstream.sh. Whereas update_upstream() updates the upstream-HEAD branch, merge_with_upstream() then merges the upstream-HEAD branch into the repo-HEAD branch. For most third-party packages this hook can be left unset, as the default will be used. It can be defined for packages that have a more complex merge strategy, such as the linux-kernel packages.

Checkstyle (hook)

The checkstyle() hook is optional. It is called before building the package if -c is provided to buildpkg.sh. Note that this hook isn't currently used by our build automation and is more of a prototype for an idea.

Fetch Dependencies

fetch_dependencies() is an immutable stage. It is called for fetching build artifacts from other linux-pkg packages that are required for performing the build. See the PACKAGE_DEPENDENCIES package variable for more info.

Store Build Info

store_build_info() is an immutable stage. It is called after the build() stage. It is responsible for storing some build info / metadata, such as the git hash used to perform the build. Some of the build info that is stored is used by build automation, so care must be exercised when modifying it.

Post Build Checks

post_build_checks() is an immutable stage. It is responsible for performing post-build checks that are common to all packages.

One of the checks verifies that each debian package produced has a copyright file associated with it in the right location. This file is used elsewhere in the product to generate the license information for the appliance. This check can be skipped for a package by defining SKIP_COPYRIGHTS_CHECK=true in its config file.

Package environment variables

In addition to any variables defined by the package itself, a few environment variables are set-up by the framework. Here is a quick list:

Package WORKDIR

Each package is fetched, built and updated in the directory linux-pkg/packages/<package>/tmp/, referred to as WORKDIR. Whenever a script is called to operate on a package, the WORKDIR directory is recreated and a linux-pkg/workdir symlink is created that points to this WORKDIR.
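The recreate-and-symlink behaviour can be sketched as follows. This is an illustration, not the framework's code: the function name reset_workdir is hypothetical, and using a relative symlink target is an assumption.

```shell
# Illustrative only: recreate a package's WORKDIR and repoint the workdir symlink.
# $1: package name, $2: path to the linux-pkg checkout (defaults to ".")
reset_workdir() {
    local pkg="$1" root="${2:-.}"
    # Refuse to run without a package name, since the path is removed below.
    [ -n "$pkg" ] || return 1
    local workdir="$root/packages/$pkg/tmp"
    rm -rf "$workdir"
    mkdir -p "$workdir"
    # -n replaces an existing symlink instead of descending into its target.
    ln -sfn "packages/$pkg/tmp" "$root/workdir"
}
```

Because WORKDIR is recreated on every run, anything worth keeping from a previous build should be copied out of packages/<package>/tmp/ first.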

The following sub-directories are created in WORKDIR:

The following files are created in WORKDIR:

The following files are used as status indicators in WORKDIR:

Adding new packages

When considering adding a new package, the workflow will depend on whether the package is a third-party package or in-house package.

Note: If you are thinking of adding a new package to this framework, you should first read the Delphix Open-Source Policy.

Third-party package

Step 1. Pick a name for the package

If the package is already provided by Ubuntu, it's recommended to use the source package as the package name. You can get the source package name for a given package by running:

sudo apt update
sudo apt show <package name> | grep Source

It is possible that the source package is not provided and so the command above will not return anything, in which case you can use <package name> as the name of the package.
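The lookup and its fallback can be combined in a small filter. This is a sketch with a hypothetical function name (source_package_name); it assumes the Source field, when present, appears on its own line as "Source: <name>".

```shell
# Hypothetical helper: read `apt show` output on stdin and print the source
# package name, falling back to the binary package name given as $1.
source_package_name() {
    local fallback="$1" src
    # Take the first word after "Source:", ignoring any trailing version.
    src="$(awk '/^Source:/ { print $2; exit }')"
    echo "${src:-$fallback}"
}

# Example (commented out because it queries the apt cache):
#   apt show cloud-init 2>/dev/null | source_package_name cloud-init
```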

Once you've decided on a package name (we shall refer to it as <package>), create a directory for it: packages/<package>/.

Step 2. Create stub for config.sh

Next step is to create a new file: packages/<package>/config.sh. You can copy the template from template/config.sh. To get started, all we need to provide is info on where to fetch the upstream source code from.

If you are using an Ubuntu source package, you'll only need to specify the name of the source package:

UPSTREAM_SOURCE_PACKAGE="<source package name>"

If the upstream source code is instead to be retrieved from a git repository, then you need to provide the git details:

UPSTREAM_GIT_URL="<git url>"
UPSTREAM_GIT_BRANCH="<git branch>"

Step 3. Fetch the upstream source

Note that steps 3 to 5 are most useful when getting a third party package from an Ubuntu source package. When the third party package is fetched from git, you may simply fork the upstream repository and add an upstreams/develop branch that points to the develop branch; you can then update DEFAULT_PACKAGE_GIT_URL in config.sh to your forked git repository and skip to step 6.

You can fetch the upstream source code from an Ubuntu source package by running:

cd packages/<package>/tmp/
mkdir source
cd source
apt-get source <upstream-source-package>
cd ..
mv source/"<upstream-source-package>"*/ repo
cd repo
git init
git checkout -b repo-HEAD
git add -f .
git commit -m '<insert commit message here>'

TODO: create a command that will run the steps above. It used to be done by buildpkg.sh -i, but this logic has been removed.
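Until such a command exists, the git part of the steps above could be wrapped in a function along these lines. This is a sketch, not a replacement for the removed buildpkg.sh -i logic: the function name import_tree_as_repo is hypothetical, and it assumes a git identity is already configured.

```shell
# Hypothetical helper: turn an unpacked source tree into a fresh git repository
# on branch repo-HEAD, mirroring the manual git steps above.
# $1: unpacked source directory, $2: destination repo directory, $3: commit message
import_tree_as_repo() {
    local src="$1" repo="$2" msg="$3"
    mv "$src" "$repo" || return 1
    (
        cd "$repo" &&
            git init -q &&
            git checkout -q -b repo-HEAD &&
            git add -f . &&
            git commit -q -m "$msg"
    )
}

# Example (commented out because it downloads a source package):
#   mkdir -p packages/<package>/tmp/source && cd packages/<package>/tmp/source
#   apt-get source <upstream-source-package>
#   cd .. && import_tree_as_repo source/<upstream-source-package>*/ repo '<commit message>'
```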

Step 4. Create a developer repository

The next steps will require you to provide a git repository for your local version of the package. For development purposes you should create an empty repository on github, and then put the url into config.sh. Note that the URL should start with https://.

e.g.

DEFAULT_PACKAGE_GIT_URL="https://github.com/<developer>/<package>"

Step 5. Push to your developer repository

The next step is to push the upstream code to your newly created developer repository. You should push the initial commit to both the develop branch and the upstreams/develop branch.
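Pushing the same commit to both branches can be done with a single git push. The function below is a sketch with a hypothetical name (push_initial_branches); it assumes the initial commit lives on the local repo-HEAD branch created in step 3.

```shell
# Hypothetical helper: push the initial commit on repo-HEAD to both branches
# of the developer repository.
# $1: local repo directory, $2: URL of the developer repository
push_initial_branches() {
    local repo="$1" url="$2"
    git -C "$repo" push -q "$url" \
        repo-HEAD:refs/heads/develop \
        repo-HEAD:refs/heads/upstreams/develop
}
```

For example: push_initial_branches packages/<package>/tmp/repo "https://github.com/<developer>/<package>".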

Step 6. Build the package

In this step you'll need to define a few hooks in config.sh. In the hooks you can leverage convenience functions provided by lib/common.sh.

To build the package you'll most likely need to install some build dependencies. If that is the case, you should add a prepare() hook that will install those build dependencies. For an Ubuntu source package, those dependencies can be installed by calling install_build_deps_from_control_file(). For other packages, you can usually find the build dependencies in the project's README. It is recommended to edit the debian/control file of the package to list the required build dependencies, so that install_build_deps_from_control_file() can be used. Otherwise, you can also use the install_pkgs() lib function to install packages.

Next step is to add a build() hook. It is recommended to use the dpkg_buildpackage_default() function.

Note that if you are using an Ubuntu source package, you should now be ready to build the package.

For a package that doesn't have a debian metadata directory already defined in its source tree, you'll need to create it, and push the changes to the develop branch of your developer repository. See Common Steps > Creating debian metadirectory for more details.

Once this is all ready, you can try building the package by running:

./buildpkg.sh <package>
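Putting steps 2 to 6 together, a config.sh for a package tracked from an Ubuntu source package might look like the sketch below. The angle-bracket values are placeholders to replace, and the hook bodies only call the lib/common.sh functions named above; any further arguments or setup those functions may need is not covered here.

```shell
# packages/<package>/config.sh: sketch for a package tracked from an Ubuntu
# source package. Placeholder values must be replaced before use.
DEFAULT_PACKAGE_GIT_URL="https://github.com/<developer>/<package>"
UPSTREAM_SOURCE_PACKAGE="<source package name>"

prepare() {
    # Install the build dependencies listed in the package's debian/control.
    install_build_deps_from_control_file
}

build() {
    # Build the debian packages using the helper provided by lib/common.sh.
    dpkg_buildpackage_default
}
```

The hooks are only meaningful when the file is sourced by the framework, since the helper functions come from lib/common.sh.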

Step 7. Make the package auto-updatable

If you want the package to be automatically updated with upstream (strongly recommended), you'll need to add the update_upstream() hook to config.sh. You should use the following functions provided by lib/common.sh:

Step 8. Make the package official

See Common Steps > Make the package official.

Step 9. Submit a Pull Request for the new package

Once you verify that the package can be built, submit a pull request with only the /packages/<package> changes. Once merged, this change will create a Jenkins job (/linux-pkg/develop/build-package/<package>/pre-push) to build the new package as part of the automation pipeline. Without merging these changes first, we will not be able to test the automated build and integration into the appliance.

Step 10. Add package to package-lists

In a separate change, add the package to package-lists. See Common Steps > Add package to package-lists.

Step 11. Test your changes

See section Testing Your Changes.

Step 12. Submit a Pull-Request to add the package to package-lists

IMPORTANT: This is the step which will trigger integration of this package into the appliance.

In-house package

Steps for adding an in-house package are slightly different than for a third-party package.

This example assumes that the source code for the project is already present in a git repository and contains a Makefile with instructions to compile the project. If the debian metadata directory is not in the source tree, see Common Steps > Creating debian metadirectory.

Step 1. Create config.sh

We will refer to the name you picked for your package as <package>. Make sure the name doesn't conflict with an existing Ubuntu package.

You'll need to create a new directory: packages/<package>/ and add a new config.sh file in it. You can copy the template from template/config.sh. In config.sh, you'll need to define two variables:

e.g.:

DEFAULT_PACKAGE_GIT_URL="https://github.com/delphix/<package>"

Step 2. Add package hooks

If your package needs some build dependencies, you'll want to add a prepare() hook to config.sh which will install those build dependencies. It is recommended to use the install_pkgs() function provided by lib/common.sh. Next step is to add a build() hook. It is recommended to use the dpkg_buildpackage_default() function provided by lib/common.sh.

Once those hooks are set-up, you can try building your package by running:

./buildpkg.sh <package>
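For an in-house package, the hooks from step 2 could be sketched as follows. This is an assumption-laden example: it supposes that install_pkgs() takes the names of the packages to install as arguments, and build-essential and debhelper merely stand in for your project's real build dependencies.

```shell
# packages/<package>/config.sh: sketch for an in-house package.
DEFAULT_PACKAGE_GIT_URL="https://github.com/delphix/<package>"

prepare() {
    # Assumption: install_pkgs takes the package names to install as arguments.
    install_pkgs build-essential debhelper
}

build() {
    # Build the debian packages using the helper provided by lib/common.sh.
    dpkg_buildpackage_default
}
```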

Step 3. Make the package official

See Common Steps > Make the package official

Step 4. Submit a Pull-Request for your changes to linux-pkg

Common Steps

Those steps apply to both third-party and in-house packages.

Creating debian metadirectory

You can refer to the Debian Maintainer Guide here. Note that packages built by gradle, such as the delphix-sso-app, do not require a debian metadirectory.

Add package to package-lists

See the Package Lists section for more info.

Make the package official

Once your new package builds and has been tested in the product, the next step is to create an official repository for it.

  1. First, you should read Delphix Open-Source Policy if you haven't already, and provide the necessary info so that a github.com/delphix/<package> repository can be created for it. You'll need to push the develop branch from your developer repository, as well as the upstreams/develop branch if it is a third-party package. Note that if you have modified develop (i.e. it diverges from upstreams/develop), you should submit your changes for review before pushing them.

  2. If this is a third-party package that is to be auto-updated by Delphix automation, you should also make sure the github.com/delphix-devops-bot user is added as a collaborator to the repository.

  3. Update DEFAULT_PACKAGE_GIT_URL in packages/<package>/config.sh to the official repository.

Testing your changes

Testing changes to an existing package

If you are not making any changes to linux-pkg, only changes to a given package managed by linux-pkg:

  1. Run git-ab-pre-push from your package's repository.

TODO: complete section

Testing changes to linux-pkg

If you are testing a newly added package to linux-pkg:

  1. Run git-ab-pre-push -b <package> from linux-pkg.

Note that this package must already have been added to /packages/ in the develop branch, so that its specific Jenkins build job exists.

Package Lists

Package lists are basically just lists of packages defined in linux-pkg. They are mainly consumed by the Jenkins build infrastructure by calling the ./query-packages.sh utility. Jenkins needs to know which packages to build and include for a given version of the Delphix appliance.

Package lists are stored under ./package-lists, in two sub-directories: build and update. The build directory contains packages that are built and consumed by the Delphix Appliance, while the update directory contains a list of packages that are automatically synced with the upstream projects.

There are two physical build lists:

There's also a virtual build list, called "linux-kernel", which lists all the linux kernel packages built by linux-pkg (one for each supported flavour of the linux kernel). You can list the contents of the virtual list by running:

./query-packages.sh list linux-kernel

There is a single update list called main.pkgs, which contains all the packages that are auto-updated nightly by Jenkins. Note that zfs is not in that list as it has a dedicated Jenkins job that tracks the upstream repository and launches as soon as there are new changes.

Most third-party packages should have an update_upstream() hook defined and be added to that list.

Versions and Branches

The framework is designed in a way to allow easy integration with the Delphix release process. The idea is that both the package build artifacts (.debs and .ddebs) and package source code should be available for each Delphix release. This should hold for both in-house and third-party packages.

Regarding the build artifacts, those should be taken care of by the existing Delphix build artifacts storage policy, available here. The relevant code for managing the build artifacts is outside of the scope of this project and lies in the devops-gate.

Regarding the source code, we expect each package repository and the linux-pkg repository itself to follow the Delphix branching policy outlined here. When creating a new branch or release for the Delphix Appliance, an external script should create the relevant branch or tag for each repository. The branch or tag should then be passed to the build in the DEFAULT_GIT_BRANCH environment variable.

Future work

When building packages for an older version of the Delphix Appliance, the build image will need to be picked accordingly. We are currently using dlpx-internal-buildserver-develop, but this will not be the case anymore once we switch to a newer Ubuntu distribution.

Contributing

All contributors are required to sign the Delphix Contributor Agreement prior to contributing code to an open source repository. This process is handled automatically by cla-assistant. Simply open a pull request and a bot will automatically check to see if you have signed the latest agreement. If not, you will be prompted to do so as part of the pull request process.

This project operates under the Delphix Code of Conduct. By participating in this project you agree to abide by its terms.

Statement of Support

This software is provided as-is, without warranty of any kind or commercial support through Delphix. See the associated license for additional details. Questions, issues, feature requests, and contributions should be directed to the community as outlined in the Delphix Community Guidelines.

License

This code is licensed under the Apache License 2.0. Full license is available here.