NerdWalletOSS / shepherd

A utility for applying code changes across many repositories.
Apache License 2.0

Shepherd

Illustration of a sheep


Shepherd is a utility for applying code changes across many repositories.

For more high-level context, this blog post covers the basics.

Getting started

Install the Shepherd CLI:

npm install -g @nerdwallet/shepherd

If using GitHub Enterprise, ensure the following environment variables are exported:

export SHEPHERD_GITHUB_ENTERPRISE_BASE_URL={company_github_enterprise_base_url} # e.g., github.com
export SHEPHERD_GITHUB_ENTERPRISE_URL={company_github_enterprise_url} # e.g., api.github.com/api/v3

If using ssh, ensure that your GITHUB_TOKEN is exported:

export GITHUB_TOKEN=<PAT>

Shepherd will now be available as the shepherd command in your shell:

shepherd --help
Usage: shepherd [options] [command]
...

Take a look at the tutorial for a detailed walkthrough of what Shepherd does and how it works, or read on for a higher-level and more brief look!

Go to tutorial →

Motivation for using Shepherd

Moving away from monorepos and monolithic applications has generally been a good thing for developers because it allows them to move quickly and independently from each other. However, it's easy to run into problems, especially if your code relies on shared libraries. Specifically, making a change to shared code and then trying to roll that shared code out to all consumers of that code becomes difficult:

Shepherd aims to help shift responsibility for the first three steps to the person actually making the change to the library. Since they have the best understanding of their change, they can write a code migration to automate that change and then use Shepherd to automate the process of applying that change to all relevant repos. Then the owners of the affected repos (who have the best understanding of their own code) can review and merge the changes. This process is especially efficient for teams who rely on continuous integration: automated tests can help repository owners have confidence that the code changes are working as expected.

Migration Configuration Schema

Example

A migration is declaratively specified with a shepherd.yml file called a spec. Here's an example of a migration spec that renames .eslintrc to .eslintrc.json in all NerdWallet repositories that have been modified in 2018:

id: 2018.07.16-eslintrc-json
title: Rename all .eslintrc files to .eslintrc.json
adapter:
  type: github
  search_type: code
  search_query: org:NerdWallet path:/ filename:.eslintrc
hooks:
  should_migrate:
    - ls .eslintrc # Check that this file actually exists in the repo
    - git log -1 --format=%cd | grep 2018 --silent # Only migrate things that have seen commits in 2018
  post_checkout: npm install
  apply: mv .eslintrc .eslintrc.json
  pr_message: echo 'Hey! This PR renames `.eslintrc` to `.eslintrc.json`'
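When the should_migrate checks grow beyond one-liners, it can help to move them into a script. The following is a minimal sketch, not part of Shepherd: the function name should_migrate and its two arguments are hypothetical, and it assumes (as the inline comments in the spec suggest) that a non-zero exit status means the repository is skipped.

```shell
#!/bin/sh
# Hypothetical helper (not part of Shepherd): combines the two should_migrate
# checks from the example spec into one function.
# $1: repository directory; $2: year the most recent commit must fall in.
should_migrate() {
  repo_dir=$1
  year=$2
  # Check that .eslintrc actually exists in the repo.
  [ -f "$repo_dir/.eslintrc" ] || return 1
  # Only migrate repos whose most recent commit date mentions the given year.
  git -C "$repo_dir" log -1 --format=%cd | grep -q "$year"
}
```

Saved next to the spec, a script like this could be referenced from a hook through $SHEPHERD_MIGRATION_DIR, in the same way as the pr.sh example described under Environment Variables.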

Fields

Hooks

Hooks define the core functionality of a migration in Shepherd.

Requirements

Environment Variables

Shepherd exposes some context to each command via specific environment variables. Some additional environment variables are exposed when using the git or github adapters.

| Environment Variable | Default | Description |
| --- | --- | --- |
| SHEPHERD_REPO_DIR | ~/.shepherd | The absolute path to the repository being operated on. This will be the working directory when commands are executed. |
| SHEPHERD_DATA_DIR | ~/.shepherd | The absolute path to a special directory that can be used to persist state between steps. This would be useful if, for instance, a jscodeshift codemod in your apply hook generates a list of files that need human attention and you want to use that list in your pr_message hook. |
| SHEPHERD_BASE_BRANCH | default branch | The name of the branch Shepherd will set up a pull request against. This will often, but not always, be main. Only available for apply and later steps. |
| SHEPHERD_MIGRATION_DIR | path to migration spec | The absolute path to the directory containing your migration's shepherd.yml file. This is useful if you want to include a script with your migration spec and need to reference that script in a hook. For instance, if you have a script pr.sh that generates a PR message, your pr_message hook might look something like this: pr_message: $SHEPHERD_MIGRATION_DIR/pr.sh |
| SHEPHERD_GIT_REVISION | | (git and github adapters) The current revision of the repository being operated on. |
| SHEPHERD_GITHUB_REPO_OWNER | | (github adapter) The owner of the repository being operated on. For example, if operating on the repository https://github.com/NerdWalletOSS/shepherd, this would be NerdWalletOSS. |
| SHEPHERD_GITHUB_REPO_NAME | | (github adapter) The name of the repository being operated on. For example, if operating on the repository https://github.com/NerdWalletOSS/shepherd, this would be shepherd. |
| SHEPHERD_GITHUB_ENTERPRISE_URL | api.github.com | For GitHub Enterprise, export this variable containing the company's GitHub Enterprise URL (e.g., api.github.com/api/v3). |
| SHEPHERD_GITHUB_ENTERPRISE_BASE_URL | api.github.com | For GitHub Enterprise, export this variable containing the company's GitHub Enterprise base URL. |
| SHEPHERD_GITHUB_PROTOCOL | ssh | (github adapter) The protocol to use when cloning repos. Can be https or ssh. https is useful when ssh is firewalled. |
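The SHEPHERD_DATA_DIR use case described above (an apply step that records files needing attention, and a pr_message step that reports them) could be sketched roughly as follows. The function names and the needs-attention.txt file name are assumptions made for illustration; the only thing taken from Shepherd is the SHEPHERD_DATA_DIR variable and the convention, visible in the example spec, that pr_message prints the message to standard output.

```shell
#!/bin/sh
# Hypothetical hook helpers; the file name needs-attention.txt is an assumption.

# Intended for the apply hook: record one file that needs human attention.
record_attention() {
  printf '%s\n' "$1" >> "$SHEPHERD_DATA_DIR/needs-attention.txt"
}

# Intended for the pr_message hook: print the PR body on standard output.
pr_message() {
  echo "This PR was generated by a Shepherd migration."
  list="$SHEPHERD_DATA_DIR/needs-attention.txt"
  # Append the list only if the apply step recorded at least one file.
  if [ -s "$list" ]; then
    echo
    echo "These files need manual review:"
    sed 's/^/- /' "$list"
  fi
}
```

Keeping the list in SHEPHERD_DATA_DIR rather than in the repository directory avoids committing the scratch file along with the migration's changes.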

Additional Context:

When working with GitHub, the API endpoints differ based on whether you are using GitHub's public cloud service, GitHub Enterprise Cloud, or GitHub Enterprise Server (on-premises). Here’s how the endpoints vary:

In each case, the API functionality and how you interact with it are largely the same, but the base URL changes based on where your GitHub instance is hosted.

As a result, the value of SHEPHERD_GITHUB_ENTERPRISE_URL depends on the type of GitHub service in use. It is used for all GitHub API calls. Cloning is the exception: it is configured separately via SHEPHERD_GITHUB_ENTERPRISE_BASE_URL. Although github.com works as a base URL across GitHub service types, SHEPHERD_GITHUB_ENTERPRISE_BASE_URL defaults to api.github.com for backwards compatibility.
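To check which values are in effect before running a migration, a small preflight helper like the one below may be handy. It is only a sketch that mirrors the defaults documented in the table above; it is not Shepherd's own resolution logic, the function name is hypothetical, and it treats an empty variable the same as an unset one, which is an assumption.

```shell
#!/bin/sh
# Illustrative only: mirrors the documented defaults, not Shepherd's own code.
print_github_settings() {
  # Host used for GitHub API calls.
  echo "api_url=${SHEPHERD_GITHUB_ENTERPRISE_URL:-api.github.com}"
  # Host used when cloning repositories.
  echo "base_url=${SHEPHERD_GITHUB_ENTERPRISE_BASE_URL:-api.github.com}"
  # Protocol used when cloning: ssh or https.
  echo "protocol=${SHEPHERD_GITHUB_PROTOCOL:-ssh}"
}
```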

Usage

Shepherd is run as follows:

shepherd <command> <migration> [options]

<migration> is the path to your migration directory containing a shepherd.yml file.

There are a number of commands that must be run to execute a migration:

By default, checkout will use the adapter to figure out which repositories to check out, and the remaining commands will operate on all checked-out repos. To check out only a specific repo, or to operate on only a subset of the checked-out repos, use the --repos flag, which takes a comma-separated list of repos:

shepherd checkout path/to/migration --repos facebook/react,google/protobuf
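For longer lists it may be easier to keep the repos in a file and build the --repos value from it. The helper below is a hypothetical convenience, not something Shepherd provides; it assumes a file with one owner/name entry per line and skips blank lines.

```shell
#!/bin/sh
# Hypothetical helper: turn a file with one owner/name per line into the
# comma-separated value expected by --repos. Blank lines are skipped.
repos_arg() {
  grep -v '^[[:space:]]*$' "$1" | paste -s -d, -
}

# Possible usage, with repos.txt as an assumed file name:
#   shepherd checkout path/to/migration --repos "$(repos_arg repos.txt)"
```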

Run shepherd --help to see all available commands and descriptions for each one.

Developing

Run npm install to install dependencies.

Shepherd is written in TypeScript, which requires compilation to JavaScript. When developing Shepherd, it's recommended to run npm run build:watch in a separate terminal. This will incrementally compile the source code as you edit it. You can then invoke the Shepherd CLI by referencing the path to the compiled cli.js file:

cd ../my-other-project
../shepherd/lib/cli.js checkout path/to/migration

Shepherd currently has minimal test coverage, but we're aiming to improve that with each new PR. Tests are written with Jest and should be named *.test.ts and placed alongside the file under test. To run the test suite, run npm run test.

We use ESLint to ensure a consistent coding style and to help prevent certain classes of problems. Run npm run lint to run the linter, and npm run fix-lint to automatically fix applicable problems.

Credits

  1. Logo designed by Christopher Wharton.