buildkite / feedback

Got feedback? Please let us know!
https://buildkite.com
25 stars 24 forks source link

Expose SHA of branch before/after push, when available #397

Open bjeanes opened 6 years ago

bjeanes commented 6 years ago

I.e., at least for GitHub: https://developer.github.com/v3/activity/events/types/#pushevent contains both head (the SHA being built, which is exposed via BUILDKITE_COMMIT) and before which is the SHA of the branch prior to the push.

I have a script that I'd like to use to analyse/report on just the range of commits that have been pushed (when such information is available).

Is this something that could be exposed easily as an environment variable, if/when the build is triggered from a push event?

adamcharnock commented 6 years ago

I would like this as well. It would be very useful to at least have access to the first & last commit SHA.

I'm using a monorepo setup. When a commit first comes in, the 'detect changes' script runs. Currently it can only detect the changes made in the latest commit, even if there are many more files modified in previous commits within the same push.

Based on the changes detected, the script then triggers of child builds for affected projects within the monorepo.

sj26 commented 6 years ago

There's a sneaky way to do this already for builds triggered by webhooks — you can access the whole webhook for the current build using the GraphQL API, like:

query {
  build(uuid: "123e4567-e89b-12d3-a456-426655440000") {
    source {
      ...on BuildSourceWebhook {
        payload
      }
    }
  }
}

The payload is the full JSON payload from github.com, or the repository provider.

The build uuid parameter is the $BUILDKITE_BUILD_ID provided in the build environment.

rafaelmagu commented 6 years ago

That's handy, but means I now have to write and maintain an API client in all my pipelines when a simple, agent-provided variable would've sufficed.

Close, but no cigar.

keithpitt commented 6 years ago

@rafaelmagu good thing I don't smoke!

If we added every bit of data as an environment variable, we'd have 100+ environment variables which would become pretty unwieldy after a while...

We try to keep the environment variables useful for the majority of cases, and make sure we have APIs in place so that a pipeline can follow up on extra data if we don't surface it at an ENV level.

The other problem we have is that not all source code providers are created equal, so GitHub may surface the data in webhooks, but Bitbucket won't. And not all GitHub webhooks are created equal, so some include the data we need, and some don't.

@rafaelmagu would love to hear more about what you're trying to do, and why you need the data. Understanding that helps us figure out what you need and how we can deliver it!

rafaelmagu commented 6 years ago

I understand the difficulty regarding supporting different providers. My use case:

I have a repository of Packer templates. And I'd like to only build the templates that were changed since the last Buildkite build. I'm trying to detect the changes between commits with Shell logic like:

changes=$(git diff --name-only HEAD~1 | egrep "^(scripts/common|\.buildkite|templates/base|templates/${1}|scripts/${1})")
  if [ $changes -gt 0 ]; then
    echo "--- :packer: No impacting changes detected for ${1} -- skipping build"
    exit 0
  fi

(the script above is called with a template name)

If at least buildkite-agent supported some kind of get-webhook command for the current build, I'd be able to fetch the payload to do this without needing to generate an API token, and it'd be safely sandboxed for each build.

sj26 commented 6 years ago

@rafaelmagu we do something similar as well. We use the Buildkite builds API plus some meta data and a little bit of git to look at the previous build to figure out if the packer template has changed. But it does require an api token, as you mentioned. Something like this, but we do use pipelines a bit more: https://gist.github.com/sj26/a251f1b3ec2454b07cc6ce8e79fca558

nhooyr commented 5 years ago

@keithpitt without exposing this information in an env variable, it makes it really hard for a monorepo pipeline to work because it doesn't know the last commit CI ran on.

lox commented 5 years ago

I'm not sure that even with this info it would give you that information @nhooyr. Don't you need to know the sha of the last successful build as a merge base? Otherwise you might skip failing tests from previous builds.

nhooyr commented 5 years ago

Don't mind skipping failing tests from previous builds because we revert on ci failure. Thus, in theory, if the previous commit had a failing test, its already been or will be reverted.

That is a little janky though so in the future we're planning on having a trigger pipeline that generates triggers for all the subpipelines that need to be tested based on the code that changed. This would allow us to skip failing tests in previous commits because the subpipeline will still show as failing.

nhooyr commented 5 years ago

Example script for anyone who runs into this in the future:

func base() string {
    branch := os.Getenv("BUILDKITE_BRANCH")
    defaultBranch := os.Getenv("BUILDKITE_PIPELINE_DEFAULT_BRANCH")

    if branch != "master" && branch != defaultBranch {
        // Non default and master branches always diff against the default branch
        // as in case a previous commit's tests failed, we don't want to allow merging the PR.
        return "origin/" + defaultBranch
    }

    buildID := os.Getenv("BUILDKITE_BUILD_ID")
    rebuiltID := os.Getenv("BUILDKITE_REBUILT_FROM_BUILD_ID")
    if rebuiltID != "" {
        // Webhook will only be available on previous build.
        buildID = rebuiltID
    }

    const q = `query Payload($uuid: ID) {
  build(uuid: $uuid) {
    source {
      ...on BuildSourceWebhook {
        payload
      }
    }
  }
}`
    jsonq, err := json.Marshal(map[string]interface{}{
        "query": q,
        "variables": map[string]string{
            "uuid": buildID,
        },
    })
    if err != nil {
        log.Fatalf("failed to get base for build with buildID %v: %v", buildID, err)
    }
    // See docs at https://buildkite.com/docs/apis/graphql-api
    r, err := http.NewRequest("POST", "https://graphql.buildkite.com/v1", bytes.NewReader(jsonq))
    if err != nil {
        log.Fatalf("failed to create HTTP request: %v", err)
    }
    r.Header.Set("Authorization", fmt.Sprintf("Bearer %v", os.Getenv("BUILDKITE_API_TOKEN")))

    hc := http.Client{
        Timeout: time.Second * 10,
    }

    resp, err := hc.Do(r)
    if err != nil {
        log.Fatalf("failed to perform buildkite API graphql query: %v", err)
    }
    defer resp.Body.Close()

    b, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        log.Fatalf("failed to read query response from buildkite: %v", err)
    }

    var qresp struct {
        Data struct {
            Build struct {
                Source struct {
                    Payload string `json:"payload"`
                } `json:"source"`
            } `json:"build"`
        } ` json:"data"`
    }
    err = json.Unmarshal(b, &qresp)
    if err != nil {
        log.Fatalf("failed to unmarshal query response: %v", err)
    }

    var payload struct {
        Before string `json:"before"`
    }
    err = json.Unmarshal([]byte(qresp.Data.Build.Source.Payload), &payload)
    if err != nil {
        log.Fatalf("failed to unmarshal payload: %v", err)
    }

    if payload.Before == "" {
        log.Fatalf("missing webhook payload on build and previous build")
    }

    return payload.Before
}
nhooyr commented 5 years ago

It unfortunately doesn't handle rebuilds properly though aside from the first one. Any rebuild after the first rebuild will just fail because the rebuild doesn't have any webhook information.

nhooyr commented 5 years ago

Just going to continue to enforce merge commits for now because we can just use the ancestor of a merge commit to figure out the base for a diff. That works well enough but I think this warrants a env variable given it gets complex to handle the edge cases like rebuilds.

twelvelabs commented 4 years ago

We try to keep the environment variables useful for the majority of cases

@keithpitt This feels like a pretty common use case to me. It's not unusual to want to know the range of commits for a given build (relative to the prior build). Both travis and circle surface this information:

Are y'all aware of any existing plugins or workarounds that might provide something similar? I'm finding myself having to bend over backwards trying to get a list of changed files (monorepo ✋) when git diff --name-only $COMMIT_RANGE would totally do the trick.

dominics commented 3 years ago

@twelvelabs We (I work with @rafaelmagu) eventually built https://github.com/vital-software/monofo to work around our actual problem: figuring out what to compare against when building a mono-repo.

Specifically, even if you don't want to buy into the whole monofo generator, just running npx monofo base-commit (with the standard BUILDKITE_* environment variables, and a BUILDKITE_API_ACCESS_TOKEN so we can check for previous actual builds) will spit out a commit hash that serves pretty well as the base of a TRAVIS_COMMIT_RANGE-style commit range (even though it's slightly more complicated, because it always returns a commit that has a successful build - for other reasons like snarfing artifacts)

DEBUG='*' tells you what it's doing, but the basic algorithm is:

This approach wasn't that hard to work out, and if you don't need to select a successful build and commit for the base it's might be even easier and shouldn't require Buildkite API access. In terms of downsides, it does mean some repeated work when building a feature branch repeatedly (because the commit range always covers the whole branch), we ended up tackling that separately (with caching build steps against the contents of files)

mtsgrd commented 2 years ago

The lack of a commit range feels like it makes Buildkite a non-starter for monorepo projects.

.. when git diff --name-only $COMMIT_RANGE would totally do the trick.

Exactly this