golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.42k stars 17.71k forks source link

x/build: make a bot review CLs for common mistakes (like Tricorder) #18548

Open bradfitz opened 7 years ago

bradfitz commented 7 years ago

Tracking bug to write a bot to review CLs for common mistakes:

And whatever else we add in the future.

(forked from email thread https://groups.google.com/forum/#!topic/golang-dev/KpHMoePhg6c)

/cc @kevinburke @rsc

dsnet commented 7 years ago

What about lint and vet?

bradfitz commented 7 years ago

What about lint and vet?

Sure. I imagine we'd start with some basic things and once we have it automated, we can add more to it later. I wouldn't make lint & vet a requirement for the initial deployment.

kevinburke commented 7 years ago

IIRC @josharian has been working on getting golang/go to pass vet, but it doesn't currently, and relies on an "exemption" file for places/test files that don't pass vet on purpose.

mvdan commented 7 years ago

gofmt

Perhaps with -s?

properly-formatted references to CLs and bugs

Github has various accepted formats for these (https://help.github.com/articles/closing-issues-via-commit-messages/). I assume you want to limit it to "Fixes #n", though?

bradfitz commented 7 years ago

@mvdan, the two common mistakes I'm thinking of:

  1. people write "Fixes #nnn" when they're hacking in a subrepo, and thus Github requires the fully-qualifed "Fixes golang/go#nnn" format, since Go only uses one massive bug tracker, instead of per-repo bug trackers.
  2. people sometimes accidentally refer to CL numbers with the # sign, spamming ancient bugs. They should write "CL nnnn" or golang.org/cl/nnnn instead of "CL #nnn" or "change #nnn".
ianlancetaylor commented 7 years ago

Another common mistake is using a git SHA when they should use a CL number.

mvdan commented 7 years ago

Another thing that comes to mind is gerrit/github links that should be golang.org/issue/X or golang.org/cl/Y.

minux commented 7 years ago

I'd also like the bot to catch accidentally committing a binary, which happened at least five times (the following list is not comprehensive as I just looked at the top 12 largest files):

https://golang.org/cl/4287056 1179384 bytes https://golang.org/cl/4406044 3500057 bytes https://golang.org/cl/9560 6084208 bytes https://golang.org/cl/19968 2243120 bytes

I also remember one time we rewrote the git history to remove one accidentally committed binary file.

I once suggested that we change Gerrit setting to reject binary files bigger than a certain size, but that was rejected.

If we make such a pre-check bot, we can at least make it point out a binary file is included (and its size). This will be a warning.

Another idea about spamming old bugs, also checks for tiny issue numbers (say, less than 100), because people sometimes use #1 as bulletpoints and some test output use #X as a short "Number X", I remember runtime/pprof tests do this.

minux commented 7 years ago

On Fri, Jan 6, 2017 at 4:31 PM, Ian Lance Taylor notifications@github.com wrote:

Another common mistake is using a git SHA when they should use a CL number.

I'd like the bot to be more helpful and actually figure out the correct CL number for the git commit hash and suggest its use.

bradfitz commented 7 years ago

I also remember one time we rewrote the git history to remove one accidentally committed binary file.

I have no memory of this and am pretty sure we have never rewritten the Git history.

ianlancetaylor commented 7 years ago

We rewrote people's memory along with the git history, to simplify matters going forward.

minux commented 7 years ago

On Fri, Jan 6, 2017 at 4:57 PM, Brad Fitzpatrick notifications@github.com wrote:

I also remember one time we rewrote the git history to remove one accidentally committed binary file.

I have no memory of this and am pretty sure we have never rewritten the Git history.

Sorry, my memory is indeed off. It's not to remove a binary file, but to revert a bad merge.

https://groups.google.com/d/msg/golang-dev/2wkzNtLKWyI/bl3wVhNAIm4J

It is to remove these two commits: https://golang.org/cl/5351/, commit https://go.googlesource.com/go/+/c21f1d5ef30ff52cb42fca146a9c7161dfee5c3c https://golang.org/cl/5353/, commit https://go.googlesource.com/go/+/e5048a5eaabdb86156d8de1a97d32eb898560e50

LionNatsu commented 7 years ago

Maybe we can add some pre-check in git-codereview? After change and before mail

kevinburke commented 7 years ago

@lionnatsu from the email thread:

We could have a bot that leaves comments on Gerrit for common mistakes. I'd prefer that over CLI tooling. (Not everybody uses git-codereview, especially if we start accepting Github PRs)

LionNatsu commented 7 years ago

@kevinburke Oh, indeed.

shawnps commented 7 years ago

I'm interested in working on this (I wrote Go Report Card https://goreportcard.com/ with @hermanschaaf which does a lot of this too).

Edit: the Go Report Card reference was just to mention that I've worked on a similar problem before and could maybe borrow some code from there.

bradfitz commented 7 years ago

@shawnps, the first thing to figure out is where this runs and how, and how it interacts with Gerrit.

We already have one process polling Gerrit (the build coordinator's "watcher"), and if we revamp dev.golang.org that would be a second. This would be a third. I suspect we'll want to make this bot interact with the proposed dev.golang.org all-Github-and-Gerrit-in-RAM server, and then this bot could long-poll our new "Github/Gerrit Model Server" waiting for changes (new or updated CLs).

But in the meantime I suppose it could run manually every N minutes. Please use https://godoc.org/golang.org/x/build/gerrit which we already use. If something's missing, modify that package as needed.

You'll want to store state somewhere. It's easy to find CLs the bot's already left comments on, but the bot should probably be silent and not +1 the bug if there's nothing to say, which means you'll need to store state about which bugs you've seen and are happy about. That should probably go into Google Cloud Datastore (https://cloud.google.com/datastore/) which we already use in the Go build system (see "git grep datastore" in src/golang.org/x/build/cmd/coordinator), as opposed to files on the local filesystem. (Filesystems come and go as we destroy and recreate the world)

shawnps commented 7 years ago

@bradfitz thanks, I was just about to ask about many of the details you just listed. Taking a look at https://godoc.org/golang.org/x/build/gerrit now.

That should probably go into Google Cloud Datastore (https://cloud.google.com/datastore/) which we already use in the Go build system (see "git grep datastore" in src/golang.org/x/build/cmd/coordinator)

Found datastore reference here:

https://github.com/golang/build/blob/master/cmd/coordinator/gce.go#L119

So this particular app (coordinator) "runs on CoreOS on Google Compute Engine and manages builds using Docker containers and/or VMs as needed" - is the suggestion that the new bot would run on GCE or GAE in a similar manner to this one?

bradfitz commented 7 years ago

Yes, this would run on GCE somewhere.

I would start by making it a non-package main package that you can instantiate and run. Then make a tiny tiny package main runner program for it for development and local testing.

Once it's working, we can integrate that bot into something.

I also imagine that some helper environment will need to be described by a Dockerfile, since you'll want things like golint and go vet available.

shawnps commented 7 years ago

Okay thanks. As a first step I'm currently trying to figure out how to grab files for a specific change on Gerrit in order to run gofmt on them.

bradfitz commented 7 years ago

Okay thanks. As a first step I'm currently trying to figure out how to grab files for a specific change on Gerrit in order to run gofmt on them.

Click "Download" in the web UI for a hint:

git fetch https://go.googlesource.com/crypto refs/changes/79/34779/8 && git cherry-pick FETCH_HEAD

You can have the bot do a shallow git clone of the parent git commit, and then chrery-pick atop that.

Alternatively, there's an API to get files from Gerrit at a specific revision as a tarball. See the cmd/coordinator code:

func getSourceTgzFromGerrit(repo, rev string) (tgz []byte, err error) {
        return getSourceTgzFromURL("gerrit", repo, rev, "https://go.googlesource.com/"+repo+"/+archive/"+rev+".tar.gz")
}
shawnps commented 7 years ago

@bradfitz thanks, I had gotten as far as getting the FetchInfo:

package main

import (
    "fmt"
    "log"

    "golang.org/x/build/gerrit"
)

func main() {
    c := gerrit.NewClient("https://go-review.googlesource.com", gerrit.NoAuth)
    cis, err := c.QueryChanges("34871", gerrit.QueryChangesOpt{
        N: 5000,
        Fields: []string{
            "CURRENT_FILES",
            "CURRENT_REVISION",
        },
    })
    if err != nil {
        log.Fatal(err)
    }
    for _, r := range cis[0].Revisions {
        for _, f := range r.Fetch {
            fmt.Println(f)
        }
    }
}

(I know the cis[0] could panic if nothing is returned, this was just a quick test to see how far I could get right now)

I guess I can use this to get this piece: refs/changes/79/34779/8 and then do the cloning bit as you mentioned.

shawnps commented 7 years ago

@bradfitz let me know if that looks like I'm on the right track, will be offline soon but I'll pick this back up in the morning. I appreciate all the feedback and help, thanks!

bradfitz commented 7 years ago

Rather than do a query for "34871", you probably just want to load the detail of a single change using https://godoc.org/golang.org/x/build/gerrit#Client.GetChangeDetail

shawnps commented 7 years ago

@bradfitz any idea if there's a reason why GetChangeDetail doesn't take an optional gerrit.QueryChangesOpt? I was planning to go the tar.gz route but that requires the revision, which doesn't seem to be accessible unless you specify "CURRENT_REVISION".

bradfitz commented 7 years ago

IIRC, GetChangeDetail returns everything already, so there's no need to selectively say what you want?

If not, we can modify that package as needed.

shawnps commented 7 years ago

@bradfitz I will dig a bit more but I think I need the current revision in order to get the tarball, and current revision only appears in ChangeInfo if you request it:

https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#change-info

see "current_revision":

"Only set if the current revision is requested or if all revisions are requested."

But maybe this is my misunderstanding of how revisions work in Gerrit.

shawnps commented 7 years ago

@bradfitz I was able to get the revision info with this patch:

diff --git a/gerrit/gerrit.go b/gerrit/gerrit.go
index fbdc7a8..b983d90 100644
--- a/gerrit/gerrit.go
+++ b/gerrit/gerrit.go
@@ -345,9 +345,19 @@ func (c *Client) QueryChanges(q string, opts ...QueryChangesOpt) ([]*ChangeInfo,
 // GetChangeDetail retrieves a change with labels, detailed labels, detailed
 // accounts, and messages.
 // For the API call, see https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#get-change-detail
-func (c *Client) GetChangeDetail(changeID string) (*ChangeInfo, error) {
+func (c *Client) GetChangeDetail(changeID string, opts ...QueryChangesOpt) (*ChangeInfo, error) {
+   var opt QueryChangesOpt
+   switch len(opts) {
+   case 0:
+   case 1:
+       opt = opts[0]
+   default:
+       return nil, errors.New("only 1 option struct supported")
+   }
    var change ChangeInfo
-   err := c.do(&change, "GET", "/changes/"+changeID+"/detail")
+   err := c.do(&change, "GET", "/changes/"+changeID+"/detail", urlValues{
+       "o": opt.Fields,
+   })
    if err != nil {
        return nil, err
    }

without this, I've found ci.Revisions to be empty

shawnps commented 7 years ago

CL made here for the above diff: https://go-review.googlesource.com/#/c/34922/

shawnps commented 7 years ago

conventional subject ("pkg/name: lowercase verb, not period at end")

Wrote a quick method to check subject here:

https://play.golang.org/p/UAUOI8u_98

josharian commented 7 years ago

@shawnps can you put these in a place where it is easier / more appropriate to review? Since I looked, some offhand comments: (1) when possible, fix instead of warning? (2) check for trailing whitespace? (3) I think there's something similar in git-codereview, might want to compare. (4) please check against past commit subjects in the repo for false positives. On phone but something like git log -format='%s' should do it. Sometimes we write things like "cmd/compile, runtime: do stuff". (5) consider structuring as parser+validator, so the parser can be reused for things like auto-triage.

shawnps commented 7 years ago

@josharian

can you put these in a place where it is easier / more appropriate to review?

sounds good and I was thinking the same thing, but not really sure where to put it. Any suggestions?

(1) when possible, fix instead of warning?

You mean like return new string with the problems fixed, for example if there is a period at the end, remove it?

JayNakrani commented 7 years ago

Another potential advance feature for the bot: Validate compiler directives (see #18331). Specifically, there were two common mistakes discussed in the issue.

  1. Typo in compiler directive: golang.org/cl/34377, golang.org/cl/34378.
  2. Accidentally line-wrapped compiler directive: golang.org/cl/12271.
josharian commented 7 years ago

not really sure where to put it. Any suggestions?

Maybe wherever the existing gopherbot (github CL mention bot) is, at least for now? Not sure where that is, but someone on this thread knows. :)

You mean like return new string with the problems fixed, for example if there is a period at the end, remove it?

Yeah. I was thinking that such a package could get used by the CL review bot to suggest fixes, and also incorporated into git-codereview to automatically fix on save, the way it auto-fixes #NNNN to golang/go#NNNN. And in fact, that check (and others?) might also want to get refactored out into this shared commit message validator.

Validate compiler directives

I'm a bit hesitant to immediately start pulling in vet, compiler directive checks, lint, spellcheckers, deadcode, errcheck, etc. It's not obvious which of those belong as CL bots vs incorporated elsewhere into the build/test infrastructure. Let's start small, get it up and running and debugged and helping us, and then build on it when we have a bit more experience with it.

shawnps commented 7 years ago

@josharian @bradfitz Does making a new folder in golang.org/x/build/cmd make sense for this? Should I only push the code up when the bot is complete, or should I push small changes one by one?

shawnps commented 7 years ago

(3) I think there's something similar in git-codereview, might want to compare.

@josharian just taking a note, I think I found what you're referring to here:

https://github.com/golang/review/blob/master/git-codereview/change.go#L225

josharian commented 7 years ago

@shawnps my 2c (but prefer Brad's input over mine):

minux commented 7 years ago

Another thing worth checking is global symbols defined in assembly. In fact, I think cmd/vet should warn against that too.

e.g. #18673 and #18154.

shawnps commented 7 years ago

@bradfitz @josharian I now have a main that takes a repo and change ID as flags, gets the source and runs the subject checker as well as gofmt on changed files. Should I push this somewhere, or wait until the whole thing is finished?

minux commented 7 years ago

Does it really need the repo as a flag? It should be able to get the repo from the change ID?

shawnps commented 7 years ago

@minux Good point. Looks like I can get it from Project on the ChangeInfo. Thanks

josharian commented 7 years ago

I say we get it up and running ASAP and build incrementally on it. I don't know where the source should live, though, or where it should run; that's @bradfitz's department (or his to delegate, anyway).

haya14busa commented 7 years ago

Hi! I'm interested in this issue. I wrote https://github.com/haya14busa/reviewdog which does related things in this issue. reviewdog accepts any checkers (golint, vet, errcheck, or whatever) results, parses them, filter them with diff and post them as comments to code review service. It currently only supports GitHub as code review service, but I have a plan to support gerrit as well if needed.

It already works well for GitHub Pull Request. e.g. https://github.com/haya14busa/reviewdog/pull/63#pullrequestreview-13287340 Maybe, it's not difficult to support gerrit.

I assume this issue will start small without golint or go vet check, but when you want to run some static check and post them as comments, I think I can help it. Please let me know if you are interested in it.

hirochachacha commented 7 years ago

Is there any chance to backporting same functionalities to git-codereivew in future? Although, this is a generic solution as @kevinburke mentioned:

We could have a bot that leaves comments on Gerrit for common mistakes. I'd prefer that over CLI tooling. (Not everybody uses git-codereview, especially if we start accepting Github PRs)

In general, quick feedback is better experience than slow feedback. I want to know my errors before some notifications will come, especially before pushing.

josharian commented 7 years ago

If (as I have been advocating) this tool is built as a package, then git-codereview can use it as well as the bot. But not everyone uses git-codereview, so the bot will still be helpful. And the bot can do things like triage that git-codereview can't do.

kevinburke commented 7 years ago

hi @shawnps - anything I can help with here?

bradfitz commented 7 years ago

Let's add this functionality to the new gopherbot tool: golang.org/x/build/cmd/gopherbot

shawnps commented 7 years ago

@bradfitz sounds good, so it looks like I would define methods on gopherbot?

bradfitz commented 7 years ago

Yup. Put it in a new file, like codereview.go. For now don't try to make it automatic. I have plans for that later. For now just make it a command line option, like:

$ gopherbot --review=1234

And make it review the current version of 1234.

So the method would be like:

func (b *gopherbot) reviewGerritCL(num int) error { ... }
bradfitz commented 7 years ago

(or just review.go, since codereview.go sounds like the name of the git-codereview tool)