thought-machine / please

High-performance extensible build system for reproducible multi-language builds.
https://please.build
Apache License 2.0
2.48k stars 207 forks source link

Add support for linters #1294

Open sagikazarmark opened 4 years ago

sagikazarmark commented 4 years ago

This is just a random idea for now, but adding support for linters would be a nice feature.

In a large Go project, using linters currently requires populating the "standard" module cache which basically means downloading dependencies a second time (first time being when Please downloads them).

I can imagine creating a go_lint rule that runs linters for each package separately, but should it be a test?

As @Tatskaari pointed out on Gitter, the results of a linter might differ from test results.

From my perspective, having to download dependencies a second time complicates things (build pipeline and environment) enough to make the case for using Please harder. 😕 On the other hand, I don't see how "lint support" would look like yet.

Tatskaari commented 4 years ago

I guess we need some way to populate the module cache and share that between rules. Please doesn't use a module cache at the moment does it? Maybe we can use a file system based module proxy or something to speed up downloads. Might be an easier thing to populate. Sounds a bit fiddly.

Alternatively, perhaps go lint can work without modules? I think it would be possible to get please to build up a traditional GOPATH based tree of all the sources. Perhaps it can lint based on that rather than having it download deps based on the go.mod.

sagikazarmark commented 4 years ago

I guess we need some way to populate the module cache and share that between rules

I think that would be make using modules much easier. Keep in mind though, that copying all dependencies wouldn't be an option as they could potentially be GBs.

Alternatively, perhaps go lint can work without modules

That wouldn't work with modules not available in the GOPATH. I wouldn't rely on GOPATH for the long term, Go is moving away from it.

Tatskaari commented 4 years ago

I've got some experience with linting in Arcanist. The nice thing about arc is that you can configure a large number of linters to run and restrict them to certain files or parts of the repo. No matter how complex your repo gets, you can always lint all your changes with a simple arc lint.

Please is designed for large heterogeneous monorepos so it could be argued that linting could fall into Please's domain. I'd like more than just performance as a design goal however. What do you think of the following:

  1. Unified linting interface for developers. Just like plz test, plz lint should be able to aggregate results from multiple linters and present them in a coherent report. It should also export some machine readable report that CI systems can consume to report lint errors inline in code reviews etc.
  2. This should feel comfortable within the Please paradigm. We don't want new config files with heavy configuration.
  3. Lint rules should be extensible; linting is a very subjective area. We shouldn't be providing built in linters (except maybe for BUILD files) however it should be possible to easily plug a linter into the linting system.

Unlike tests, users shouldn't have to define lint rules throughout their build graph per target. Lint rules should apply to all sources based on some basic include and exclude rules. Alternatively, we could update the built in rules to automatically generate lint targets e.g. go_library could generate a lint rule for all it's sources. I think this goes against goal #3 though. We want to be more flexible than this.

For goal 2, Perhaps plz lint could borrow some concepts from plz test though I feel the lint rules are going to want to behave differently. Like I said, we probably want to define lint rules globally. The arc linters are defined in a reasonably complex PHP interface that makes them pretty configurable though a complex interface like this doesn't feel right in Please.

We have a build system; we can define custom binary rules that abide by some linting contract. If we can, we should find some common standard for linter input and output format. Github must have some format that allows lint actions to report lines and files for lint errors/warnings. Perhaps we can borrow that.

Once we have a binary rule that abides by our format, we're going to want some way to tell Please about it:

[lint]
go = //linters:go

However we might also want to restrict linters to certain file extensions or exclude directories. Perhaps we need a linter() rule. Instead of pointing the lint directly to the binary, we can wrap it up in:

genlint (
   name = "go",
   linter = "//tools:golint",
   include = ["*.go"],
   exclude = ["//third_party/..."]
)

Please can then pick up all the lint rules and run them against the relevant sources. The plz lint CLI should resemble that of plz test e.g. plz lint //src/... --exclude //src/vendor/... --include //src/vendor/thing:srcs and please should figure out what source files that works out to be and pass them to the linter.

We can consider if the linters should have a persistent working directory within plz-out where they can store their language specific caches. I don't think please can provide much in terms of incrementality based on the design I described above.

peterebden commented 4 years ago

FYI you can also have more complex structures in the config, for example

[lint "go"]
target = //linters:go
pattern = *.go
sagikazarmark commented 4 years ago

For aggregating the results from multiple linters, I'd take a look at reviewdog: https://github.com/reviewdog/reviewdog#input-format.

Go is once again going to make our lives harder: some linters use the goddamn go list command under the hood or often need compiled versions of the software.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had any recent activity in the past 90 days. It will be closed if no further activity occurs. If you require additional support, please reply to this message. Thank you for your contributions.

peterebden commented 3 years ago

No stalebot, we're just thinking it over still.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had any recent activity in the past 90 days. It will be closed if no further activity occurs. If you require additional support, please reply to this message. Thank you for your contributions.

sagikazarmark commented 3 years ago

Bump

numbsafari commented 2 years ago

I wonder if you could combine the "tools" approach with hidden rules created by build/test rules that are tagged as a linter, and include optional args in the rule to control the linting process.

e.g.

in .plzconfig

[lint "go"]
target = //third_party/go:linter

[lint "python']
target = //third_party/python:linter

[lint "k8s"]
target = //third_party/go:k8s_lint_tool

in pkg/go/mygolib/BUILD.plz

go_library(
    ... normal stuff ..., a hidden rule is automatically exposed
)

in pkg/python/mypylib/BUILD.plz

python_library(
   ...normal stuff ...
   lint_includes = ["someextrapythonconfig.ini"],
)

python_test(
  ...normal stuff...
  lint_excludes = ["mpkg/weird_test.py"] // exclude this one test because it specifically violates our linting rules for some weird reason
)

in pkg/python/myotherpylib/BUILD.plz

python_library(
  ...normal stuff...
  linter = "//third_party/py:some_other_lint_tool" // in this case, use a different lint tool for this package for whatever reason
)

in infra/k8s/myenv/BUILD.plz

kustomization(  // custom rule that generates a k8s manifest via kustomize
  name="my_kustomization",
  srcs=glob(include=["**/*.yaml"]),
)

k8s_lint(  // explicit lint tool usage that consume the generated k8s manifest and runs some logic against it
  name="my_kustomization_lint",
  srcs=[":my_kustomization"],
)

With the above, the user interface would basically be plz lint. If I don't configure any linters, nothing happens. If I configure linters, they'll run and lint everything as you'd generally expect, without me having to define specific rules or configuration unless I'm doing something odd/different. You could also define explicit lint rules for cases where you want to consume generated outputs (like k8s manifests or, say, generated OpenAPI specs or whathaveyous).

Tatskaari commented 2 years ago

@numbsafari Yeah something like that would be pretty good, but only works for simple linters. I think that's essentially what Peter played around with here, but made it a bit more first class: https://github.com/thought-machine/please/pull/2177

More complex linters like go vet touch more than one file. Ideally we'd think up an approach that includes that too but, 1) We would need to make sure any generated files are present so the tools can make sense of our code, and 2) we'd need to consider incrementality as we're no longer linting just one file at a time.