golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.37k stars 17.71k forks source link

cmd/go: track tool dependencies in go.mod #48429

Open mtibben opened 3 years ago

mtibben commented 3 years ago

UPDATE: 2024-07-29: the latest proposal can be found here.


Background

The current best-practice to track tool dependencies for a module is to add a tools.go file to your module that includes import statements for the tools of interest. This has been extensively discussed in #25922 and is the recommended approach in the Modules FAQ

This approach works, but managing the tool dependencies still feels like a missing piece in the go mod toolchain. For example, the instructions for getting a user set up with a new project using gqlgen (a codegen tool) looks like this

# Initialise a new go module
mkdir example
cd example
go mod init example

# Add gqlgen as a tool
printf '// +build tools\npackage tools\nimport _ "github.com/99designs/gqlgen"' | gofmt > tools.go
go mod tidy

# Initialise gqlgen config and generate models
go run github.com/99designs/gqlgen init

The printf line above really stands out as an arbitrary command to "add a tool" and reflects a poor developer experience when managing tools. For example, an immediate problem is that the printf line will only work on unix systems and not windows. And what happens if tools.go already exists?

So while we have some excellent tools for managing dependencies within the go.mod file using go get and go mod edit, there is no such equivalent for managing tools in the tools.go file.

Proposed Solution

The go.mod file uses the // indirect comment to track some dependencies. An // indirect comment indicates that no package from the required module is directly imported by any package in the main module (source).

I propose that this same mechanism be used to add tool dependencies, using a // tool comment.

Users could add a tool with something like

go get -tool github.com/99designs/gqlgen@v0.14.0

or

go mod edit -require=github.com/99designs/gqlgen -tool

A go.mod would then look something like

module example

go 1.17

require (
    github.com/99designs/gqlgen v0.14.0 // tool
)

And would allow users to subsequently run the tool with go run github.com/99designs/gqlgen

This would mean a separate tools.go file would no longer be required as the tool dependency is tracked in the go.mod file.

Go modules would be "tool" aware. For example:

perj commented 1 year ago

Overall, this seems useful, the tools.go situation has always been an annoyance to me. I tend to put go run github.xom/x/y in my go:generate comments and thus rely on go.mod for the version info. I don't really mind typing go run github.com/x/y in my shell either but shortcuts are interesting, especially when working with people not that familiar with go.

The proposal so far doesn't mention any go mod edit commands to edit run lines, can I assume those will be added?

Finally, a bit of devil's advocate... currently I can create a vendor/hello/ directory and use go run hello to run it. How does this proposal address that versus the shortcuts? I'm not doing that in practice, but it's possible.

ConradIrwin commented 1 year ago

@perj Good to know that this would be useful to you! And good call on go mod edit; we should add go mod edit -run and go mod edit -droprun.

I hadn't realized about /vendor/x – if you have both (and they point to different things) the ones in go.mod should take precedence; otherwise if you have a go.mod file with run directives when you go mod vendor the behaviour of the directive could change.

mcandre commented 1 year ago

Meanwhile I have published a basic CLI tool to pin Go dev tool versions.

https://github.com/mcandre/accio

As much as I enjoy contributing developer tools, I would prefer to be able to deprecate this workaround and just use the builtin go mod system.

(Would also love to be able to ditch modvendor, and have go mod vendor stop deleting critical cgo source files, for the same reason. But that's uh off topic for this discussion )

ConradIrwin commented 1 year ago

@mcandre Thanks for sharing! Would the change proposed in https://github.com/golang/go/issues/48429#issuecomment-1415058683 give you the benefit you get from accio?

I notice it takes quite a different approach (installing specific dependency versions into the path, vs requiring go run toolname), but I'm hoping that this change would solve the same problem (having the tools you need to collaborate on a module at your fingertups) albeit in a different way.

bwplotka commented 1 year ago

Thanks for the proposal @ConradIrwin and others! Happy to see this moving forward 💪🏽 Similar to @mcandre I would love to switch to native solution. In the meantime https://github.com/bwplotka/bingo is still up to date and got quite some traction.

My 2c of the discussion so far, assuming we iterate over @ConradIrwin proposal:

  1. Having some aliasing is useful, although majority of those tools are used in Makefiles at the end, so we could do it in a separate iteration of the proposal. a. Side question: Can I track, within this proposal, multiple tools from same Go module under different version? (Example use case: e2e test that runs through multiple Go binaries of different minor versions).
  2. I like the version pinning potential and reusing go.mod semantics.
  3. If we are meant to use normal go.mod, I am worried about the dependency hell, so reusing the same MVC/dependency graph as the module. I know @bcmills mentioned it's not trivial, but I think it's must-have. IMO we don't want any Go modules importing my module to have pain of downloading and match all the dependencies for the tools that are not needed to compile my code. Furthermore, while this is probably controversial, many tools require custom replace directives (incompatibilities still) and users don't want to spent hours to craft dependencies of 10 tools to strictly work under the same deps together for no good benefit (unless I miss the benefits, other then downloading less overall?).

Also agree with majority of @leighmcculloch comment, except this one:

I think the main thing these proposals are adding is aliasing shorter names to paths.

If that's true, then we might be missing the point. To me the main problem of the issue we are trying to solve here is to be able to save and track versions of tools and its dependencies (including potential replace directives) in declarative way for the portability of the project development. I think it has to be a separate dependency graph as mentioned above.

ConradIrwin commented 1 year ago

@bwplotka I strongly agree that the main issue to solve is versioning (though I think aliases help, I'd be happy to defer to a second round too). Responding to other points:

Can I track, within this proposal, multiple tools from same Go module under different version? (Example use case: e2e test that runs through multiple Go binaries of different minor versions).

Not as proposed. Similar to how it works for go libraries, you can only have one version of a given module in your go.mod.

IMO we don't want any Go modules importing my module to have pain of downloading and match all the dependencies for the tools that are not needed to compile my code

Agreed! With this proposal if your module depends on a module that has run directives, you will not inherit the dependencies of those run directives in your go.mod (this is different to how it works if you have a tools.go in a depended-on package today).

That said, we still maintain the invariant that your go.mod can only have one version of each module; so if a tool you use depends on a module that your code also depends on, you must chose one version that works with both your code and the tool. (Similarly if multiple tools depend on the same dependency, it must resolve to exactly one version).

There are some advantages to this: you can be sure that a tool used in go:generate has the same version as any library it includes in your code, and you can control the dependencies of your tools with require and replace directives exactly as you do for your own code. Would this be sufficient to solve the problems you've experienced?

(It's maybe worth noting that this proposal does not aim to solve the problem for every tool. You call out prometheus in the bingo blog post, but as you cannot go run github.com/prometheus/prometheus/cmd/prometheus@latest, this proposal isn't directly trying to make it work. Interestingly, you can force it to work by copy-pasting replace directives, but it's probably best to follow their installation instructions).

mcandre commented 1 year ago

accio has never gained traction, too bad.

Unclear where bingo pins the version information. One of those tools that seems to discourage directly editing text file configurations.

I love Mage, and I use it to write the various development / build tasks for all my Go projects. But being a third party Go dependency, Mage itself requires pinning and a manual go install... command.

If Mage is the one and only dev dependency for a project, then it's reasonable to ask contributors, and CI/CD pipelines, and Docker images, etc. etc., to run just the one go install... command. But that means we're not using Go linter dependencies. That means we're not using Snyk CLI. That means we're not using a lot of goodies. So the list of dev dependencies is expected to grow, and we need something to automate that provisioning step. So Mage not a complete solution to our immediate problem.

Do we use sh? Require all development work to be done nested inside a Docker container? Use Ansible, God forbid?

What about writing a go-requirements.txt file with the go install... commands spelled out, one per line? It could even have chmod bits for execution in UNIX terminals. Dot slash to run on UNIX, dot backslash to run on PowerShell. Give legacy Command Prompt users the finger. Naturally, comment syntax may not necessarily be portable across all the world's different shell interpreters.

Today, the cleanest, most portable solution involves writing a (POSIX) Makefile:

.PHONY: all

all:
    go install github.com/mcandre/karp/cmd/karp@v0.0.7
    # ...

And provision the dev environment by running make.

This is perhaps the least gross workaround to go mod's lack of dev tool tracking.

Yes, make represents yet another tool, growing the technology stack. But at least (GNU/BSD) make are reasonably common companions with Go projects, especially Cgo projects. So it's not an entirely new entity.

However, makefiles are even harder to write correctly than shell scripts. There are going to be people who try to put in more than go install... or cargo install... command string literals. Any single quotes, glob asterisks, dollar signs, or file paths, will immediately break things for Go developers in Command Prompt, PowerShell, fish, BSD (t)csh, Alpine Linux, toybox/busybox, etc. etc. People will try to use (GNU) findutils in their makefiles. People will try to do way too much, when all we need is a very literal list of commands. And makefile linters are few and far between.

I'm actually planning on writing such a makefile linter to encourage extreme portability, limiting the damage.

But preferably I would prefer not to need makefiles at all, and rely solely on first party go mod to do the right thing.

ConradIrwin commented 1 year ago

@mcandre that all makes sense – and I strongly agree with your frustrations!

Do you think the proposed solution above would "do the right thing" in your mind, or are there deficiencies we should try and address? (to summarize):

This comes with a secondary benefit of not polluting your $PATH, so that different projects which rely on different versions of karp can co-exist (though unlikely to be an issue in this case, it can be for linters and tools used with go:generate).

nightlyone commented 1 year ago

Keep in mind that the proposed solution of wrapping the invocations using go run needs a bit extra work to e.g. manage protoc plugins like gen-grpc-go as you cannot control how they are invoked.

That could be solved via a wrapper program then installed at the correct location, but that would still be a cumbersome process.

On the other hand that process works very well for many interpreted languages like python with poet, JavaScript with npms npx, etc.

So maybe the Go community will adapt there as well.

mcandre commented 1 year ago

Personally, I am not a fan of invocations like "$(npm bin)" <command ...>, bundle exec <command ...> or go run <command ...>. That requires additional typing, and often messes up invocations from noninteractive contexts, such as makefiles/magefiles.

$("npm bin)" is particularly tricky for dev environments using a Command Prompt or PowerShell interpreter. No, I do love WSL but I don't want to force my users to have to depend on WSL, if I can take quick, practical steps to provide more portable solutions that don't need it, so much the better.

But bundle exec and go run aren't much better for my purposes. I want to be able to run my Go tools as completely ordinary applications.

No, I don't want to write an alias or function for every single Go tool to automatically expand into go run <command ...>. That would be laborious. Considering that a POSIX sh family interpreter is not always the Go user's default interpreter, setting up those abbreviations becomes a polyglot nightmare.

I would prefer instead that go mod place any binaries into a fixed directory like ./.go-mod/bin or similar predictable, per-project directory structure. Then it becomes easier for users to add that directory to a PATH environment variable and run the tools as <command> ... without a prefix. No, I don't know of a perfect solution that would also allow for users to invoke Go dev dependency tools from project subdirectory CWD's. Other than overwriting Go binaries at the per-user level. No, I have not tried Go workspaces yet.

If I had to choose, I'd actually rather go mod overwrite binaries in ${GOPATH}/bin at the per-user level, as go install often does, and be left dealing with the simpler problem of per-project conflicts between versions and CLI tool names. I would prefer the manual labor of re-running go mod commands to re-overwrite with the desired binaries when context switching between Go projects, than having to use a go run prefix.

mcandre commented 1 year ago

Currentlty working around the lack of pinning by writing the equivalent go install commands in a portable makefile. I'm prototyping a (Rust lol) validator for makefiles now, called unmake.

By the way, any Go tools not tracked in go mod, are likely not scanned for potential CVE's.

The sooner that go mod implements some form of basic CLI tool pinning, the sooner that we get Snyk SCA alerts, dependabot alerts, and so on, for security concerns in our Go buildtime dependencies.

As a workaround, and complement to go mod, go install could be more proactive about detecting CVE's, exiting non-zero, refusing to build vulnerable tools, and generally behaving closer to NPM vis a vis security reports.

Naturally, go can't yank vulnerable package versions. Being decentralized and all. But we could have go refuse to process them.

perj commented 1 year ago

I would prefer instead that go mod place any binaries into a fixed directory like ./.go-mod/bin or similar predictable, per-project directory structure.

I think this sounds a lot more complicated to use. At least for some larger projects where I'm likely to cd into a subdirectory, it would require setting up direnv or similar to keep track of the path to this directory.

Go run would work from any directory inside the module, presumably. There's also precedence in that unit tests and generate are run from the package directory, not the module one.

Your suggestion could be useful too, but perhaps as a second step?

ConradIrwin commented 1 year ago

One thing that we might want to bring back from the original proposal is go install tools, so that you could do GOBIN=./.go-mod/bin go install tools.

We could also change the behaviour of go install stringer (if you had run stringer => golang.org/x/tools/cmd/stringer), which might help for the protobuf case... but I'm not sure it's really something to encourage. (And it's a bit unclear whether you want to bind your version of the protoc plugin to protoc which is globally installed, or to your codebase... Somewhere between the two?)

rsc commented 1 year ago

Talked to @bcmills and @matloob. We agree that this should be a tool line and not a run shorthand, so closer to Jay's proposal than Conrad's.

What we're confident about is:

  1. Add a tool mod/ule/path line that defines a tool, with a meaning exactly like import _ "mod/ule/path" in a tools.go file.
  2. Add the -tool flag to go get to add/remove a tool line along with a require line for the tool's module.

We are less confident about whether there should be a shorthand way to run a tool. Perhaps, but perhaps not. If we were going to do that, I think we would use go tool <name>, where name is the last element of a known tool and does not conflict with pre-installed tools. For example if your go.mod said

tool golang.org/x/tools/cmd/stringer
require golang.org/x/tools v0.9.0

then perhaps go tool stringer would be equivalent to go run golang.org/x/tools/cmd/stringer@v0.9.0. Or perhaps that would be best left for a future proposal.

Or perhaps the answer is go install tools instead of that go tool stringer, but install writes to a global directory shared by all modules, and there is significant possibility of conflict. In contrast, go tool stringer could make sure to use the correct version even as you switch between modules, running it directly from the build cache.

seankhliao commented 1 year ago

Note that we previously declined running external commands under cmd/go in #48411

ConradIrwin commented 1 year ago

Great! Thank you @rsc, @bcmills and @matloob for taking the time to talk this through.

I’m happy to take on implementing that over the next few weeks.

What do you think about having go run cache binaries for tools in the current module?

This would make a shell script that did go run golang.org/x/tools/cmd/stringer faster for subsequent runs (and while not so important in this case, matters more if the tool links a larger code base).

I think it would be best to include it in this round, but we could defer it to a subsequent discussion if we want to explore exactly what should be cached and for how long in more detail.

ConradIrwin commented 1 year ago

@seankhliao i think the concerns from #48411 are more specific to the idea of go X than go tool X or go run X.

The major advantage that this mechanism has over that is that we’re not depending on the outside environment ($PATH), but the lookup is defined clearly in your go files.

I do think it would be good to make go tool X work as part of this change. Although it’s slightly less easy to type than go run X it still gives you the benefit of not having to write wrapper scripts for everything. And I like that it keeps a consistent name for a consistent functionality (versioning a tool you use to develop a go program).

The main question is handling conflicts - it is definitely safer to make sure go’s tools take precedence; and I like the simplification of not having aliases in the file. That slightly restricts the names of tools, but it’s probably ok (given that $PATH has the same uniqueness restriction). In writing I’ve convinced myself that the approach to conflict resolution outlined above is correct.

mvdan commented 1 year ago

What do you think about having go run cache binaries for tools in the current module?

For reference, that's very close to https://github.com/golang/go/issues/33468. I personally use lines like //go:generate go run pkg/to/main args..., but sometimes I admit it's just too slow. For example, with https://github.com/kyleconroy/sqlc, re-building the binary (which I assume is little more than a link step) takes about one second (time go run github.com/kyleconroy/sqlc/cmd/sqlc@v1.17.2 -h). One second isn't too bad, but it adds up towards go generate ./... chugging CPU for a while and slowing down my development.

ConradIrwin commented 1 year ago

@mvdan strongly agree! I like the idea of having go run always cache, but I noted @rsc's response on https://github.com/golang/go/issues/25416#issuecomment-401882645

To be clear, while @ianlancetaylor explained the state of the world without expressing a preference on policy, I will express a preference on policy: we don't want to start caching binaries just so that people can "go run path/to/binary" instead of installing binaries. Installing binaries is good!

I think the new observation (which may be new in the half-decade since that note!) is that installing binaries is not actually that good if you're working on multiple projects that require specific versions of tools (which is a state I find myself in when balancing work and open-source and personal side quests).

I think a reasonable first step in the right direction is to add caching for tools that are listed in go.mod (whether or not they're available as go tool X). A key insight that I think makes this both higher value and lower cost than general go run caching is that we can create a specific cache directory per module that contains each tool by name.

Once we have opt-in metrics though it would be really interesting to measure how much impact caching all instances of go run would have – I do know that at work we updated our tooling to use go build and then run the binary instead of using go run to get caching behaviour back again.

mcandre commented 1 year ago

As a workaround, I am pinning my tools using go get commands in a makefile.

make is reasonably portable, even on vintage Windows development environments. make is already commonly installed as part of a cgo development environment, and it is lightweight.

And then I lint the makefile with unmake to safeguard it.

https://github.com/mcandre/karp/blob/master/makefile

I would love to see Go tools pinned in a standard text file, similar to Python requirements-dev.txt. In fact, my projects tend to depend on Go tools + Python tools + Node.js tools, so either way, I'll still be running some go get command from a makefile or other provisioning script to tie cross-language dev dependencies together.

rsc commented 1 year ago

Enabling caching of binaries run this way using 'go tool' is fine. We still don't want to cache arbitrary 'go run'.

As far as conflicts with go tool shortname, go tool compile should always mean the Go compiler. If there is a example.net/compile tool in the go.mod file, go tool compile still means the Go compiler. For non-builtin tools, if there is more than one tool with the same final path element, say example1.net/stringer and example2.net/stringer, then go tool stringer is an error.

fzipp commented 1 year ago

As far as conflicts with go tool shortname, go tool compile should always mean the Go compiler.

Will this become a problem if someday a new builtin tool is added to the Go distribution which happens to be named "stringer"?

ConradIrwin commented 1 year ago

@fzipp It is a problem in theory, but I think it's unlikely in practice that the go team would chose to add a tool to the distribution that conflicts with a well known tool in the ecosystem.

The same theoretical problem has always existed with $PATH and in practice we all do a relatively good job of having relatively unique binary names.

ConradIrwin commented 1 year ago

I wrote up a more concrete proposal of the work that I think is needed here:

https://go.googlesource.com/proposal/+/6357993223edf8118789c2a68c85de959b0fb5ef/design/48429-go-tool-modules.md

Please leave comments here (or in Gerritt): https://go-review.googlesource.com/c/proposal/+/495555/1.

willfaught commented 1 year ago

Why not require the full path in the go tool command? Then it's unambiguous. It's no more verbose than the go run command:

$ go run github.com/kyleconroy/sqlc/cmd/sqlc@v1.17.2 -h
$ go tool github.com/kyleconroy/sqlc/cmd/sqlc -h
ConradIrwin commented 1 year ago

It seems reasonable to support the full path in addition to the shortname.

That said, one of the key things I'd like to improve with this change is making it easy to version tools and run them. Typing out the full path doesn't make it easy.

rsc commented 1 year ago

Saying the full path should be allowed for disambiguation, but it almost certainly won't be the common usage.

mvdan commented 1 year ago

I'd like to flag a minor point: right now, I personally tend to do //go:generate go run foo.com/some/module/cmd/tool@v1.2.3, as opposed to the old build-ignored tools.go approach, which was similar to this proposed design in that it would add foo.com/some/module to my module graph.

Like @ConradIrwin mentions in https://github.com/golang/go/issues/48429#issuecomment-1481565052, adding tool modules to the module graph with MVS can be good. For example, when using https://github.com/protocolbuffers/protobuf-go as both the Go library and the Go code generator, it's a good idea to use the same version with both.

However, in other cases I only want to use a tool independently, such as https://github.com/kyleconroy/sqlc, which is a code generator only. Its go.mod has quite a bit of stuff, and now it would be in my module graph as well. Not a huge concern in general, but it does add noise, and it might bump some of my direct or indirect dependency versions due to MVS - which is strictly speaking not necessary.

rogpeppe commented 1 year ago

One concern I have: since dependencies can be tricky and it's nice to be using a consistent tool across projects, I wonder if the tool line should be able to specify the exact version of the tool to use, instead of taking the dependencies from the main module.

Something like:

tool golang.org/x/tools/cmd/stringer@v1.2.3

When the version is specified, you get exactly that version of the command, evaluated in its own main module. Otherwise you get the version implied by the current main module.

Edit: this is pretty much exactly what @mvdan said above :)

rogpeppe commented 1 year ago

When the version is specified, you get exactly that version of the command, evaluated in its own main module.

This would follow the semantics of whatever go install $pkg@$version does. Specifically, if we changed go install to allow and respect replace directives, then tool $pkg@$version would do that too.

rogpeppe commented 1 year ago

One other thought: it's possible that there might be name conflicts between tools (for example we might want to use two different tools named generate from different projects). We could consider allowing a two-argument form of tool which specifies the command name of the tool.

For example:

tool foo_generate foo.com/cmd/generate@v0.1.2
tool bar_generate bar.org/cmd/generate

This would also mean that it would be possible to specify more meaningful, locally relevant, or stable names than those provided by the author of the tool.

ConradIrwin commented 1 year ago

@mvdan @rogpeppe re: being able to run a tool as its own main module.

We discussed this a bit above, and it's definitely something that people do today, and to be clear, it's not something that will change when this proposal is accepted (you will still be able to do //go:generate go run foo.com/some/module/cmd/tool@v1.2.3 and ignore the tool support).

As you mention there are some tools where it's important that the versions match; and there are other tools where it doesn't matter so much.

The primary downside to adding support for both is that it results in a subtle distinction: how should anyone decide whether they want tool x/y@z or tool x/y? (And how would we add support for go get -tool x/y@z that allows for both; would it make sense to allow go get tools still?).

There are few tools (maybe none) where it is important that they are not added to your dependency graph, so I think the right approach is to only support versioning tools and their dependencies explicitly for now. (The other thing this change makes better is you can be sure that anyone depending on your module will not inherit your tools' dependencies; so any increase in go.mod size is local).

As an aside, the other worry I have about tool x/y@z is how well build reproducibility works – the tool would presumably not be in go.sum, nor would any of the dependencies. While we have the module proxy it "should be fine", but it seems a lot less solid.

ConradIrwin commented 1 year ago

@rogpeppe re: aliases.

I like the idea of having an optional alias. It may not be a problem in practice (e.g. $PATH is a global namespace), but it seems like a potentially better solution than just disallowing conflicts.

The main downside is parsing complexity in go.mod, but it doesn't seem that complicated.

rogpeppe commented 1 year ago

@ConradIrwin

We discussed this a bit above, and it's definitely something that people do today, and to be clear, it's not something that will change when this proposal is accepted (you will still be able to do //go:generate go run foo.com/some/module/cmd/tool@v1.2.3 and ignore the tool support).

If you do that, then you won't (presumably) be able to take advantage of the support for go tool $tool sugar, which would be a bit frustrating: you follow "good practice" for a reproducible tool dependency and you pay a price for it because you now need to explicitly mention the entire path and version in every go:generate directive.

The primary downside to adding support for both is that it results in a subtle distinction: how should anyone decide whether they want tool x/y@z or tool x/y? (And how would we add support for go get -tool x/y@z that allows for both; would it make sense to allow go get tools still?).

I'd suggest that the @z form should be the default, and some other variant form (e.g. -main) would be used to explicitly bring the tool into your actual main module dependencies.

There are few tools (maybe none) where it is important that they are not added to your dependency graph

I tend to disagree. I think that keeping dependency graphs small is important, and any help we can provide in that direction is good. Also, as I said earlier, I think it's nice to be using a consistent build for a tool so different projects aren't all using slightly different variants by virtue of being included in different dependency graphs.

As an aside, the other worry I have about tool x/y@z is how well build reproducibility works – the tool would presumably not be in go.sum

Is there any particular reason why it and its dependencies couldn't be in go.sum? I can't think of one for now.

ConradIrwin commented 1 year ago

@rogpeppe re " I think that keeping dependency graphs small is important". I'm curious why this is important to you?

I think it's nice to be using a consistent build for a tool so different projects aren't all using slightly different variants by virtue of being included in different dependency graphs.

Yes, I agree with you here; it would be ideal for the tool author to have control of the versions of dependencies used (and as you say, in order to make this really possible, we'd want to use the replace directive's from the tool's go.mod – c.f. the reasoning given here: https://golangci-lint.run/usage/install/#install-from-source). There's always going to be some tension though, as the tool user will always have the ability to override if the need to.

Supporting this extra case does complicate things though, and I'm not sure the complication is worth it to pander to tool authors (who, as in the case of golangci-lint already recommend using a different mechanism).

The challenges I think are:

It may be that we want to take on the extra cost for the benefit of tool authors, and although we probably would need a separate discussion around fixing tools with replace directives, it does seem sensible to fix both around the same time.

Is there any particular reason why it and its dependencies couldn't be in go.sum?

They could be I suppose. We'd just need to teach the tools that maintain go.sum about a new set of dependencies that are not listed in go.mod the same way. (It goes back to the first of the three challenges above).

It seems like what it really comes down to is: is the added complexity in implementation and user interface worth the added benefits? I can definitely see arguments in either direction, but curious if there are other thoughts in either direction?

rsc commented 1 year ago

Got sidetracked by the release, back to this conversation now.

@ConradIrwin The doc you sent looks great. The only part I don't understand is the mention of a special error from 'go run'. I thought we had decided that 'go run' has nothing to do with tool lines.

@rogpeppe, I believe you've made two suggestions:

  1. Allow / encourage tool lines to specify the 'shortname' form for use with 'go tool shortname' form.
  2. Allow / encourage tool lines to specify pkg@version, keeping tools out of the module graph.

For (1), what is the case you are worried about? Today people 'go install' tools and somehow we get by with a single directory. It's difficult for me to believe that two different generator tools are going to pick the same name accidentally. (And we certainly shouldn't be encouraging them to do it on purpose.) In the rare, accidental case, falling back to 'go tool full/path' seems fine.

For (2), one of the simplifying assumptions of the go command, both in its internals and its behavior, is that there's only one build graph. So if you run commands like 'go mod graph' or 'govulncheck' or 'go list' or anything else, it will tell you about the one build graph. Allowing 'tool pkg@version' introduces a separate build graph for use with that specific command. The go command is not going to start dealing with multiple build graphs in a single command - that would be a very large amount of implementation complexity, far outweighing the entire benefit of having tool lines at all. So go tool pkg would have to just behave like 'go run pkg@version' and otherwise the tool lines would be ignored by other commands. Some of the concrete drawbacks of doing that include:

I'm sure there are more drawbacks I am not thinking of. I don't see what the benefits of this approach are. The only justification I see above is "dependencies can be tricky and it's nice to be using a consistent tool across projects". I would argue that dependencies being tricky is a reason to include the dependencies in the standard build graph, so that we can bring all our standard tools to bear on the trickiness. And I think using consistent packages across all parts of one project outweighs using a consistent build of a tool across different projects. If there's some log4j-esque problem I want to be able to 'go get log4j@latest' and know that my project is now completely safe, including 'go tool foo' invocations.

ConradIrwin commented 1 year ago

@rsc Thanks!

Currently the error message from go run is:

$ go run golang.org/x/tools/cmd/stringer
no required module provides package golang.org/x/tools/cmd/stringer; to add it:
    go get golang.org/x/tools/cmd/stringer

This is a little unhelpful, because when you run go mod tidy it will remove the requirement again.

The suggestion is just to improve the error message:

$ go run golang.org/x/tools/cmd/stringer
no required module provides package golang.org/x/tools/cmd/stringer; to add it:
    go get -tool golang.org/x/tools/cmd/stringer

This will add the tool line to go.mod and add it to your module in a way that will outlast any go mod tidy's.

After you have a tool line, go run golang.org/x/tools/cmd/stringer will work (because the module will be required) even though I expect go tool stringer to be more used.

rsc commented 1 year ago

@ConradIrwin, I think that error message change is not correct. If people are using "go run" then we shouldn't tell them to do something that is meant to be used with "go tool" but accidentally also makes "go run" work. Honestly I think the "to add it" note should probably be removed entirely. There are two possible ways to fix that error. The other is to add @version to the command line. Often that's what you want instead. We should stop presuming one solution.

ConradIrwin commented 1 year ago

Makes sense. I've removed that work from the proposal, and have filed an issue to track fixing that error https://github.com/golang/go/issues/60944

rsc commented 1 year ago

Have all concerns about this proposal been addressed?

mvdan commented 1 year ago

I don't see what the benefits of this approach are. The only justification I see above is "dependencies can be tricky and it's nice to be using a consistent tool across projects". I would argue that dependencies being tricky is a reason to include the dependencies in the standard build graph, so that we can bring all our standard tools to bear on the trickiness.

I tried to explain my reasons in https://github.com/golang/go/issues/48429#issuecomment-1551584506. Trying to answer your question more directly, using sqlc as an example of a generator tool:

I will once again say that folding the tool versioning into the main build graph can be good with some tools that do need the consistency in versioning, like protoc-gen-go with the protobuf libraries. However, with some others like sqlc, there's no such thing, and that's why I currently use go run github.com/kyleconroy/sqlc/cmd/sqlc@v1.18.0. It would be mildly unfortunate if using the proposed approach strongly encouraged adding all tool dependencies to go.mod regardless of this distinction.

ConradIrwin commented 1 year ago
  • It’s a generator whose dependencies are unrelated to my own dependencies.
  • It has quite a few of those dependencies; there’s little reason for me to stick them in my go.mod.
  • Some of those dependencies might bump my go.mod versions due to MVS. Again, I don’t see a reason to do that.

I think there are two specific concerns:

The reason is to give you knowledge of and control over the code that you run.

There’s an important question: do you need knowledge/control of your tools dependencies? As you imply, the vast majority of the time it does not seem to matter; and so “no” seems like a reasonable answer. However there will always be some cases where it does matter (either because of a runtime dependency, or a security firedrill like @rsc's log4j example). That implies we should give people tools to observe and control their tools dependencies.

Given that, there’s a question of: how should we enable this? Rather than inventing a new mechanism, re-using the existing mechanism is much better: it’s only one tool to learn (and only one set of config files in the repo :D).

Of the two downsides you mention, I am still not sure why “adding new dependencies” is a problem. It’s relatively cheap to have a few more lines in the main modules go.mod; and anyone who depends on your module will not inherit your tool’s dependencies.

I am more sympathetic to the “changing versions of shared dependencies” – that can sometimes make a difference (in the case that the tool or your app depends on a library that contains a change that breaks the tool or your app). In that case updating the library as part of installing a tool is going to surface a problem. It’s worth noting that the breaking change and the incompatibility is a problem that already existed, if the tool and the app both want to keep their dependencies up to date (which they probably do), the incompatibility will need fixing at some point – all this does is make you aware of it now.

You don’t have to fix any problems that arise immediately. You can continue using go install or go run instead, or use a replace to incorporate a local fix or an exclude or require to select a better version.

As an experiment, I added sqlc@1.18.0 to my companies backend repository. We previously had 201 dependencies listed in go.mod, installing this added 13 new ones, updated 1 that our app used, and 4 that other tools in our tools.go used. It didn’t cause any problems.

It's better to give people knowledge and control over the code they run, and the extra data tracked in go.mod and go.sum is a small price to pay.

thepudds commented 1 year ago

FWIW, I am simultaneously:

Part of Go's mission is for software that scales. I’m concerned this proposal (alone and as is) would likely end up unnecessarily exacerbating some challenges faced especially by medium to large Go projects. One sample writeup is in https://github.com/golang/go/issues/52296#issuecomment-1097143694 and elsewhere in that issue (which in summary describes how large projects in particular can frequently have some indirect dependencies deep in the dependency graph that forces the whole dependency chain to update to a problematic version of something, which often can be traced to the blending of dependencies that occurred via a tools.go file, which then causes unnecessary challenges, including for consumers of the large projects. Module pruning helps but does not eliminate such problems).

If this proposal is currently targeting the "blend in my tool dependencies" use case, ideally it would happen at the same time as something that makes it easier to manage the use case of "keep my tool dependencies separate".

Otherwise, I worry the thumb might be pressed too hard on the scale for "blend in my tool dependencies".

(If we slice the cases slightly differently, I suspect the "I need to blend in my tool dependencies" case is less common than the "it's fine for my tool dependencies to be separate" case. Whether an individual project blends or separates their tool dependencies is probably less heavily influenced by whether they need to blend in their dependencies, and is instead likely more heavily influenced by what's easier and what's considered a "blessed" approach. Regardless of which case is more common, I think it's likely fair to say neither case is rare, and we probably don't want to overly steer people away from either case).

Prior to approving this proposal, I think it probably would be wise to explore in greater depth if the "keep my tool dependencies separate" use case could be improved contemporaneously with this proposal (either via other proposals, or by perhaps extending this proposal).

Ideally, the two cases would be on equal footing in the tooling, or at least the same rough ball park of convenience. (Ideally they would be ~tied in terms of convenience, but if you look at things like (1) initial setup vs. (2) day-to-day use vs. (3) maintenance operations such as upgrades, it's probably reasonable if for example one use case a bit less convenient to set up if the day-to-day convenience is still close to a tie).

thepudds commented 1 year ago

I don't really have a concrete suggestion, but:

Half-baked idea 1

One older thought I had was wondering if there could be an alternative "blessed" layout for tools.go that keeps dependencies separate with some small-ish tooling tweaks to make it more convenient.

For example, if someone sets things up like so:

.
├── go.mod             // my primary module
├── mycode.go
└── tools
    ├── mage
    │   ├── go.mod     // nested module for mage (dependencies separated)
    │   ├── go.sum
    │   └── tool.go
    └── stringer
        ├── go.mod     // nested module for stringer (dependencies separated)
        ├── go.sum
        └── tool.go

then the question would be what tooling changes would make that convenient?

For example, one thought I had after seeing Tim Hockin propose the -C flag was that we could try to lean into that for this use case, with something like:

Set up (one time per tool)

$ mkdir -p ./tools/mage
$ cd ./tools/mage
$ go mod init tools            
# use editor to manually create tools.go file
$ go get github.com/magefile/mage@latest

Install (from top level of my module)

$ go install -C ./tools/... <something>         # all tools to GOBIN
$ go build -C ./tools/... -o ./bin/ <something>  # all tools to some dir

Update (from top level of my module)

$ go get -C ./tools/mage github.com/magefile/mage@latest   # update one
$ go get -C ./tools/... <something>                        # update all??

...but that's not ideal, and also not fully specified.


Half-baked idea 2

A separate thought would be to adjust this current proposal to allow an optional tools directory in the top level of the module (parallel to a vendor directory, if any), where a go tool command would know to look inside that directory for a nested module.

The layout could be similar to idea 1, but now with no tools.go:

.
├── go.mod             // my primary module
├── mycode.go
└── tools
    ├── mage
    │   ├── go.mod     // nested module for mage (dependencies separated)
    │   └── go.sum
    └── stringer
        ├── go.mod     // nested module for stringer (dependencies separated)
        └── go.sum

Set up (one time per tool)

$ mkdir -p ./tools/mage
$ cd ./tools/mage
$ go mod init tools      # not sure module name matters? maybe not 'tools'?
$ go get -tool github.com/magefile/mage@latest

Run (from anywhere within my primary module)

$ go tool mage            # knows to consult 'tools' dir

Install (from anywhere within my primary module)

$ go install tools         # output to GOBIN
$ go build -o ./bin/ tools  # output to some dir

Update (from anywhere within my primary module)

$ go get tools                                  # update all
$ go get -tool github.com/magefile/mage@latest  # update one; would also work inside ./tools/mage

I don't know if that works... but maybe someone will have a more concrete idea.

zephyrtronium commented 1 year ago

If this proposal is currently targeting the "blend in my tool dependencies" use case, ideally it would happen at the same time as something that makes it easier to manage the use case of "keep my tool dependencies separate".

It seems like go run github.com/kyleconroy/sqlc/cmd/sqlc@v1.18.0 already implements the "keep my tool dependencies separate" case. It is the "blend in my tool dependencies" case that currently ranges from awkward to hard. Maybe I'm naïve, but wouldn't both go in a go:generate line or a script anyway, to keep arguments controlled in addition to versions? If so, I don't really see the proposal as being dramatically more or less convenient than versioned go run.

jimmyfrasche commented 1 year ago

Ideally you should be able to use and manage tools the same way whether their dependencies are intermingled or separate. The only time you should have to think about the difference is when you add the tool.

bcmills commented 1 year ago

I've been thinking about how to reduce the impact of the tool dependency on downstream consumers of your module, particularly in the context of graph pruning.

The key invariant for graph pruning is this: your go.mod file needs to explicitly record versions for everything imported by your packages and tests. That way consumers of your module can test the packages they import from your module without chasing down your whole module graph.

The all pattern includes all packages transitively imported by the packages and tests in the main module, and we use that to enforce the above invariant. If a package is in all its module must be listed explicitly in the go.mod file, and otherwise (that is, if it is a test-only dependency of an outside package) it is only listed if its selected version is not already implied by the other dependencies.

Your tool dependencies are necessary not imported by any package or test in your module. Therefore, nothing will break for your users if we leave them out, the same as we do for external test-only dependencies.

So, we have a few choices we could make here:

  1. Should each module that provides the package main for a tool be listed explicitly in the go.mod file?
    • I think it should, mainly to avoid confusion about which version of the tool is to be used.
  2. Should the modules that provide packages transitively imported by tools be listed explicitly in the go.mod file?
  3. Should tools be included in the all pattern, for commands like go test all?
  4. Should the packages transitively imported by tools be included in all?

I feel strongly that we should not break the reproducibility of go test all, but I don't have a good intuition for the other tradeoffs.

ConradIrwin commented 1 year ago

@bcmills I'm sure you've got more context on this than me, but I'm sure the answer is "yes" for 1 and 2 (that way we get all the benefits of managing tool dependencies that we want, and visibility into the versions being used, etc.).

If you use the tools.go workaround correctly (i.e. it's in its own package in your module that no-one depends on, with build tags that exclude it from ever being built), then 3 & 4 will also currently be true. I think that's a strong argument for making the answer "yes" to those questions too.

It would be possible to list tools and their dependencies in go.mod but not have them be in the "all" metapackage; but I think that will likely feel inconsistent. For example, if you do go get -u all it may update dependencies shared by tools; so go test all should test your tools and their dependencies too.

ConradIrwin commented 1 year ago

@thepudds Thanks for the link to that comment, it's good to hear problems first-hand.

I think having first-class support for versioning tools and their dependencies might be worth living with for a while before we try and extend it further. The goal of this proposal was to make an easier to use alternative to the tools.go workaround.

The right solution to keeping dependencies separate might be more general: in a project with multiple build targets, how do you delay updating the dependencies of one while updating the dependencies of another? That doesn't seem like a tool-specific problem (though it may feel more annoying with tools because most of the time you don't really want to care about their dependencies).

I like the direction of your ideas of re-using the go.mod solution for versioning, but having multiple of them – it avoids complicating things for the common case, but allows people an escape hatch if they need it. I am a bit skeptical that we should try and define the exact structure and bake that into go tool yet.

If you set up the project structure as you defined above (with a go.mod/go.sum per tool), you could implement something like this.

# go.mod
tool example.com/run
// example.com/run/main.go
package main

func main() {
  tool := os.Args[1]
  os.Setenv("GOMOD", "tools/" + tool + "/go.mod")
  exec.Command("go", append([]string{"tool", tool}, os.Args[2:]...)...).Run()
}

And then you could:

$ go tool run stringer

It's still not perfect, but it's a lot better than what we have today. If lots of people start using something like example.com/run then it could make sense to bring that functionality back into go tool itself.

gopherbot commented 1 year ago

Change https://go.dev/cl/508355 mentions this issue: modfile: Add support for tool lines