golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.02k stars 17.68k forks source link

cmd/go: standard-library vendoring in module mode #30241

Open bcmills opened 5 years ago

bcmills commented 5 years ago

This proposal is both a fix for #26924, and a means for users to explicitly upgrade the golang.org/x dependencies of the standard library (such as x/crypto and x/net).

Proposal

The standard library will ship with three go.mod files: src/go.mod (with module path std), src/cmd/go.mod (with module path cmd), and misc/go.mod (with module path misc).

The std and cmd modules will have vendor directories, containing modules.txt files in the usual format and managed using the standard Go 1.13 vendoring semantics (hopefully #30240).

The std and cmd modules differ somewhat from ordinary modules, in that they do not appear in the build list.

For files within GOROOT/src only, when we resolve an import of a package provided by an external module, we will first check the build list for that module.

bcmills commented 5 years ago

(CC @rsc @bradfitz @jayconrod)

FiloSottile commented 5 years ago

I am strongly in favor of using modules for vendoring in the standard library, instead of ad-hoc copies, but I don't see the reason to let applications override standard library dependencies. Was this widely requested, or something we wished we had often?

Speaking for x/crypto, I actually don't want to have to figure out if a crypto/tls internal failure is due to a replaced chacha20poly1305.

Also, I don't see why we'd let users override vendored dependencies but not standard library packages. The difference between golang.org/x/crypto/chacha20poly1305 and crypto/aes should be an implementation detail of crypto/tls. Then why can one be overridden, and the other not? Feels arbitrary.

jayconrod commented 5 years ago

cc @ianthehat

I share some of @FiloSottile's concerns. Allowing users to override std dependencies, especially crypto dependencies, is scary. But not allowing this would be against the spirit of modules.

There are some safeguards here, but I'd love to solve this without making the module resolution and package loading logic more complicated.

Possibly dumb, low-tech idea: what if we vendor these modules without using vendoring? We could run go mod vendor, then rename the vendor directory to something else (internal/vendor_) and fix up the imports? That would fix #26924 and move complexity out of cmd/go, into tools.

bcmills commented 5 years ago

what if we vendor these modules without using vendoring? We could run go mod vendor, then rename the vendor directory to something else (internal/vendor_) and fix up the imports?

We have that today, in internal/x. We could fix #26924 using the same strategy in cmd, if we're ok with code duplication for the x repos. (The goal of this proposal is to reduce that duplication.)

bcmills commented 5 years ago

@FiloSottile

I don't see the reason to let applications override standard library dependencies. Was this widely requested, or something we wished we had often?

It has come up at Go team summits from time to time, and while we don't see many complaints about upgrading x repos explicitly, we do see fairly frequent complaints about binary sizes (#6853 is canonical).

The two are closely related, in that one way you can end up with an overly-large binary is by pulling in a redundant copy of an x repo from the standard library.

Per my favorite quote from C. A. R. Hoare's Hints on Programming Language Design:

[L]isten carefully to what language users say they want, until you have an understanding of what they really want. Then find some way of achieving the latter at a small fraction of the cost of the former. This is the test of success in language design, and of progress in programming methodology.

It's much easier for us to reduce code duplication in the standard library than to reduce the outputs generated from duplicated code.

bcmills commented 5 years ago

Speaking for x/crypto, I actually don't want to have to figure out if a crypto/tls internal failure is due to a replaced chacha20poly1305.

We probably should start requesting the output of go list -m all in the issue template, for precisely that reason. (The same problem can occur when someone reports an issue against x/net, and that has ~nothing to do with the standard library.)

But consider the opposite direction, too: wouldn't it be nice for folks to be able to, say, test the interaction of TLS 1.3 with their dependencies written against 'net/http`, without having to also run a bleeding-edge version of the compiler and runtime (with any associated runtime bugs)?

bcmills commented 5 years ago

Also, I don't see why we'd let users override vendored dependencies but not standard library packages. The difference between golang.org/x/crypto/chacha20poly1305 and crypto/aes should be an implementation detail of crypto/tls. Then why can one be overridden, and the other not? Feels arbitrary.

That seems like a great argument for making many of the standard-library packages thin forwarding shims around x/ packages. I may file that proposal, in fact, but it's separate from this one. 🙂

jayconrod commented 5 years ago

I'm definitely sympathetic to reducing code duplication and binary size, but I'm also concerned about complexity and special cases.

I think we should resolve #26924 using the internal/x approach in order to unblock other module work for 1.13, and we should talk further about modularizing the standard library after 1.13. Just to confirm my understanding, converting cmd/vendor to cmd/internal/x would not significantly increase binary or SDK download size, right?

kardianos commented 5 years ago

@bcmills @FiloSottile

We probably should start requesting the output of go list -m all in the issue template, for precisely that reason. (The same problem can occur when someone reports an issue against x/net, and that has ~nothing to do with the standard library.)

This is the same reaction I had when I saw the objection over updating the std lib versions. Make it easy and known how to list module versions and request it in issues. I also remember when Azure had to vendor a patched version of crypto/tls to access to hit their TLS endpoints in their SDK, then fork the world to make it reference the fork.

bcmills commented 5 years ago

@jayconrod, in looking at the details of this, it seems to be by far the simplest way to make module-mode vendoring work for vendoring in the standard library.

go mod vendor uses the import graph to figure out which packages to copy, so it's much simpler to have the module-provided packages in the import graph than to try to build some other analysis (and vendoring tool) specifically for the standard library.

FiloSottile commented 5 years ago

I don't disagree with including go list -m all in issue reports, but it still does not make me feel good about letting users swap chacha20poly1305 backends to crypto/tls. TLS already has too many joints, and joints in cryptography are where things break. I'd rather not to have to keep in my head an extra dimension of the compatibility matrix.

jayconrod commented 5 years ago

@bcmills I agree that this is probably the simplest way to make module-mode vendoring work in the standard library in terms of new code that needs to be written. My concern is that this makes package loading harder to reason about. In particular, this proposal means that you may get either one or two copies of some packages, depending on what version the main module requires or whether it has a replacement. We have the same risk of conflict with exported types or global state as we do today, but now the risk is more subtle.

Here's a possible alternative (with some drawbacks at the end):

I'm realizing some drawbacks to this approach though, as I'm typing this out.

bcmills commented 5 years ago
  • Our dependencies on x/tools, x/net would effectively set global minimum versions for those modules.

Not only that, but all of their transitive dependencies too: for example, various parts of x/net and std currently depend on x/text, and x/text has a dependency on x/tools via golang.org/x/text/cmd/gotext and golang.org/x/text/message/pipeline. x/tools has a fairly broad dependency graph (see #29981), so we would be setting minimums for a lot of arbitrary modules.

Note that everything in cmd should be a package main, so nothing can import it. That makes the cmd module a bit easier to handle than std, since it mitigates the code bloat from using a vendor directory: we know affirmatively that no binary that links against cmd will also link against a redundant x repo.

misc is similar to cmd: it does not have a globally-addressable import path, so the only time you can build anything in misc is if it's the main module.

bcmills commented 5 years ago

@FiloSottile

TLS already has too many joints, and joints in cryptography are where things break. I'd rather not to have to keep in my head an extra dimension of the compatibility matrix.

I can certainly sympathize with that, but in that case, is there a fundamental reason for the implementation of crypto/tls to live in the standard library at all?

That is: if it's that tightly coupled to x/crypto, why not make it a forwarding façade over an implementation in x/crypto? That would preserve the advantages of decoupling the release cycle and reducing code bloat, while keeping the compatibility dimensions roughly equivalent to x/crypto today.

gopherbot commented 5 years ago

Change https://golang.org/cl/162989 mentions this issue: go/analysis: allow overriding V flag without code patches

bcmills commented 5 years ago

To summarize so far: it seems that there are some concerns about the module-upgrade behavior, but no objections to the remainder of the proposal.

Since a lot of this work needs to happen early in the cycle, I'm going to go ahead and implement a more limited form: for now, we'll always resolve dependencies of the standard library to the packages in the vendor directories, and we'll only treat them as part of the external module if the working directory is within the std or cmd module proper.

FiloSottile commented 5 years ago

TLS already has too many joints, and joints in cryptography are where things break. I'd rather not to have to keep in my head an extra dimension of the compatibility matrix.

I can certainly sympathize with that, but in that case, is there a fundamental reason for the implementation of crypto/tls to live in the standard library at all?

That is: if it's that tightly coupled to x/crypto, why not make it a forwarding façade over an implementation in x/crypto? That would preserve the advantages of decoupling the release cycle and reducing code bloat, while keeping the compatibility dimensions roughly equivalent to x/crypto today.

Making the implementation of crypto/tls swappable is interesting, and I had explored a different approach in #21753. However, I'm not sure we want to move something so core to most applications to x/crypto. For example, the security release process of x/crypto is far, far weaker.

Maybe we could still make main security releases which just bump the x/crypto dependency, but there's a lot of moving parts to this all, and I'd like to see it as a separate proposal.

(Also, I'm not convinced by the argument that a x/crypto dependency that I don't want to be a joint makes crypto/tls substantially different from any other standard library package.)

bcmills commented 5 years ago

I'm not convinced by the argument that a x/crypto dependency that I don't want to be a joint makes crypto/tls substantially different from any other standard library package.

Agreed, but I think the point generalizes: any package in std that is tightly coupled to one in x could be replaced with a forwarding shim that isolates the coupling within the module.

dmitshur commented 5 years ago

I have two questions about this proposal.

The standard library will ship with three go.mod files: src/go.mod (with module path std), src/cmd/go.mod (with module path cmd), and misc/go.mod (with module path misc).

The standard library has a policy on which packages are allowed to import which others. It's codified in:

https://github.com/golang/go/blob/889aa5eb98ff2a5134f5b075525479da399e5c4f/src/go/build/deps_test.go#L23-L24

Have you considered following the L0/L1/L2/L3/(L4+everything else) tiers there to guide std lib module boundaries (e.g., having stdl0, stdl1, stdl2, stdl3, stdl4 modules rather than a single std one)? Would there be benefits in doing so?

The std and cmd modules differ somewhat from ordinary modules, in that they do not appear in the build list.

What is the motivation to make these modules differ from ordinary modules?

gopherbot commented 5 years ago

Change https://golang.org/cl/163207 mentions this issue: misc: add go.mod file

bcmills commented 5 years ago

Have you considered following the L0/L1/L2/L3/(L4+everything else) tiers there to guide std lib module boundaries (e.g., having stdl0, stdl1, stdl2, stdl3, stdl4 modules rather than a single std one)? Would there be benefits in doing so?

That's an interesting idea, but one of the nice properties of this proposal is that the module boundaries follow the same rules as for ordinary module boundaries: std encompasses the whole tree from src/go.mod up to cmd/go.mod. That allows us to use ordinary module-mode logic for things like vendoring when within that tree, with minimal tweaks to remove the std prefix.

bcmills commented 5 years ago

What is the motivation to make these modules differ from ordinary modules?

std is tightly coupled to the compiler and runtime, so attempting to upgrade or replace it independent of the rest of the toolchain would be extremely error-prone.

Similarly, cmd is tightly coupled to the compiler: it's not at all clear what things like go tool vet should do if you've replaced cmd with something else.

jayconrod commented 5 years ago

Before proceeding with the vendoring changes, I would feel a lot better if @rsc and @ianthehat LGTM'd this proposal, and I want to make sure @FiloSottile is satisfied. What do you think about holding a feedback session?

gopherbot commented 5 years ago

Change https://golang.org/cl/163519 mentions this issue: cmd/go: allow "stdout" and "stderr" as inputs to script_test "cp" command

gopherbot commented 5 years ago

Change https://golang.org/cl/164623 mentions this issue: all: move internal/x to vendor/golang.org/x and revendor using 'go mod vendor'

gopherbot commented 5 years ago

Change https://golang.org/cl/164618 mentions this issue: cmd: refresh cmd/vendor to match 'go mod vendor'

gopherbot commented 5 years ago

Change https://golang.org/cl/164619 mentions this issue: go/build: change the search order for "vendor/" paths based on srcDir

gopherbot commented 5 years ago

Change https://golang.org/cl/164622 mentions this issue: cmd,std: add go.mod files

gopherbot commented 5 years ago

Change https://golang.org/cl/164625 mentions this issue: go/internal/srcimporter: set -mod=vendor before running tests

gopherbot commented 5 years ago

Change https://golang.org/cl/164626 mentions this issue: cmd/go: quote expanded shell variables used within regular expressions

gopherbot commented 5 years ago

Change https://golang.org/cl/165377 mentions this issue: cmd/go/internal/modload: do not fetch modules in searchPackages if -mod=vendor is set

gopherbot commented 5 years ago

Change https://golang.org/cl/165378 mentions this issue: all: add -mod=vendor to GOFLAGS in tests that execute 'go' commands within std or cmd

gopherbot commented 5 years ago

Change https://golang.org/cl/165497 mentions this issue: cmd/api: use 'go list' to locate transitive dependencies of std

gopherbot commented 5 years ago

Change https://golang.org/cl/165749 mentions this issue: cpu: define cacheLineSize and doinit for unknown CPUs

bcmills commented 5 years ago

We plan to make the standard-library vendor directories work in 1.13, but will defer for a later release the decision on whether (or how) to allow user modules to upgrade them.

(That is: for 1.13, the std-vendored modules, viewed from outside std, will continue to be treated as packages in the standard library and not part of the corresponding external module.)

gopherbot commented 4 years ago

Change https://golang.org/cl/251159 mentions this issue: cmd/go/internal/modload: remove (*loader).forceStdVendor

gopherbot commented 4 years ago

Change https://golang.org/cl/254738 mentions this issue: cmd/api: omit outside dependencies when listing the packages in "std"

hunjixin commented 2 years ago

any progress for this proposal?

andig commented 2 years ago

We've recently had a case that touches @FiloSottile's comment. Working with a broken device containing borked certificate we needed to modify crptoybtes (https://groups.google.com/u/1/g/golang-nuts/c/wlhj5RFXh9g). Unfortunately it was a huge endeavour to apply this patch consistently throughout local, ci and docker build. We are aware that this patch is not suitable for upstream or any other purpose but it would have been convenient to have a simpler way of applying it.

Is there anything that could move this proposal into the active column of the proposals project?

rsc commented 2 years ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc commented 2 years ago

/cc @matloob @bcmills

rsc commented 2 years ago

It sounds like people are generally OK with this. There is some question about whether there are problems with x/crypto being too tied to the main repo, but if so we should find and fix those.

rsc commented 2 years ago

Are there any objections to making this change?

FiloSottile commented 2 years ago

Over the past few years we actually brought a lot from x/crypto into std, so these days only chacha20poly1305, hkdf, and cryptobyte are vendored into the standard library. This makes upgrading x/crypto both less troublesome and less useful.

I find it a bit weird to expose upgrading these specific packages but not anything else in the stdlib to applications. Why can they upgrade ChaCha20Poly1305 but not AES-GCM? Why can a X.509 parsing bug be fixed with a replace if it's in cryptobyte but not if it's in the crypto/x509 caller? There is no answer because we never designed that boundary to be meaningful: vendoring a package was an implementation detail, not a visible decision. This change would expose it.

Looking at the full list of vendored packages, this doesn't feel arbitrary only for x/crypto.

# golang.org/x/crypto v0.0.0-20220722155217-630584e8d5aa
## explicit; go 1.17
golang.org/x/crypto/chacha20
golang.org/x/crypto/chacha20poly1305
golang.org/x/crypto/cryptobyte
golang.org/x/crypto/cryptobyte/asn1
golang.org/x/crypto/hkdf
golang.org/x/crypto/internal/poly1305
golang.org/x/crypto/internal/subtle
# golang.org/x/net v0.0.0-20220920203100-d0c6ba3f52d9
## explicit; go 1.17
golang.org/x/net/dns/dnsmessage
golang.org/x/net/http/httpguts
golang.org/x/net/http/httpproxy
golang.org/x/net/http2/hpack
golang.org/x/net/idna
golang.org/x/net/lif
golang.org/x/net/nettest
golang.org/x/net/route
# golang.org/x/sys v0.0.0-20220804214406-8e32c043e418
## explicit; go 1.17
golang.org/x/sys/cpu
# golang.org/x/text v0.3.8-0.20220722155301-d03b41800055
## explicit; go 1.17
golang.org/x/text/secure/bidirule
golang.org/x/text/transform
golang.org/x/text/unicode/bidi
golang.org/x/text/unicode/norm

In other words, it's nice if by luck a bug is fixable by replacing cryptobyte, but I think it would be more useful to talk about this in the broader context of a plan that involves carving out the runtime and compiler independent pieces of std that folks might want to upgrade, like the whole crypto/tls tree, net/http, etc. That would consistently relieve pressure on backport choices and make it easier to adapt to unsupported cases without forking the whole toolchain, which feels like the actual need here.

mvdan commented 2 years ago
  • If the version in the build list is equal to than the one required by std or cmd, that version is not replaced in the main module, and the package exists in the corresponding vendor directory, then we will load the package from vendor or cmd/vendor with its original import path.

Out of curiosity, shouldn't we be able to deduplicate code and get smaller binaries in this case as well? The benefits are only mentioned in the third case (build list has a newer version), but not in this second case (build list has the same version).

bcmills commented 2 years ago

@FiloSottile, I agree that this proposal would be more useful if more of the standard library could be upgraded.

The process seems straightforward enough: for a given package in std, define a corresponding package in an x/ repo, and turn the std package into a thin forwarding shim around the x/ package. Then, upgrading the x/ package automatically upgrades the std package as well. We would just need a consensus as to which std libraries should become wrappers, and perhaps extend the main-repo API checks to also run for the relevant x/ packages.

rsc commented 2 years ago

To be clear, this change is not really being made just for x/crypto. It's a step toward a more rational relationship between std and x. We'll see what the next step is after we take this one.

rsc commented 2 years ago

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

martin-sucha commented 2 years ago

No module outside GOROOT/src may declare its own module path to begin with std or cmd

Searching for "module std" filename:go.mod on Github yields 339 results, for "module cmd" filename:go.mod there are 485 results.

If I understand correctly, those would break.

If we want to avoid breaking them, we could use for example golang.org/std and golang.org/cmd as the module names instead, as those do not seem to be used now.

ianlancetaylor commented 2 years ago

std and cmd are already special names, as documented at https://pkg.go.dev/cmd/go#hdr-Package_lists_and_patterns. I'm a bit surprised that it works to use those as module names.