Open bcmills opened 5 years ago
(CC @rsc @bradfitz @jayconrod)
I am strongly in favor of using modules for vendoring in the standard library, instead of ad-hoc copies, but I don't see the reason to let applications override standard library dependencies. Was this widely requested, or something we wished we had often?
Speaking for x/crypto, I actually don't want to have to figure out if a crypto/tls internal failure is due to a replace
d chacha20poly1305.
Also, I don't see why we'd let users override vendored dependencies but not standard library packages. The difference between golang.org/x/crypto/chacha20poly1305 and crypto/aes should be an implementation detail of crypto/tls. Then why can one be overridden, and the other not? Feels arbitrary.
cc @ianthehat
I share some of @FiloSottile's concerns. Allowing users to override std dependencies, especially crypto dependencies, is scary. But not allowing this would be against the spirit of modules.
There are some safeguards here, but I'd love to solve this without making the module resolution and package loading logic more complicated.
Possibly dumb, low-tech idea: what if we vendor these modules without using vendoring? We could run go mod vendor
, then rename the vendor
directory to something else (internal/vendor_
) and fix up the imports? That would fix #26924 and move complexity out of cmd/go
, into tools.
what if we vendor these modules without using vendoring? We could run
go mod vendor
, then rename thevendor
directory to something else (internal/vendor_
) and fix up the imports?
We have that today, in internal/x
. We could fix #26924 using the same strategy in cmd
, if we're ok with code duplication for the x
repos. (The goal of this proposal is to reduce that duplication.)
@FiloSottile
I don't see the reason to let applications override standard library dependencies. Was this widely requested, or something we wished we had often?
It has come up at Go team summits from time to time, and while we don't see many complaints about upgrading x
repos explicitly, we do see fairly frequent complaints about binary sizes (#6853 is canonical).
The two are closely related, in that one way you can end up with an overly-large binary is by pulling in a redundant copy of an x
repo from the standard library.
Per my favorite quote from C. A. R. Hoare's Hints on Programming Language Design:
[L]isten carefully to what language users say they want, until you have an understanding of what they really want. Then find some way of achieving the latter at a small fraction of the cost of the former. This is the test of success in language design, and of progress in programming methodology.
It's much easier for us to reduce code duplication in the standard library than to reduce the outputs generated from duplicated code.
Speaking for x/crypto, I actually don't want to have to figure out if a crypto/tls internal failure is due to a
replace
d chacha20poly1305.
We probably should start requesting the output of go list -m all
in the issue template, for precisely that reason. (The same problem can occur when someone reports an issue against x/net
, and that has ~nothing to do with the standard library.)
But consider the opposite direction, too: wouldn't it be nice for folks to be able to, say, test the interaction of TLS 1.3 with their dependencies written against 'net/http`, without having to also run a bleeding-edge version of the compiler and runtime (with any associated runtime bugs)?
Also, I don't see why we'd let users override vendored dependencies but not standard library packages. The difference between golang.org/x/crypto/chacha20poly1305 and crypto/aes should be an implementation detail of crypto/tls. Then why can one be overridden, and the other not? Feels arbitrary.
That seems like a great argument for making many of the standard-library packages thin forwarding shims around x/
packages. I may file that proposal, in fact, but it's separate from this one. 🙂
I'm definitely sympathetic to reducing code duplication and binary size, but I'm also concerned about complexity and special cases.
I think we should resolve #26924 using the internal/x
approach in order to unblock other module work for 1.13, and we should talk further about modularizing the standard library after 1.13. Just to confirm my understanding, converting cmd/vendor
to cmd/internal/x
would not significantly increase binary or SDK download size, right?
@bcmills @FiloSottile
We probably should start requesting the output of
go list -m all
in the issue template, for precisely that reason. (The same problem can occur when someone reports an issue againstx/net
, and that has ~nothing to do with the standard library.)
This is the same reaction I had when I saw the objection over updating the std lib versions. Make it easy and known how to list module versions and request it in issues. I also remember when Azure had to vendor a patched version of crypto/tls to access to hit their TLS endpoints in their SDK, then fork the world to make it reference the fork.
@jayconrod, in looking at the details of this, it seems to be by far the simplest way to make module-mode vendoring work for vendoring in the standard library.
go mod vendor
uses the import graph to figure out which packages to copy, so it's much simpler to have the module-provided packages in the import graph than to try to build some other analysis (and vendoring tool) specifically for the standard library.
I don't disagree with including go list -m all
in issue reports, but it still does not make me feel good about letting users swap chacha20poly1305 backends to crypto/tls. TLS already has too many joints, and joints in cryptography are where things break. I'd rather not to have to keep in my head an extra dimension of the compatibility matrix.
@bcmills I agree that this is probably the simplest way to make module-mode vendoring work in the standard library in terms of new code that needs to be written. My concern is that this makes package loading harder to reason about. In particular, this proposal means that you may get either one or two copies of some packages, depending on what version the main module requires or whether it has a replacement. We have the same risk of conflict with exported types or global state as we do today, but now the risk is more subtle.
Here's a possible alternative (with some drawbacks at the end):
std
, cmd
, and misc
, as described above.
std/
, cmd/
, or misc/
are resolved to those modules. This will require new logic, but it would replace existing GOROOT
/ GOPATH
package resolution.x/tools
, x/net
, etc.pkg/mod
directory, which would contain dependencies of the standard modules. It would be in the normal module cache format. It would act as a fallback read-only cache before going to the network.chacha20poly1305
), we would create private copies in internal/x
. I'm hoping there wouldn't be many of these, so we could still mostly avoid code duplication in binaries.
gorelease
should do.I'm realizing some drawbacks to this approach though, as I'm typing this out.
pkg/mod
would include sources for packages we don't depend on. Not sure how much this would add.x/tools
, x/net
would effectively set global minimum versions for those modules. MVS would override user requirements on older versions.
- Our dependencies on
x/tools
,x/net
would effectively set global minimum versions for those modules.
Not only that, but all of their transitive dependencies too: for example, various parts of x/net
and std
currently depend on x/text
, and x/text
has a dependency on x/tools
via golang.org/x/text/cmd/gotext
and golang.org/x/text/message/pipeline
. x/tools
has a fairly broad dependency graph (see #29981), so we would be setting minimums for a lot of arbitrary modules.
Note that everything in cmd
should be a package main
, so nothing can import it. That makes the cmd
module a bit easier to handle than std
, since it mitigates the code bloat from using a vendor
directory: we know affirmatively that no binary that links against cmd
will also link against a redundant x
repo.
misc
is similar to cmd
: it does not have a globally-addressable import path, so the only time you can build anything in misc
is if it's the main module.
@FiloSottile
TLS already has too many joints, and joints in cryptography are where things break. I'd rather not to have to keep in my head an extra dimension of the compatibility matrix.
I can certainly sympathize with that, but in that case, is there a fundamental reason for the implementation of crypto/tls
to live in the standard library at all?
That is: if it's that tightly coupled to x/crypto
, why not make it a forwarding façade over an implementation in x/crypto
? That would preserve the advantages of decoupling the release cycle and reducing code bloat, while keeping the compatibility dimensions roughly equivalent to x/crypto
today.
Change https://golang.org/cl/162989 mentions this issue: go/analysis: allow overriding V flag without code patches
To summarize so far: it seems that there are some concerns about the module-upgrade behavior, but no objections to the remainder of the proposal.
Since a lot of this work needs to happen early in the cycle, I'm going to go ahead and implement a more limited form: for now, we'll always resolve dependencies of the standard library to the packages in the vendor
directories, and we'll only treat them as part of the external module if the working directory is within the std
or cmd
module proper.
TLS already has too many joints, and joints in cryptography are where things break. I'd rather not to have to keep in my head an extra dimension of the compatibility matrix.
I can certainly sympathize with that, but in that case, is there a fundamental reason for the implementation of
crypto/tls
to live in the standard library at all?That is: if it's that tightly coupled to
x/crypto
, why not make it a forwarding façade over an implementation inx/crypto
? That would preserve the advantages of decoupling the release cycle and reducing code bloat, while keeping the compatibility dimensions roughly equivalent tox/crypto
today.
Making the implementation of crypto/tls
swappable is interesting, and I had explored a different approach in #21753. However, I'm not sure we want to move something so core to most applications to x/crypto
. For example, the security release process of x/crypto
is far, far weaker.
Maybe we could still make main security releases which just bump the x/crypto
dependency, but there's a lot of moving parts to this all, and I'd like to see it as a separate proposal.
(Also, I'm not convinced by the argument that a x/crypto
dependency that I don't want to be a joint makes crypto/tls
substantially different from any other standard library package.)
I'm not convinced by the argument that a
x/crypto
dependency that I don't want to be a joint makescrypto/tls
substantially different from any other standard library package.
Agreed, but I think the point generalizes: any package in std
that is tightly coupled to one in x
could be replaced with a forwarding shim that isolates the coupling within the module.
I have two questions about this proposal.
The standard library will ship with three go.mod files:
src/go.mod
(with module pathstd
),src/cmd/go.mod
(with module pathcmd
), andmisc/go.mod
(with module pathmisc
).
The standard library has a policy on which packages are allowed to import which others. It's codified in:
Have you considered following the L0/L1/L2/L3/(L4+everything else) tiers there to guide std lib module boundaries (e.g., having stdl0
, stdl1
, stdl2
, stdl3
, stdl4
modules rather than a single std
one)? Would there be benefits in doing so?
The
std
andcmd
modules differ somewhat from ordinary modules, in that they do not appear in the build list.
What is the motivation to make these modules differ from ordinary modules?
Change https://golang.org/cl/163207 mentions this issue: misc: add go.mod file
Have you considered following the L0/L1/L2/L3/(L4+everything else) tiers there to guide std lib module boundaries (e.g., having
stdl0
,stdl1
,stdl2
,stdl3
,stdl4
modules rather than a singlestd
one)? Would there be benefits in doing so?
That's an interesting idea, but one of the nice properties of this proposal is that the module boundaries follow the same rules as for ordinary module boundaries: std
encompasses the whole tree from src/go.mod
up to cmd/go.mod
. That allows us to use ordinary module-mode logic for things like vendoring when within that tree, with minimal tweaks to remove the std
prefix.
What is the motivation to make these modules differ from ordinary modules?
std
is tightly coupled to the compiler and runtime, so attempting to upgrade or replace it independent of the rest of the toolchain would be extremely error-prone.
Similarly, cmd
is tightly coupled to the compiler: it's not at all clear what things like go tool vet
should do if you've replaced cmd
with something else.
Before proceeding with the vendoring changes, I would feel a lot better if @rsc and @ianthehat LGTM'd this proposal, and I want to make sure @FiloSottile is satisfied. What do you think about holding a feedback session?
Change https://golang.org/cl/163519 mentions this issue: cmd/go: allow "stdout" and "stderr" as inputs to script_test "cp" command
Change https://golang.org/cl/164623 mentions this issue: all: move internal/x to vendor/golang.org/x and revendor using 'go mod vendor'
Change https://golang.org/cl/164618 mentions this issue: cmd: refresh cmd/vendor to match 'go mod vendor'
Change https://golang.org/cl/164619 mentions this issue: go/build: change the search order for "vendor/" paths based on srcDir
Change https://golang.org/cl/164622 mentions this issue: cmd,std: add go.mod files
Change https://golang.org/cl/164625 mentions this issue: go/internal/srcimporter: set -mod=vendor before running tests
Change https://golang.org/cl/164626 mentions this issue: cmd/go: quote expanded shell variables used within regular expressions
Change https://golang.org/cl/165377 mentions this issue: cmd/go/internal/modload: do not fetch modules in searchPackages if -mod=vendor is set
Change https://golang.org/cl/165378 mentions this issue: all: add -mod=vendor to GOFLAGS in tests that execute 'go' commands within std or cmd
Change https://golang.org/cl/165497 mentions this issue: cmd/api: use 'go list' to locate transitive dependencies of std
Change https://golang.org/cl/165749 mentions this issue: cpu: define cacheLineSize and doinit for unknown CPUs
We plan to make the standard-library vendor
directories work in 1.13, but will defer for a later release the decision on whether (or how) to allow user modules to upgrade them.
(That is: for 1.13, the std
-vendored modules, viewed from outside std
, will continue to be treated as packages in the standard library and not part of the corresponding external module.)
Change https://golang.org/cl/251159 mentions this issue: cmd/go/internal/modload: remove (*loader).forceStdVendor
Change https://golang.org/cl/254738 mentions this issue: cmd/api: omit outside dependencies when listing the packages in "std"
any progress for this proposal?
We've recently had a case that touches @FiloSottile's comment. Working with a broken device containing borked certificate we needed to modify crptoybtes
(https://groups.google.com/u/1/g/golang-nuts/c/wlhj5RFXh9g). Unfortunately it was a huge endeavour to apply this patch consistently throughout local, ci and docker build. We are aware that this patch is not suitable for upstream or any other purpose but it would have been convenient to have a simpler way of applying it.
Is there anything that could move this proposal into the active column of the proposals project?
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
/cc @matloob @bcmills
It sounds like people are generally OK with this. There is some question about whether there are problems with x/crypto being too tied to the main repo, but if so we should find and fix those.
Are there any objections to making this change?
Over the past few years we actually brought a lot from x/crypto into std, so these days only chacha20poly1305, hkdf, and cryptobyte are vendored into the standard library. This makes upgrading x/crypto both less troublesome and less useful.
I find it a bit weird to expose upgrading these specific packages but not anything else in the stdlib to applications. Why can they upgrade ChaCha20Poly1305 but not AES-GCM? Why can a X.509 parsing bug be fixed with a replace
if it's in cryptobyte but not if it's in the crypto/x509 caller? There is no answer because we never designed that boundary to be meaningful: vendoring a package was an implementation detail, not a visible decision. This change would expose it.
Looking at the full list of vendored packages, this doesn't feel arbitrary only for x/crypto.
# golang.org/x/crypto v0.0.0-20220722155217-630584e8d5aa
## explicit; go 1.17
golang.org/x/crypto/chacha20
golang.org/x/crypto/chacha20poly1305
golang.org/x/crypto/cryptobyte
golang.org/x/crypto/cryptobyte/asn1
golang.org/x/crypto/hkdf
golang.org/x/crypto/internal/poly1305
golang.org/x/crypto/internal/subtle
# golang.org/x/net v0.0.0-20220920203100-d0c6ba3f52d9
## explicit; go 1.17
golang.org/x/net/dns/dnsmessage
golang.org/x/net/http/httpguts
golang.org/x/net/http/httpproxy
golang.org/x/net/http2/hpack
golang.org/x/net/idna
golang.org/x/net/lif
golang.org/x/net/nettest
golang.org/x/net/route
# golang.org/x/sys v0.0.0-20220804214406-8e32c043e418
## explicit; go 1.17
golang.org/x/sys/cpu
# golang.org/x/text v0.3.8-0.20220722155301-d03b41800055
## explicit; go 1.17
golang.org/x/text/secure/bidirule
golang.org/x/text/transform
golang.org/x/text/unicode/bidi
golang.org/x/text/unicode/norm
In other words, it's nice if by luck a bug is fixable by replacing cryptobyte, but I think it would be more useful to talk about this in the broader context of a plan that involves carving out the runtime and compiler independent pieces of std
that folks might want to upgrade, like the whole crypto/tls
tree, net/http
, etc. That would consistently relieve pressure on backport choices and make it easier to adapt to unsupported cases without forking the whole toolchain, which feels like the actual need here.
- If the version in the build list is equal to than the one required by
std
orcmd
, that version is notreplace
d in the main module, and the package exists in the correspondingvendor
directory, then we will load the package fromvendor
orcmd/vendor
with its original import path.
Out of curiosity, shouldn't we be able to deduplicate code and get smaller binaries in this case as well? The benefits are only mentioned in the third case (build list has a newer version), but not in this second case (build list has the same version).
@FiloSottile, I agree that this proposal would be more useful if more of the standard library could be upgraded.
The process seems straightforward enough: for a given package in std
, define a corresponding package in an x/
repo, and turn the std
package into a thin forwarding shim around the x/
package. Then, upgrading the x/
package automatically upgrades the std
package as well. We would just need a consensus as to which std
libraries should become wrappers, and perhaps extend the main-repo API checks to also run for the relevant x/
packages.
To be clear, this change is not really being made just for x/crypto. It's a step toward a more rational relationship between std and x. We'll see what the next step is after we take this one.
Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group
No module outside GOROOT/src may declare its own module path to begin with std or cmd
Searching for "module std" filename:go.mod
on Github yields 339 results, for "module cmd" filename:go.mod
there are 485 results.
If I understand correctly, those would break.
If we want to avoid breaking them, we could use for example golang.org/std
and golang.org/cmd
as the module names instead, as those do not seem to be used now.
std
and cmd
are already special names, as documented at https://pkg.go.dev/cmd/go#hdr-Package_lists_and_patterns. I'm a bit surprised that it works to use those as module names.
This proposal is both a fix for #26924, and a means for users to explicitly upgrade the
golang.org/x
dependencies of the standard library (such asx/crypto
andx/net
).Proposal
The standard library will ship with three go.mod files:
src/go.mod
(with module pathstd
),src/cmd/go.mod
(with module pathcmd
), andmisc/go.mod
(with module pathmisc
).The
std
andcmd
modules will havevendor
directories, containingmodules.txt
files in the usual format and managed using the standard Go 1.13 vendoring semantics (hopefully #30240).The
std
andcmd
modules differ somewhat from ordinary modules, in that they do not appear in the build list.GOROOT/src
may declare its own module path to begin withstd
orcmd
.require
,replace
, orexclude
any module whose path begins withstd
orcmd
. (Their versions are fixed to the Go release in use.)For files within
GOROOT/src
only, when we resolve an import of a package provided by an external module, we will first check the build list for that module.vendor
orcmd/vendor
.std
orcmd
, that version is notreplace
d in the main module, and the package exists in the correspondingvendor
directory, then we will load the package fromvendor
orcmd/vendor
with its original import path.std
orcmd
itself (that is, if the user is working withinGOROOT/src
).GOROOT/src/vendor
andGOROOT/src/cmd/vendor
will remain pristine and unmodified.std
orcmd
, or if the version is equal and the module isreplace
d, then we will load the package from the module cache in the usual way.