Open bcmills opened 5 years ago
My questions are more about which API's do we want to individually version, and how are we going to lay that out. For instance, the go/packages API is far more likely to need a v2 than some of the other ones, whereas the go/buildutil needs to be deprecated. Are we just going to do major versions with directories, which means a checkout has every major version of every package in it, for ever more? Or should we be separating packages into their own modules so we can version them and then prune them. There is a lot of code in there that I would like to version stamp and then delete so we don't have to worry about it but we don't break people that are already using it.
Therefore, I propose that we split 'x/tools' into two modules
That seems quite a bit more coarse than I was imagining, not that I've written anything down.
There is a lot of code in there that I would like to version stamp and then delete so we don't have to worry about it but we don't break people that are already using it.
Pruning is the interesting part, to be sure.
Pruning is (necessarily) a breaking change, which means a new module path. So we can't prune anything stable out of the master
branch until all of the supported Go releases also support modules. (We could prune stuff out on a v2
branch and leave the master
branch on v1
, but that seems more awkward than just retaining some packages we would rather not have around.)
For instance, the go/packages API is far more likely to need a v2 than some of the other ones,
If we need a v2
for one or two packages, I think we can do those packages individually. It should be fine to have a /v2
package inside a v1
module: the repo tag applies to the module, but module paths an package paths don't necessarily have to be coupled. (That is, golang.org/x/tools/go/packages/v2
is just a package inside golang/x/tools
, whereas golang.org/x/tools/v2/go/packages
is presumably a package in the golang.org/x/tools/v2
module.)
The goforward
tool I'm prototyping in https://golang.org/cl/137076 is intended to support exactly that sort of API versioning: if we can write the v1
API in terms of the v2
API, then there is basically no cost to keeping v1
around (because it contains only trivial code).
That seems quite a bit more coarse than I was imagining
It's more coarse than I was expecting too, but there isn't really a narrower split that does much good.
cmd/
is too heterogeneous to split out, but module-per-command seems too tedious to version separately.
Similarly, go/
is too heterogeneous, but splitting out subdirectories of go/
provides no clear benefit.
We could possibly drop the google.golang.org/appengine
dependency by splitting out a bunch of individual tools (godoc
, playground
, blog
, etc.), but that only prunes out the one module (it induces no transitive dependencies) and does so at the cost of adding a bunch of individual modules that would be a pain to wire back together.
Also note that we can adjust module boundaries to some extent in the future, subject to the caveat that the module containing any existing package path should require
the appropriate version of the new module that contains that path (so that we don't end up with two otherwise-compatible modules trying to provide the same package).
I am still compiling my thought about how to split this repo - but bummer is that my original motivation on adopting modules in this repo was to tidy up the master branch and freeze the old, unsupported packages as the form of v1 in the separate branch. But as @bcmills pointed out, this is a breaking change and can't be done until all supported go tools can recognize modules. On the other hand, having sources in /v2 directory is not helping for me whose driving motivation is tidyup.
cmd
hosts binaries only and changing the go get path is pretty undesirable, so except dependency management purpose, I think it's better to keep them in v1 as long as possible.
Most top-level directories can be possibly in a separate module assuming their life cycles are independent from each other. The internal
directories also complicate splitting the repo to finer grained modules.
The
internal
directories also complicate splitting the repo to finer grained modules.
FWIW, internal
directories should mostly not be a problem: packages in one module are allowed to import internal
packages from another as long as they have the same path prefix (up to the internal/
path component).
Most top-level directories can be possibly in a separate module assuming their life cycles are independent from each other.
That's true, but bear in mind that any subpackage that depends on golang.org/x/tools
itself will need a replace
directive during editing, and those directives make it much more likely that we will accidentally introduce version skew.
Until we have some mechanism in place to detect and/or avoid that version skew (possibly #26420; possibly #27542), I would prefer to keep the subdirectories in the same module. There doesn't seem to be a compelling reason to break them out yet, and we can always do that later if we discover one.
Note that golang.org/x/tools
is (de facto) currently a single module, since any go.mod
entry pointing to a commit from before we add a go.mod
file (whenever that occurs) provides all of the packages in the repository as it stands today.
We can't move any of those packages to other modules without either adding a cyclic module requirement (which is fine, but makes any dependency pruning kind of moot) or introducing an ambiguous import path (which will break builds).
The top-most internal/
directory is in the version 1(or 0) (yes, since we are currently in a single module), if a v2 of submodule dependes on the internal/... in the future, that creates dependency of v2 on the v1 module, so prevents migrating the v1 to use any of v2 for implementation.
if a v2 of submodule [depends] on the internal/... in the future, that creates dependency of v2 on the v1 module, so prevents migrating the v1 to use any of v2 for implementation.
If everything is working as intended, v1
and v2
should be able to have a cyclic dependency: v1
can import v2
for implementation, and golang.org/x/tools/v2
can implement golang.org/x/tools/internal
from v1
. The versions you actually end up with are the fix-point of the cycle.
The only hard constraint is that the same import path can't appear in more than one module in the cycle, but since v1
and v2
have separate import paths that shouldn't be an issue.
A further thought. We might want separate submodules for parts of the tree that we are ready to declare “stable” (v1) instead of “unstable” (v0).
But if we're ready to commit to API stability for anything in x/tools
, it probably belongs in the standard library or toolchain.
Are there any packages within x/tools
that are stable and not candidates for the standard toolchain?
What do you mean by toolchain or standard toolchain?
I mean the tools accessible via go tool
. (If there's a better term for that, please let me know.)
Ah, gotcha. When I hear "toolchain" I typically think assembler+compiler+linker.
Note that things accessible by go tool
are not necessarily in the "go" repo. We blend the godoc binary into releases by x/build/cmd/release, so in theory other things could live elsewhere (like in x/tools) but be shipped as a go tool. IIRC that's how vet used to work.
Come to think of it, I don't think we want (or need) a replace
directive in tip
either, and I'm not even sure we want a separate module for it.
Ideally we want go test golang.org/x/tools/cmd/tip
to produce the same results regardless of whether it is run from within x/tools
or a fresh outside module.
Change https://golang.org/cl/137815 mentions this issue: all: set GO111MODULE=off for tests that use GOPATHs in testdata.
Adding a few thoughts/questions.
The most extensive dependencies in x/tools come from x/tools/cmd/tip (via a dependency on golang.org/x/build/autocertcache), and are only needed when the autocert build tag is set.... Therefore, I propose that we split 'x/tools' into two modules: one at the repo root, and one at cmd/tip
The fact that cmd/tip
has lots of dependencies is somewhat secondary to my mind. What's more important is that it's a main
package and a standalone main
package at that (i.e. it's not a tool that is used as part of library L and hence only makes sense to version along with L). Creating modules boundaries at main
packages seems like a sensible and risk free thing to do, not least because it has the nice side effect of colocating dependencies like in the case of cmd/tip
, but also because it gives editors the chance to start depending on actual versions (e.g. the case of goimports
we brought up in on the last golang-tools call).
but module-per-command seems too tedious to version separately.
Isn't our decision in this regard more a function of what people depending on these commands need/expect? For example, if all of the commands in cmd/...
sit in the same module, we can't make breaking changes in any command without moving forward the major version for all commands, which doesn't feel right, particularly if the commands are largely unrelated (which would suggest they should be separate modules in any case).
We could postpone a decision here until such a time as any given command needs to make a breaking change, but at that point I think the creation of a submodule represents a breaking change in the root module does it not (it's equivalent to the removal of a package)?
In any case, tooling can surely help here. Because that same tooling would surely be useful to consumers of any modules in x/tools
.
Note that golang.org/x/tools is (de facto) currently a single module
I don't think there can be any argument against adding a go.mod
at the root of x/tools
today if we don't do any tagging of versions and stick in v0 pseudo version territory. But equally I don't see any huge value because I think you would need to remain at v0 pseudo versions forever (because of the submodule introduction problem mentioned above).
My questions are more about which API's do we want to individually version, and how are we going to lay that out
Regardless of whether we add a root go.mod
or not, can we not just start creating modules where the boundaries feel right? Where "boundaries" is a function of packages being related and an appropriate versioning unit, i.e. a breaking change in that unit would necessarily be separate from the packages not in the module and vice versa.
A further thought. We might want separate submodules for parts of the tree that we are ready to declare “stable” (v1) instead of “unstable” (v0).
This doesn't feel like a good reason to introduce a module boundary to my mind. Because when something becomes "stable" we're not really going to move it, are we? Instead, I'd advocate creating modules out of sets of stable, related packages.
Ideally we want go test golang.org/x/tools/cmd/tip to produce the same results regardless of whether it is run from within x/tools or a fresh outside module.
I'm not quite sure what you mean here?
I think it's better to keep them in v1 as long as possible.
The discovery of module versions >= 2
is definitely an issue. Maybe this starts to get "solved" by things like godoc.org growing support for modules. I would definitely favour avoiding breaking changes where possible, but sometimes they are unavoidable.
if all of the commands in
cmd/...
sit in the same module, we can't make breaking changes in any command without moving forward the major version for all commands, which doesn't feel right, particularly if the commands are largely unrelated
Recall that each major version needs a new import path: we can define a new module for golang.org/x/tools/cmd/godoc/v2
even if v1
remains in golang.org/x/tools
.
I think the distinction between v0
and v1
is going to cause us more trouble than a breaking change from v1
to v2
. A v1
module must not import any package from a v0
one: the v1
packages are intended to be stable over time, but the v0
packages are free to break at any time. (However, it's ok for a v1
module to require
a v0
one as long as it doesn't import any packages. That will be important later.)
So perhaps the way to proceed is to first decide which packages are stable, then break the repo into modules based on the stable/unstable split.
We could postpone a decision here until such a time as any given command needs to make a breaking change, but at that point I think the creation of a submodule represents a breaking change in the root module does it not (it's equivalent to the removal of a package)?
Creating a submodule is not a breaking change as long as the module and submodule have a cyclic requirement. (This is illustrated in the mod_get_moved
test case.)
The procedure is:
cmd/godoc
to module golang.org/x/tools/cmd/godoc
../cmd/godoc/go.mod
, add require golang.org/x/tools v0.N.0
.
require golang.org/x/tools/cmd/godoc
won't have a duplicate definition for package 'golang.org/x/tools/cmd/godoc`../go.mod
, add require golang.org/x/tools/cmd/godoc v1.0.0
require golang.org/x/tools v0.M.0
(for M
≤N
), running go get -u
on that module doesn't drop any packages.v0.N.0
and cmd/godoc/v1.0.0
.Testing that commit is a very hard problem, but actually producing it should be straightforward.
Instead, I'd advocate creating modules out of sets of stable, related packages.
Perhaps, but it's not clear that we can do that with the existing package layout.
For example, cmd/godoc
and godoc
logically go together, but we can't put them in the same module without putting all of cmd/
in that module. So we'd end up with about twice as many modules as we actually intend, and without stable-version tagging it's not clear what benefit we'd get from that.
We could postpone a decision here until such a time as any given command needs to make a breaking change, but at that point I think the creation of a submodule represents a breaking change in the root module does it not (it's equivalent to the removal of a package)?
Creating a submodule is not a breaking change as long as the module and submodule have a cyclic requirement. (This is illustrated in the
mod_get_moved
test case.)The procedure is:
In a single commit:
- Move
cmd/godoc
to modulegolang.org/x/tools/cmd/godoc
.In
./cmd/godoc/go.mod
, addrequire golang.org/x/tools v0.N.0
.
- That ensures that any module with
require golang.org/x/tools/cmd/godoc
won't have a duplicate definition for package 'golang.org/x/tools/cmd/godoc`.In
./go.mod
, addrequire golang.org/x/tools/cmd/godoc v1.0.0
- That ensures that for any module with
require golang.org/x/tools v0.M.0
(forM
≤N
), runninggo get -u
on that module doesn't drop any packages.
In this specific example, the top module is still v0 - without any backwards-compatibility guarantee. Is that the reason that this is not considered as a breaking change? Or is it because v0 and v1 are indistinguishable w.r.t. import paths? Once the module reaches v1+, removing a package can be like removing exported names
- Tag that commit as
v0.N.0
andcmd/godoc/v1.0.0
.- Push the commit. Testing that commit is a very hard problem, but actually producing it should be straightforward.
I couldn't manage to get the package splitting working yet after several attempts. Either it's not as straightforward as it sounds, or it's very error-prone. Unless there is a tool and well-defined documentation I am reluctant about the top-level go.mod. Some of the go commands including go tidy
or go test
don't seem to work until the tags are pushed to the upstream repo. That doesn't help.
In this specific example, the top module is still v0 - without any backwards-compatibility guarantee. Is that the reason that this is not considered as a breaking change? Or is it because v0 and v1 are indistinguishable w.r.t. import paths?
That's the one, yeah: as long as the v0 and v1 modules have a dependency cycle, we can ensure that the user still gets exactly one copy of any given package.
Once the module reaches v1+, removing a package can be like removing exported names
Right: you can't remove a package that way, you can only move it back and forth across the submodule boundary.
The cyclic dependency decouples the versioning a bit. The submodule only needs to require
the first version of the parent module that doesn't contain the submodule contents, and the parent module only needs to require
the first version of the submodule that does, so after that the versions can be developed more-or-less independently.
Some of the go commands including
go tidy
orgo test
don't seem to work until the tags are pushed to the upstream repo.
Note that that problem is independent of whether the submodules are defined from the start or factored out later. It's hard to test multi-module changes period. (That's part of why I'm leaning toward fewer modules: if we fix the multi-module testing problem in general, we'll presumably also fix it for the factoring-out-submodules use-case.)
Change https://golang.org/cl/139319 mentions this issue: cmd/gorename: set GO111MODULE=off in gorename_test
Change https://golang.org/cl/139320 mentions this issue: go/analysis/analysistest: set GO111MODULE=off in TestTheTest
Given https://github.com/golang/go/issues/27858#issuecomment-424464970, I now believe that we should define x/tools
as a single module for now. That is already the status quo, so there is essentially no harm in codifying it in a go.mod
.
We can reconsider the module boundaries if and when we want to tag something in x/tools
at a stable major version (v1
or higher).
Given #27858 (comment), I now believe that we should define x/tools as a single module for now. That is already the status quo, so there is essentially no harm in codifying it in a go.mod.
@bcmills is this something you/we can do today? 😄
In
./go.mod
, addrequire golang.org/x/tools/cmd/godoc v1.0.0
- That ensures that for any module with
require golang.org/x/tools v0.M.0
(forM
≤N
), runninggo get -u
on that module doesn't drop any packages.
This seems like a compensation for a flaw in the tool. After a go get -u
, the tool should see if any packages required by the main module are no longer in a module, and fetch those packages' modules in the usual way.
Some of the go commands including
go tidy
orgo test
don't seem to work until the tags are pushed to the upstream repo.
I've filed proposal #28835 for one possible solution.
Hi @bcmills, in https://github.com/golang/go/issues/27858#issuecomment-425089484 you wrote:
Creating a submodule is not a breaking change as long as the module and submodule have a cyclic requirement.
The procedure is: ... Testing that commit is a very hard problem, but actually producing it should be straightforward.
A few comments/questions.
Could you expand slightly on that statement regarding the difficulties in testing? Is it partly due to the tags needing to exist so that they can be referenced in the go.mod
files in what you are describing as single commit? If so, I was wondering if pre-release semver tags might help. That might mean you could break it apart into a commit that uses pre-release tags in the go.mod
files where that commit is itself tagged with a pre-release tag. If that would work, it could mean you have an opportunity to test it while a consumer would not pick your tentative changes via go get -u
or go get foo@latest
or similar. Once validated, a normal semver release tag could be applied (so that someone would pick the changes up via go get -u
or similar). If desired, you might be able to do a later step to update the go.mod
files to also use normal semver release tags, but at that point you will have validated that basics worked with the pre-release tags. Or maybe that falls apart somewhere? I'll confess I have not tried that.
I think for modules in general, there are a few "first principles" that frequently mean someone can predict the behavior of the system in more complex scenarios. When it comes to moving/adding/removing one or more go.mod
files in a repo, though, it seems harder to predict from the outside what works vs. what breaks. There are multiple nuances, but one thing in particular that causes me to stumble here is it seems go get
and friends are walking at least two different dimensions when looking for modules that satisfy an import path: (a) the components of the paths (import paths or module paths) such as you/foo/bar
, you/foo
, you
, but also (b) VCS history. Setting aside for a moment what order it gathers information, when it is time to make a decision and there are changes across history, it is not obvious what order it walks those dimensions, and from the outside it seems go get
and friends could make different implementation choices about the order of walking those dimensions.
Setting aside any particular solution for a moment, is there a thought to try to favor the "break apart a single module" use case over what might be more sophisticated use cases like "move multiple modules around within a single repo" or "merge modules within a single repo"? It might be good if most projects adopting modules are able to follow something like "the simplest way to start is a single module per repo, and it is not too hard to later break apart a single repo with a single module into multiple modules in the future if needed".
When it comes to ambiguous imports due to changes across history, I do not know if this is a viable solution, but would it be possible to break the ambiguity in some deterministic way? If needed, perhaps there could be a conscious choice to do so in a way that favors one use case over another use case?
It seems there might be some tooling changes in the future in this area, but given it might take some time for Go 1.11 to sufficiently age out of the ecosystem, it is probably still interesting for there to be workarounds/solutions that work with Go 1.11 as it stands today ( (e.g., maybe something like https://github.com/golang/go/issues/27899#issuecomment-432663636 , or some better idea).
Apologies if some (or all) of this is too vague, or simply off base...
@thepudds
- Could you expand slightly on that statement regarding the difficulties in testing? Is it partly due to the tags needing to exist so that they can be referenced in the
go.mod
files in what you are describing as single commit?
Yes, exactly. (That's what the go mod pack
subcommand proposed in #28835 would address.)
If so, I was wondering if pre-release semver tags might help.
I don't think they do: you could certainly try out the structure with pre-release tags, but at some point you still have to replace them with release tags in the go.mod
files, and you're back to having no way to test the “replace them with release tags” commit.
- […] Setting aside for a moment what order it gathers information, when it is time to make a decision and there are changes across history, it is not obvious what order it walks those dimensions,
Indeed. That's what leads to the practice of adding cyclic requirements when refactoring module boundaries: if the requirements form a cycle, then the path scan produces the same set of results no matter where you start, so the decision collapses to just one dimension (versions).
- […] is there a thought to try to favor the "break apart a single module" use case over what might be more sophisticated use cases like "move multiple modules around within a single repo" or "merge modules within a single repo"?
I don't think we need to favor one use-case over the others: for example, I would expect merges of v0
modules with v1
modules to occur about as often as splits between v0
and v1
.
- When it comes to ambiguous imports due to changes across history, I do not know if this is a viable solution, but would it be possible to break the ambiguity in some deterministic way? […]
That seems like the complement of @jba's suggestion above (https://github.com/golang/go/issues/27858#issuecomment-434713751). It's certainly worth considering.
In light of #29935, I think we should go with the original two-module proposal. cmd/tip
doesn't really belong in x/tools
in the first place, and since nobody should be importing from cmd
anyway #27899 shouldn't matter.
(Perhaps naive question) Can we move cmd/tip out of tools to avoid needing to create a submodule in tools? I'm thinking we should be moving a couple of other things too.
Maybe it should've lived elsewhere to begin with but I'd prefer not to mess with its git history at this point by moving it. Is there something gross about making cmd/tip its own module within x/tools?
We've found (on the tools team) that multi-module repositories lead to a lot of user confusion and we'd prefer to completely avoid them.
Can't we relatively easily make a new repo and copy the version history of the cmd/tip directory to that new repo?
We've found (on the tools team) that multi-module repositories lead to a lot of user confusion and we'd prefer to completely avoid them.
That seems like a pretty severe reaction to confusion. No learning opportunity there instead?
Can't we relatively easily make a new repo and copy the version history of the cmd/tip directory to that new repo?
Now we're making a whole new repo too? Not just moving to x/build?
Relative to what? If we wanted to keep git revisions hashes unchanged we'd need to include a full copy of x/tool's git history (+ size) in the new repo. It's possible, but relative to adding a go.mod file to x/tools, I don't think it's easier.
I agree that we should avoid splitting user-facing packages across modules when possible, and should avoid splitting out modules unless there is a clear need.
However, in this case, cmd/tip
is not user-facing, and splitting it out provides a pretty clear benefit of cutting out a large tree of go.mod
files for transitive dependencies.
(And bear in mind that many the complexities of multi-module repos, such as difficulty editing multiple modules in concert, also apply to modules stored in separate repos.)
That seems like a pretty severe reaction to confusion. No learning opportunity there instead?
Our repos provide an example to users: From everything I've seen, multi-module repositories should be avoided whenever possible, and I don't think it's something we would want our users learning from.
Now we're making a whole new repo too? Not just moving to x/build?
I don't know whether we should put tip in build or its own repo, but I'm fairly convinced it should be moved out. I'd like to better understand the problems with doing so: what's the benefit of keeping git revision hashes unchanged?
(And bear in mind that many the complexities of multi-module repos, such as difficulty editing multiple modules in concert, also apply to modules stored in separate repos.)
I agree, but I think it's a much easier to understand the implications of editing multiple modules in concert compared to modules in submodules.
My understanding is that it was in scope of #29206 to move cmd/tip
out of x/tools
, is it not? I said that in https://github.com/golang/go/issues/29935#issuecomment-457605255.
Making its history more difficult (depending on what tools one uses) to follow is unfortunate, but I don't see it as a blocker. The website code is already being moved, and cmd/tip
is a small part of the overall website-related code.
Edit: I am in agreement with striving to not introduce multi-modules in x/tools
just because of cmd/tip
. I think "user confusion" is not a complete description of the costs involved in making a repository multi-module. The costs are hard to describe, and I'm happy to elaborate if asked, but I think it's a comparatively smaller cost to move cmd/tip
as needed in order to aid x/tools
to be a single tools-related module.
We've moved cmd/tip to x/build, and plan to move cmd/godoc to x/website. Once cmd/godoc is moved, there won't be any dependencies outside of x/net and x/crypto.
Now that the question of cmd/tip is gone, and after the above changes are mode, we should be able to turn x/tools into a single-module repository. We can hold off on tagging versions for now.
I'm looking at the package structure of the
x/tools
repo in response to some comments from @ianthehat and an in-depth experience report from @hyangah.The most extensive dependencies in
x/tools
come fromx/tools/cmd/tip
(via a dependency ongolang.org/x/build/autocertcache
), and are only needed when theautocert
build tag is set.Therefore, I propose that we split 'x/tools' into two modules: one at the repo root, and one at
cmd/tip
, with areplace
directive from the latter to the former.That gives the following structure.
**Edit**: I think `x/tools` should be a single module for now. See https://github.com/golang/go/issues/27858#issuecomment-428973365.**Edit 2**: In light of #29935, I'm back to wanting to keep `cmd/tip` as a separate module. **Edit 3 by @dmitshur:** `cmd/tip` has moved out from x/tools as of [CL 160817](https://golang.org/cl/160817). This is compatible with it being a separate module from x/tools, as @bcmills suggested in edit 2.