cmd/go: add a workspace mode

golang / go

The Go programming language

https://go.dev

BSD 3-Clause "New" or "Revised" License

122.85k stars 17.52k forks source link

cmd/go: add a workspace mode #45713

Closed matloob closed 2 years ago

matloob commented 3 years ago

Detailed Design Doc: Proposal: Multi-Module Workspaces in cmd/go

Demo Video: Go Workspaces Proposal Demo

High level overview:

I propose adding a new workspace mode in the go command for editing multiple modules. The presence of a go.work file in the working directory or a containing directory will put the go command into workspace mode. The go.work file specifies a set of local modules that comprise a workspace. When invoked in workspace mode, the go command will always select these modules and a consistent set of dependencies.

This is intended to help with workflows making changes across multiple modules and with editor support for those workflows.

This is what an example go.work file would look like:

go 1.17

directory (
    ./baz // foo.org/bar/baz
    ./tools // golang.org/x/tools
)

replace golang.org/x/net => example.com/fork/net v1.4.5

This adds two modules to the workspace. If the user's current working directory is is under the directory containing this go.work, the go command will be in workspace mode and use both the modules defined by ./baz/go.mod and ./tools/go.mod as main modules, regardless of which module the user is currently in (unless workspace mode is turned off or a different workspace is chosen with the proposed new -workfile flag). The replace would override any replaces in the main modules' go.mod files.

Related issues

32394 x/tools/gopls: support multi-module workspaces

Issue #32394 is about gopls' support for multi-module workspaces. gopls currently allows users to provide a "workspace root" which is a directory it searches for go.mod files to build a supermodule from. Alternatively, users can create a gopls.mod file in their workspace root that gopls will use as its supermodule. This proposal creates a concept of a workspace that is similar to that gopls that is understood by the go command so that users can have a consistent configuration across their editor and direct invocations of the go command.

44347 proposal: cmd/go: support local experiments with interdependent modules; then retire GOPATH

Issue #44347 proposes adding a GOTINKER mode to the go command. Under the proposal, if GOTINKER is set to a directory, the go command will resolve import paths and dependencies in modules by looking first in a GOPATH-structured tree under the GOTINKER directory before looking at the module cache. This would allow users who want to have a GOPATH like workflow to build a GOPATH at GOTINKER, but still resolve most of their dependencies (those not in the GOTINKER tree) using the standard module resolution system. It also provides for a multi-module workflow for users who put their modules under GOTINKER and work in those modules.

This proposal also tries to provide some aspects of the GOPATH workflow and to help with multi-module workflows. A user could put the modules that they would put under GOTINKER in that proposal into their go.work files to get a similar experience to the one they'd get under the GOTINKER proposal. A major difference between the proposals is that in GOTINKER modules would be found by their paths under the GOTINKER tree instead of being explicitly listed in the go.work file. But both proposals provide for a set of replaced module directories that take precedence over the module versions that would normally be resolved by MVS, when working in any of those modules.

26640 cmd/go: allow go.mod.local to contain replace/exclude lines

The issue of maintaining user-specific replaces in go.mod files was brought up in #26640. It proposes an alternative go.mod.local file so that local changes to the go.mod file could be made adding replaces without needing to risk local changes being committed in go.mod itself. The go.work file provides users a place to put many of the local changes that would be put in the proposed go.mod.local file.

mvdan commented 3 years ago

Thanks for working on this, Michael! It's clear that many people want a solution to multi-module development, given all the activity in various issues over the last couple of years.

For those who just arrived here, note that this was discussed over the previous week on the golang-tools list, too. Some points raised there resulted in edits to the proposal draft.

Overall, I really like the idea. Below are some thoughts. As a summary, I think the proposal is trying to fix a few too many issues at once.

When doing a build operation under workspace mode the go command will try to find a go.mod file. If a go.mod file is found, its containing directory must be declared with a directory directive in the go.work file.

I still wonder why we need directory (...) to be an explicit list. Using a module in workspace mode that isn't listed in go.work is an error, so why not just make directory implicitly mean all modules under the directory where the go.work file is located? Akin to how go.mod implicitly includes all packages under its directory (with some basic exception rules).

If the answer is "it's too expensive to find all modules every time", I'd love to see some proof of that. I'd also like to see if caching could be a solution, similar to go list ./.... I know you mentioned alternatives like directory ./... or go work init filling in the directory list, but either of those seems unnecessary if we can implicitly do the right thing.

go.work files allow users to operate in directories outside of any modules but still use the workspace build list. This makes it easy for users to have a GOPATH-like user experience by placing a go.work file in their home directory linking their modules together.

We touched on this in the email list, and you added this section, so as a follow-up - I'm still not convinced that the first version of go.work needs to support doing builds outside of any of the modules. Do we have any particular use cases in mind other than "be more like GOPATH"? For example, before modules I'd never run go build all or go test all from the root of my GOPATH, and I similarly do not expect to do it in workspace mode.

And if I want to build or test one of the modules, I'd want to cd into that module anyway, to make sure I'm using its go.mod file - replace directives, etc.

39005 proposal: cmd/go: introduce a build configurations file

This issue proposes to add a mechanism to specify configurations for builds, such as build tags. This might be something that can be added in a future extension of go.work files.

I don't think that would be a good idea, to be honest. I'm starting to believe that build configurations (e.g. what build tags to use in go build) aren't only useful at the module level, and by extension not only at the workspace level either. Maybe a Go module is entirely portable by default, but one of its packages explicitly targets js/wasm only. The build config should separate the package. Similarly, when working across many modules, it's reasonable for them to fall under different build configurations.

A "build config" also has a purpose, at least when proposed in #39005: for the original author to announce to the users what are the supported ways to build their packages. Without that information, one can only assume that all build tag and GOOS/GOARCH combinations are possible, which quickly explodes exponentially. Since most users would not publish go.work in a VCS repository, the usefulness of a "build config" to give information to users is lost.

All in all, I don't think this proposal should ever aim to solve the "build config" problem.

The go.work file provides users a place to put many of the local changes that would be put in the proposed go.mod.local file.

This one is an interesting thought. This only works if we declare that go.work files should not be committed into VCS, though, and the proposal currently does not say that.

For example, if we're OK with monorepos publishing go.work files, how can that work as a place to put local changes?

ohir commented 3 years ago

Note: comments are related to the proposal document. Please take all "must", "should", "*-not" placed below as prepended with a humble IMO.)

If -workfile is set to off, workspace mode will be disabled. If it is auto (the default), workspace mode will be enabled if a file named go.work is found in the current directory (or any of its parent directories), and disabled otherwise. If -workfile names a path to an existing file that ends in .work, workspace mode will be enabled.

Workspace mode should NOT be turned on without a human demanding so. Proposed defaults with "auto" and go.work presence make for traps if there is go.work present in vcs but we want to build/test in module mode — we may forgot to add a flag "workspace=off" using tools by hand. Or there was an uncommited go.work we forgot to copy, or we checked out older revision, then we think we continue within workspace while we now do not. Former trap might bite the most when we'd set to work deep into the tree but go.work still hangs two floors above us: we might think we're testing this module in module mode (oh, no go.work here) while we are testing this module with dependencies replaced by an active go.work directive.

Make intent explicit either by mandating option to always be present. or by environment. Allow user to decide on "by presence" default eg. with GOWORK="."

The directory directive takes an absolute or relative path to a directory containing a go.mod file as an argument.

The go.work file should not take absolute path in the directory directive — only paths relative to the place of the go.work file. Otherwise having go.work commited to the repository freezes filesystem layout at a go.work mandated state, as "no restriction on where the directory is located" is proposed too. ("No restriction, anywhere" in practice says that after a longer while of hiatus user might be missing pieces of her workspace; with no friendly go get to rescue.)

The replace directives in the go.work are applied in addition to and with higher precedence than replaces in the workspace modules. A replace directive in the go.work file overrides replace directives in workspace modules applying to the same module or module version. If two or more workspace modules replace the same module or module version with different module versions or directories, and there is not an overriding replace in the go.work file, the go command will report an error.

To my understanding, the ultimate goal of "workspace" is to produce versioned code for reproductible builds, hence at least the unversioned replacements should be disallowed in go.mods of modules we currently work on in the "workspace". Allowing for two or more sets of replacements overlayed would be a costly mistake.

IOW: either we commit to progress in work with set of go.mod replacements (current state), or we commit to work using workspaces and put all our replacements in go.work. Tools should guard us from the mix.

In my imagination, the "workspace" workflow (ie. workspace=on) should be a simple loop:

make code work;
test locally; if test != ok {; goto 1 };
tag changes and fix go.mods accordingly
push tagged changes upstream for CI to consume from modules proxy/product repo;

After 4 the CI pipeline should use the same code we tagged at 3!

If code to be built with workspace "=on" and "=off" does not converge at 4, we've lost the track of what we're shipping. Tools should somehow prevent that. A bare minimum is to disallow unversioned replacements in go.mods of modules pulled into workspace. The real shield would be to move all replacements from go.mod to go.work — hardly possible now.

matloob commented 3 years ago

@mvdan, Thanks for the feedback! A few replies:

I still wonder why we need directory (...) to be an explicit list. Using a module in workspace mode that isn't listed in go.work is an error, so why not just make directory implicitly mean all modules under the directory where the go.work file is located? Akin to how go.mod implicitly includes all packages under its directory (with some basic exception rules).

If the answer is "it's too expensive to find all modules every time", I'd love to see some proof of that. I'd also like to see if caching could be a solution, similar to go list ./.... I know you mentioned alternatives like directory ./... or go work init filling in the directory list, but either of those seems unnecessary if we can implicitly do the right thing.

I'm not convinced that the right behavior is to add all the modules under the directory, even if it were free: if we added all the modules under the directory, then it would be much harder (I think impossible?) to build a workspace that includes a module, but not any of the modules contained in subdirectories. So if we added all the modules in a directory we'd have to add another mechanism to remove modules from the workspace. And if we're going to do that it seems to me that the default should be to add a module at a time, and maybe consider if we want to do a directory tree in the future, if it turns out to be necessary.

I think the name of the directory directive is unfortunately misleading, because it does seem to imply that we're adding all the modules under a directory. But it's the best name I could come up with: unfortunately module is already taken by the go.mod file to have a different meaning.

We touched on this in the email list, and you added this section, so as a follow-up - I'm still not convinced that the first version of go.work needs to support doing builds outside of any of the modules. Do we have any particular use cases in mind other than "be more like GOPATH"? For example, before modules I'd never run go build all or go test all from the root of my GOPATH, and I similarly do not expect to do it in workspace mode.

And if I want to build or test one of the modules, I'd want to cd into that module anyway, to make sure I'm using its go.mod file - replace directives, etc.

I'm wondering what we'd get by disabling builds outside any of the modules? It's something that essentially comes for 'free' from the rest of the design because the go.work file already has to explicitly specify its set of directories, and the build list does not depend on the 'current' module. And keeping this in reinforces the point that if you are in the same workspace cding into the module doesn't change go.mod file or the set of replace directives that are used

I don't think that would be a good idea, to be honest. I'm starting to believe that build configurations (e.g. what build tags to use in go build) aren't only useful at the module level, and by extension not only at the workspace level either. Maybe a Go module is entirely portable by default, but one of its packages explicitly targets js/wasm only. The build config should separate the package. Similarly, when working across many modules, it's reasonable for them to fall under different build configurations.

A "build config" also has a purpose, at least when proposed in #39005: for the original author to announce to the users what are the supported ways to build their packages. Without that information, one can only assume that all build tag and GOOS/GOARCH combinations are possible, which quickly explodes exponentially. Since most users would not publish go.work in a VCS repository, the usefulness of a "build config" to give information to users is lost.

All in all, I don't think this proposal should ever aim to solve the "build config" problem.

I think that the workspace file could be useful to allow the developer to specify configuration to tooling, because it is developer specific configuration, but this proposal does not intend to solve the build config problem. It's definitely out of scope for the design. If it's too distracting for the conversation I can remove the reference from the related issues section in the doc.

This one is an interesting thought. This only works if we declare that go.work files should not be committed into VCS, though, and the proposal currently does not say that.

For example, if we're OK with monorepos publishing go.work files, how can that work as a place to put local changes?

The doc does recommend that "most go.work files should exist outside of any repository" and I do think that go.work files in should rarely be checked into VCS. (Though it's buried in one paragraph in the doc so maybe I could do a better job of specifying that.)

If a monorepo publishes a go.work file, a user can still put a go.work file in a parent directory of the repository and run the go command from there. This is another one of the benefits of allowing workspace builds outside of any particular module. It allows there to be multiple workspace 'directories' that a particular module can participate in without needing to pass -workfile. Of course a user can also pass in -workfile, either directly or through GOFLAGS.

matloob commented 3 years ago

@ohir, thanks for the comments!

Workspace mode should NOT be turned on without a human demanding so.

I definitely understand the issue about users not realizing which workspace they're in and accidentally ending up with the wrong build. I think this is most likely to happen if the workspace file is checked into a repository. (Otherwise, the user created the workspace file themselves, and opted-in to workspace mode). Would you agree with that?

I think in general checking in a workspace file to a repository should be rare: because it is really only useful for multi-module repositories which themselves are not the common case, and because it will affect the build configuration for every developer of a module, it's something that should only be done if it fits into the workflow of most developers.

The go.work file should not take absolute path in the directory directive — only paths relative to the place of the go.work file.

Do you think this would be a problem for go.work files that are not checked into a repository? Of course, I agree that absolute paths should not show up in a go.work file that's checked into a repository (for that matter, there shouldn't be any relative paths that point outside the repository). If such a change was committed, I expect that it would be quickly reverted because it likely wouldn't work for any other developers than the one who originally committed it!

Here is a case for absolute paths: some editors might want to make temporary workspace files because a user just opened two files in different modules. In that case the workspace file would be put in a temporary directory, and it would make more sense to add absolute paths rather than trying to construct a relative path from the temporary directory to the location of the module. And on Windows, I believe it's not possible to construct relative paths across volumes, so the editor would have to attempt to make a temporary directory on the same volume, something that might not be easy to do.

I wonder if this is something an outside tool can catch, like a sort of workspace vet command?

To my understanding, the ultimate goal of "workspace" is to produce versioned code for reproductible builds, hence at least the unversioned replacements should be disallowed in go.mods of modules we currently work on in the "workspace". Allowing for two or more sets of replacements overlayed would be a costly mistake.

Oh! I think the doc must have not been as clear: the ultimate goal is not to produced versioned code for reproducible builds, but to assist in workflows on multiple modules. To get the reproducible build you still need a single go.mod file that a user will build from. I wonder if you have ideas for how to make the intention more clear in the document.

Because replacements are made for developing modules, I think users would be surprised to not have them in workspaces. I think it's a bit unfortunate that they need to be supported in workspaces, but it's hard for me to see a better solution.

If code to be built with workspace "=on" and "=off" does not converge at 4, we've lost the track of what we're shipping. Tools should somehow prevent that. A bare minimum is to disallow unversioned replacements in go.mods of modules pulled into workspace. The real shield would be to move all replacements from go.mod to go.work — hardly possible now.

I think having replaces in multiple places is a good idea, but (unless I'm misunderstanding) this seems to encourage users to check in go.work files in single module workspaces, which could impede users setting up their own workspaces (see my reply to @mvdan )

mvdan commented 3 years ago

if we added all the modules under the directory, then it would be much harder (I think impossible?) to build a workspace that includes a module, but not any of the modules contained in subdirectories

I see. I can't say I can think of any scenario where I'd need this, but I've also avoided nested Go modules at any cost.

it's the best name I could come up with

How about include, import, or just modules?

I'm wondering what we'd get by disabling builds outside any of the modules?

I'm looking at this from the other side; why should we support this edge case right away if we don't have a strong use case for it? Doing builds inside the workspace but outside any module is something we could add at a later time if users really ask for it, but we wouldn't be able to take it back if we later find out it's confusing or unnecessary.

I think that the workspace file could be useful to allow the developer to specify configuration to tooling, because it is developer specific configuration, but this proposal does not intend to solve the build config problem. It's definitely out of scope for the design. If it's too distracting for the conversation I can remove the reference from the related issues section in the doc.

I'd remove that section, personally. Nowadays I tend to think that build config shouldn't need to be attached at the module level, and the same applies to the workspace level.

The doc does recommend that "most go.work files should exist outside of any repository" and I do think that go.work files in should rarely be checked into VCS. (Though it's buried in one paragraph in the doc so maybe I could do a better job of specifying that.)

I think making that recommendation clearer is a good idea, as it's probably one of the first questions users will have.

If a monorepo publishes a go.work file, a user can still put a go.work file in a parent directory of the repository and run the go command from there. This is another one of the benefits of allowing workspace builds outside of any particular module. It allows there to be multiple workspace 'directories' that a particular module can participate in without needing to pass -workfile. Of course a user can also pass in -workfile, either directly or through GOFLAGS.

Hmm, I find that recommendation for users to be a tad weird. We're introducing nested workspaces and builds outside any module, all for the sake of being able to satisfy the use case of local-only replace directives. It feels like a bit of a stretch :)

neild commented 3 years ago

The presence of a go.work file in the working directory or a containing directory will put the go command into workspace mode.

Does this imply that every go command invocation which reads a go.mod needs to check for go.work in every directory up to the fs root?

I’ve worked on systems with slow network filesystems where that would be expensive.

matloob commented 3 years ago

it's the best name I could come up with

How about include, import, or just modules?

include could work. Added it as an option in the "Open issues" section

I'm wondering what we'd get by disabling builds outside any of the modules?

I'm looking at this from the other side; why should we support this edge case right away if we don't have a strong use case for it? Doing builds inside the workspace but outside any module is something we could add at a later time if users really ask for it, but we wouldn't be able to take it back if we later find out it's confusing or unnecessary.

I really don't think of this as an edge case, it's just a simple way to determine how to set the build list: "Prefer workspace mode (ie search for a go.work), if found use its build list. If not found, use module mode (ie search for a go.mod), and use its build list". I think we'd need to have a stronger reason to add the extra case of making an extra rule to disable it, for example cases that it could become confusing for.

I think that the workspace file could be useful to allow the developer to specify configuration to tooling, because it is developer specific configuration, but this proposal does not intend to solve the build config problem. It's definitely out of scope for the design. If it's too distracting for the conversation I can remove the reference from the related issues section in the doc.

I'd remove that section, personally. Nowadays I tend to think that build config shouldn't need to be attached at the module level, and the same applies to the workspace level.

Got it. I'll start by make it it much more clear that we have no intention of addressing that use case in this proposal or extending go.work in the future to address it.

The doc does recommend that "most go.work files should exist outside of any repository" and I do think that go.work files in should rarely be checked into VCS. (Though it's buried in one paragraph in the doc so maybe I could do a better job of specifying that.)

I think making that recommendation clearer is a good idea, as it's probably one of the first questions users will have.

Done, in the latest revision, and strengthened it to advocate against checking in go.work in vcs period.

If a monorepo publishes a go.work file, a user can still put a go.work file in a parent directory of the repository and run the go command from there. This is another one of the benefits of allowing workspace builds outside of any particular module. It allows there to be multiple workspace 'directories' that a particular module can participate in without needing to pass -workfile. Of course a user can also pass in -workfile, either directly or through GOFLAGS.

Hmm, I find that recommendation for users to be a tad weird. We're introducing nested workspaces and builds outside any module, all for the sake of being able to satisfy the use case of local-only replace directives. It feels like a bit of a stretch :)

Thinking about this much more in light of your comments and @ohir's I think recommending go.work files to be checked in to repos was a mistake so I'm going back and uniformly recommending against it.

meling commented 3 years ago

Caveat: I've skimmed the proposal but haven't really tried to understand all the nuances here, so this may well be too simplistic.

My suggestion would be to try really, really hard to do this without introducing the go.work file. Here is an idea that maybe could work, using only go.mod with some minor additions.

Here workspace is a magic module name that the go command can interpret in a similar way to the go.work file.

module workspace

go 1.18

require (
    ./baz   foo.org/bar/baz
    ./tools golang.org/x/tools
)

If the go.mod file for some module has require entries with a local path, these would be used; I guess this would be similar to a replace entry.

module github.com/relab/gorums

go 1.18

require (
    ./baz   foo.org/bar/baz
    ./tools golang.org/x/tools
    google.golang.org/grpc v1.36.1
    google.golang.org/protobuf v1.26.0
)

As I mentioned at the top, I don't know if this would work for all scenarios, but I would strongly encourage finding a simple design.

Obviously, I understand adding another field to the require entries makes parsing a bit more challenging. Still, I think there are reasonable solutions to this, e.g., paths must start with ./ or ../ or /. Alternatively, one could add a => between the path and the module name, if necessary.

matloob commented 3 years ago

The presence of a go.work file in the working directory or a containing directory will put the go command into workspace mode.

Does this imply that every go command invocation which reads a go.mod needs to check for go.work in every directory up to the fs root?

I’ve worked on systems with slow network filesystems where that would be expensive.

I want to understand this a bit better: my assumption for these cases are that most builds (or lists other go command operations that need the info in a go.mod will need to open many files and directories, and the lookups done for go.work wouldn't dominate. Is that not the case? Are there operations or filesystem layouts that would be particularly bad here?

neild commented 3 years ago

I want to understand this a bit better: my assumption for these cases are that most builds (or lists other go command operations that need the info in a go.mod will need to open many files and directories, and the lookups done for go.work wouldn't dominate. Is that not the case? Are there operations or filesystem layouts that would be particularly bad here?

I've been on systems where stats on files under $HOME were reasonably fast, but $HOME/.. was slow due to being a network automount. I recall AFS as being one offender here, although it's been quite a while and perhaps I'm unfairly maligning it.

Perhaps this isn't a case that matters any more these days; multiuser systems are certainly out of vogue.

matloob commented 3 years ago

@meling I'd like to understand better why you'd like to avoid introducing the go.work file? I think making a kind of magic go.mod file instead could be more confusing and make things more complicated.

I don't totally understand the second example: what does it mean when you have both types of requires in a 'regular' module?

mvdan commented 3 years ago

I really don't think of this as an edge case, it's just a simple way to determine how to set the build list: "Prefer workspace mode (ie search for a go.work), if found use its build list. If not found, use module mode (ie search for a go.mod), and use its build list".

Intuitively, that's now how I thought about it. Should workspace mode really kick in if there's a go.work file somewhere in the parent directory chain, but there are absolutely no go.mod files anywhere on the filesystem?

If you think about go.work as a replacement to GOPATH, then it seems reasonable to want to use workspace mode outside any module. But if you think of it as a way to develop multiple modules at once, then it still doesn't make sense to me. From my point of view, the special case is to allow this extra mode of operation, not to disallow it :)

strengthened it to advocate against checking in go.work in vcs period.

Just thinking outloud: this does make the story for monorepos worse, because in those cases it can be easier for developers to share a go.work file by committing it at the root of the VCS repository. It seems like there might be intended use cases here which are at odds with each other.

jimmyfrasche commented 3 years ago

Maybe there could be an env var that let's you specify a list of directories and the go command treats those directories as the root when performing searches up the directory tree for go.work or go.mod or anything else that comes up in the future. It seems like a general setting and might deserve its own thread, even if this proposal is what would cause it to be needed more than it is now.

matloob commented 3 years ago

I really don't think of this as an edge case, it's just a simple way to determine how to set the build list: "Prefer workspace mode (ie search for a go.work), if found use its build list. If not found, use module mode (ie search for a go.mod), and use its build list".

Intuitively, that's now how I thought about it. Should workspace mode really kick in if there's a go.work file somewhere in the parent directory chain, but there are absolutely no go.mod files anywhere on the filesystem?

If there are no go.mod files on the filesystem, then the workspace must be empty or invalid (because every directory directive must point to a directory with a go.mod file). If that's the case, the only operations that make sense from the go command are those that apply "globally", such as go install module@version.

More generally: the rules are similar to what go.mod does. There could be alternative ways to decide how to enter a "single build list for multiple local modules" mode, but it doesn't seem necessary for such a mechanism to only work when the current directory is contained in one of the component modules.

If you think about go.work as a replacement to GOPATH, then it seems reasonable to want to use workspace mode outside any module. But if you think of it as a way to develop multiple modules at once, then it still doesn't make sense to me. From my point of view, the special case is to allow this extra mode of operation, not to disallow it :)

It's true that multiple modules can still be developed at once even if we add the restriction, so that it doesn't solely address the problem of developing multiple modules at once. But it's a smaller set of rules for a user to remember (when I'm under this go.work file, rather than, when I'm under this go.work file, and contained in one if its modules). I think a lot of users would think of it as an arbitrary restriction if they got the error message saying that it's not supported outside of a module's directory.

I'm interested in what cases this could lead to user confusion or other errors that we should be worried about.

strengthened it to advocate against checking in go.work in vcs period.

Just thinking outloud: this does make the story for monorepos worse, because in those cases it can be easier for developers to share a go.work file by committing it at the root of the VCS repository. It seems like there might be intended use cases here which are at odds with each other.

One future option could be to make an external tool that produces a go.work file listing all the modules recursively in a directory to easily link together a repo.

matloob commented 3 years ago

Maybe there could be an env var that let's you specify a list of directories and the go command treats those directories as the root when performing searches up the directory tree for go.work or go.mod or anything else that comes up in the future. It seems like a general setting and might deserve its own thread, even if this proposal is what would cause it to be needed more than it is now.

Yeah I think that would be a reasonable option if that ends up being a problem, though I hope that this proposal isn't the reason why we would need it.

myitcv commented 3 years ago

so I'm going back and uniformly recommending against it.

I'm somewhat confused 😄 . Because it was my understanding that the work flow of being able to check in a go.work file that would then be shared amongst developers of a multi-module repo, e.g. x/tools was a fundamental use case we were looking to cover here. Are we now saying that's not a valid use case? And that instead, each developer needs to create such a file themselves?

I might well be alone, but I've somewhat lost track of the original use cases driving this proposal, and therefore the extent to which this proposal satisfies those problems/concerns. In particular, the UX before and after, who is expected to create go.work files in a given use case and when, is the go.work file added as the workspace to the editor, what further in-editor configuration is necessary, when/where -workfile needs to be specified on the command line etc.

matloob commented 3 years ago

Because it was my understanding that the work flow of being able to check in a go.work file that would then be shared amongst developers of a multi-module repo, e.g. x/tools was a fundamental use case we were looking to cover here.

To confirm, this is is the workflow of someone wanting to work across modules in a multi module workspace, right? If so, the proposal definitely aims to address this proposal. We could create a tool to create go.work from the modules recursively contained in a directory. Like a more robust version of this:

#!/bin/sh
find $1 | grep "\.mod$" | awk '
        BEGIN { print "go 1.17\n\ndirectory (" }
  !/testdata/ { printf "\t./%s\n",substr($1,0,length($1)-length("/go.mod")) }
          END { print ")" }
'

Then someone working in a module like tools would run multimodulework tools when they clone the tools repo and they'd have a go.work file, outside the repo, that links together the tools and gopls modules. Would that address the use case?

myitcv commented 3 years ago

We could create a tool to create go.work from the modules recursively contained in a directory.

Indeed, but that impacts the overall UX of the solution. I'm not arguing for/against such a tool, but to my mind the ultimate workflow needs to be clear to anyone reading this proposal. Not least because this is (I think?) the first mention of such a tool, and I assume it's beyond the scope of this proposal?

Would that address the use case?

I think this is a question for those folks who have to deal with this workflow. Is the fact that every developer now has to create a go.work file themselves using a tool they need to install, updating that file periodically, a less painful approach than the current workflow?

meling commented 3 years ago

@meling I'd like to understand better why you'd like to avoid introducing the go.work file? I think making a kind of magic go.mod file instead could be more confusing and make things more complicated.

I prefer fewer file types in my folders, and having to examine two different file types for dependencies and replace directives, etc seems unnecessary, when there is already a file type for the Go module system.

As for the "magic" workspace go.mod file, maybe my suggestion was too magic since it would change the interpretation of the require entries. An alternative format could instead be:

module workspace

go 1.18

workspace (
    ./baz   foo.org/bar/baz
    ./tools golang.org/x/tools
)

Obviously, the workspace keyword isn't important; it can be something else, e.g. modules as proposed above.

In fact, an alternative format could reuse the replace directive:

module workspace

go 1.18

replace (
    foo.org/bar/baz => ./baz
    golang.org/x/tools => ./tools
)

PS: I'm assuming that a workspace module can be specified without any require entries in "workspace" mode.

I don't totally understand the second example: what does it mean when you have both types of requires in a 'regular' module?

My idea was that entries with path names could serve the same role as replace directives; you wouldn't need to specify replace directives if you specified a path name in the require directive. It isn't important since there is already a way to do it with the replace directive, but as you point out in the proposal, working with replace directives can often be awkward.

matloob commented 3 years ago

@myitcv Got it- I've updated the doc to mention that an external tool can potentially help with the creation of go.work files for multi module repos.

@meling A go.work file has fundamentally different semantics than a go.mod file. A go.mod provides the definition of a single module, while go.work links together several modules. Putting these specification in different files helps better delineate the differences.

rsc commented 3 years ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rhcarvalho commented 3 years ago

@matloob thanks for fleshing out the proposal.

https://github.com/golang/go/issues/45713#issue-865531565

The replace would override any replaces in the main modules' go.work files.

Wondering if this might be a typo, I didn't find clarification in the other comments -- did you mean "override any replaces in the main modules' go.mod files"?

matloob commented 3 years ago

@rhcarvalho Ooh yeah it's absolutely a typo! Edited the comment to fix! Thanks!

meling commented 3 years ago

@meling A go.work file has fundamentally different semantics than a go.mod file. A go.mod provides the definition of a single module, while go.work links together several modules. Putting these specification in different files helps better delineate the differences.

Well, I have to disagree that the semantics between go.work and go.mod are fundamentally different. They are quite similar in their objectives. Further, it is quite common that a single file type can encode different semantics. For instance, the go.mod file already supports different "semantics," such as replace and exclude.

zevdg commented 3 years ago

I still wonder why we need directory (...) to be an explicit list. Using a module in workspace mode that isn't listed in go.work is an error, so why not just make directory implicitly mean all modules under the directory where the go.work file is located? Akin to how go.mod implicitly includes all packages under its directory (with some basic exception rules).

If the answer is "it's too expensive to find all modules every time", I'd love to see some proof of that. I'd also like to see if caching could be a solution, similar to go list ./.... I know you mentioned alternatives like directory ./... or go work init filling in the directory list, but either of those seems unnecessary if we can implicitly do the right thing.

I'm not convinced that the right behavior is to add all the modules under the directory, even if it were free: if we added all the modules under the directory, then it would be much harder (I think impossible?) to build a workspace that includes a module, but not any of the modules contained in subdirectories. So if we added all the modules in a directory we'd have to add another mechanism to remove modules from the workspace. And if we're going to do that it seems to me that the default should be to add a module at a time, and maybe consider if we want to do a directory tree in the future, if it turns out to be necessary.

In discouraging multi-module projects from checking in go.work, I think you inadvertently made the case for a directory tree feature much stronger. Assuming there is a tool that walks a directory tree and generates a go.work file that includes all its submodules, in order to stay up to date, it would need to be re-run after any sub-module was added or removed. If the go.work file is checked in, then only the person who actually added or removed the module needs to make the change. If it's not checked in, then every contributor would have to re-run the command to update their workspace. To be fair, modules probably aren't added or removed from multi-module repos too often, but if I want a reliable workflow for a multi-module repo, I'd need to defensively re-run the go.work generator command after every pull from upstream and every branch switch just in case someone else added or removed a module in that update or branch.

However, if something like

go 1.17

directory ./...

were valid and added all submodules under go.work to the workspace, then contributors workspaces would include newly added modules without needing to either constantly regenerate OR check in go.work.

To state it more generally, if we do commit to discouraging multi-module projects from checking in go.work, then we'd want to ensure both

A) that it's easy for new contributors to generate initial go.work files and
B) that those go.work files only need to be updated in exceedingly rare cases since project maintainers don't have a good way to signal to all their contributors that their workspaces need to be updated.

If we ignore these 2 problems, then multi-module repo owners will simply ignore our advice to not check in go.work.

Problem A) is easy. As you've said this could be done by an external tool, but it's something that go work init should handle, at least in the common cases IMO. Problem B) is the harder part and fundamentally impossible without a crystal ball. As you noted, some projects may want to specifically exclude certain modules, so although allowing directory ./... insulates users from needing to update their go.work file in the case of the a newly added module that should be included in their workspace, it doesn't insulate them from the case of a newly added module in the repo that shouldn't be included in their workspace. That said, accidentally including a new unnecessary module module in the workspace seems both rarer and less likely to cause problems than not including a new module that contributors would expect to be included. Given that, maybe it's ok to not handle this case cleanly.

Long story short, there's enough incidental complexity arising from the decision to discourage multi-module repos from checking in their go.work that it may be worth revisiting that recommendation.

Disclamer: This opinion is not backed by any hard data. I have a gut feeling that the minimal go.work file above is a special case in that it's exactly what several users seem to want, and it's also the behavior that many multi-module repo owners would encourage their contributors to use. If all the modules in a repo are tightly coupled and released together (which I suspect is one of the main reasons people group multiple modules in the same repository), then it is IMO obviously the most intuitive configuration for working on the project. If my suspicions here are correct, then it would make sense for the generation of that exact go.work (or one that behaves like it) to be a Critical User Journey for this feature, and not something delegated to an external tool.

rsc commented 3 years ago

The ./... is too expensive. It incurs an unbounded large amount of file walking just to get the go command started. I don't need to prove that with measurements. It won't scale, and we want Go tools to scale.

matloob commented 3 years ago

@zevdg I agree with Russ that ./... is too expensive. But I don't think it's necessary either. Of the two issues you listed

A) that it's easy for new contributors to generate initial go.work files and

B) that those go.work files only need to be updated in exceedingly rare cases since project maintainers don't have a good way to signal to all their contributors that their workspaces need to be updated.

We agree that A is relatively easy, so that leaves B.

Splitting out a "submodule" from another module should be a rare operation. It often causes confusion and other workflow issues (for instance, updating the version dependencies between the submodules to correctly pick up changes in the modules). And in the cases in which a module does need to be split out, there are safeguards to prevent confusion. As specified in the design document, the go command will error out when it is run from a module directory if the module is not in the go.work file. That requires the user to explicitly turn off workspace mode or add the module to their workspace. And it would be easy for tooling like gopls to show users a warning if the've opened a file that doesn't belong to the current workspace.

So I think the cases that all module developers will want to update their go.work should be very rare, and when it does happen, we can have safeguards to catch and help users.

hherman1 commented 3 years ago

I've only skimmed the discussion, so my apologies. I just wanted to say that my first impression is that I'm sad I will have to learn about a new go toolchain file that will be often seen in my directories, because it makes it harder for me to understand the whole world. It may still be worth adding, but I think its important to consider the cost. I'm very fond of how simple the go build tools/API is, especially compared to gradle which I spend much more time with, and I hope it stays that way!

matloob commented 3 years ago

@hherman1 Our goal with this proposal is that the build tools continue to stay simple. In general, a go.work file should exist because you as the user created it to define your own workspace. So if you don't want to learn or use workspaces you don't need to!

The proposal recommends against checking in go.work files into repositories. And for the vast majority of Go repositories that only have a single module, there would be no reason for a go.work file to exist at all. So you as a user should not run into any go.work files unless you created them, and your workflows should continue to work as they are.

hherman1 commented 3 years ago

Very glad to hear it! Thanks for acknowledging my concerns

earthboundkid commented 3 years ago

A go.mod provides the definition of a single module, while go.work links together several modules. Putting these specification in different files helps better delineate the differences.

Yes, but a go.mod is already a definition for multiple packages, both internal and external. I'm worried that as soon you introduce go.work, someone will say, "Oh, I need a go.world because I'm juggling multiple go.work files" and so on ad infinitum. What's the limiting principle? Why isn't someone going to need a meta-meta-meta-package manager now that we already have a meta-package manager and once you add a meta-meta-package manager?

spekary commented 3 years ago

Whatever is the result, I vote for making its use be specified on a command line, including its location. These kinds of "automatically find the config file" solutions create havoc with build systems. You eventually have to create, edit or remove files before building which is a big pain, and problematic during development. Its much better to say "use this go.mod file" or even "use this go.work.stage file as the current go.work file" when building so that you can build multiple build flavors with just command line changes, and check in the configuration files that make these builds special.

matloob commented 3 years ago

@carlmjohnson I appreciate the concern, and I certainly don't want a proliferation of many different types of config files either. The limiting principle is that the go.work file is controlled by the individual developer while the go.mod file is distributed with the module. The behavior introduced by go.work seems at once to be both something that doesn't belong in go.mod because of the different semantics and also something that would be very helpful in the day to day of module developers. If we're concerned that additional important use cases will show up that won't be covered by go.work, I think we should think of how go.work could be made to be extensible or to support those use cases now. I still don't understand what those use cases are (for a potential go.world or go.universe etc.) but I'd like to hear them.

@spekary I agree that the "automatically find the config file" can sometimes create havoc with build systems (at least I can see that happening with CI systems). That's part of why we recommend that go.work files are not checked in. That way, users who'd have trouble with automatically found go.work files can choose to not have any go.work files in their directory tree that the go command searches and instead always use the -workfile flag when they want the workspace mode behavior. On the other hand, I think using a flag is not as ergonomic as finding a file for most users, so looking for a file seems to be the right default.

spekary commented 3 years ago

You can "recommend" something, but in a group development setting, mistakes happen. The potential to accidentally check in a file that gets automatically picked up by a build system and then changes the build makes larger organizations understandably nervous. However, also including an overriding -workfile flag that can point to a new work file, or point to no workfile, would solve this problem. That way, the build system could be set up so that it is not effected by accidental changes to the work space configuration file.

matloob commented 3 years ago

@spekary -workfile as specified in the proposal already overrides the go.work file that would be automatically found. If set to off it disables workspace mode outright, and if set to a .work file's path then that file is used to define the workspace. Does that work?

spekary commented 3 years ago

Didn't notice that. That is perfect.

aarzilli commented 3 years ago

What concerns me with this is that it could lead to users developing and publishing code that only works on their machine because of an unrelated go.work file that happens to be in one of the parent directories, a circumstance they would receive no indication of.

myitcv commented 3 years ago

Just following up on a point that I raised on the tools call earlier today. To my mind the way to objectively evaluate this proposal is to:

list the use cases/workflows that will be impacted by this change
- describe the current workflow, in terms of editor and non-editor UX
- explain how that workflow will change under this proposal, in terms of editor and non-editor UX
list any significant use cases/workflows that will not be impacted by this change

That way, everyone can be clear about the basis on any decision is taken, especially with regards to editor and non-editor UX interactions.

As @matloob suggested, a version of cmd/go that people can use, through their editor and on the command line, will help to flush out potential use cases that need to be considered, but also provide concrete feedback about whether a proposed workflow "works" or not.

eliasnaur commented 3 years ago

Like @meling I'm sad to see yet another file type with so much semantic overlap with go.mod, in particular a filetype not meant to be checked in. Like @mvdan I find the manual updating of directory directives to match modules added/removed/changed from the workspace unfortunate.

Like @meling, I tried and failed to squeeze workspace mode into go.mod in the GOPATH mode proposal. Below is another attempt.

It seems to me go.work files provide one main feature: a way to group a set of modules into a workspace to be worked on simultaneously. A counter-proposal for avoiding go.work files altogether is to change the go command to accept multiple go.mod paths for the -modfile flag, where the files specified comprise the workspace as defined in this issue. You won't get workspace scoped replace directives, but it's not clear to me they're important enough to have a separate filetype for.

hherman1 commented 3 years ago

I think the work file would be less offensive to me if it did less. I.e if it was literally just a list of local filepaths, with no special syntax to learn, and no potential to bloat in the future.

This matters for users b.c every new thing they have to learn contributes to the feeling of “go is so complex” bit by bit. That feeling typically comes from a million small complexities rather than a single overblown feature.

complyue commented 3 years ago

A go.mod provides the definition of a single module, while go.work links together several modules. Putting these specification in different files helps better delineate the differences.

Yes, but a go.mod is already a definition for multiple packages, both internal and external. I'm worried that as soon you introduce go.work, someone will say, "Oh, I need a go.world because I'm juggling multiple go.work files" and so on ad infinitum. What's the limiting principle? Why isn't someone going to need a meta-meta-meta-package manager now that we already have a meta-package manager and once you add a meta-meta-package manager?

I'd suggest don't worry about it.

go.mod is the working unit of "package" (as in "package manager") versions wrt release engineering, while Go "package" (as in "Go package" and "Java package") is the modular piece of code shared to all dependents of the particular version of its owning "package" (as in "package manager").

I'd say go.work justifies pretty well for software engineering, for it being:

A set of WIP versions of go.mod's, i.e. "packages" (as in "package manager"), to be worked on locally, including the validation of their inter-compatibility across each others, to detect regressions in other go.mods w.r.t. edits in one go.mod, and to tinker with possible fixes on those issues, before such versions made publicly available.

And I'm not afraid of the emergence of go.world if it justifies as well.

complyue commented 3 years ago

@eliasnaur

A counter-proposal for avoiding go.work files altogether is to change the go command to accept multiple go.mod paths for the -modfile flag, where the files specified comprise the workspace as defined in this issue.

I'm against this (counter) proposal:

if the multi-mod author is going to type out all those -modfile cmdl args every time manually, it goes out of track for principled software engineering workflows, and would increase incidental mistakes, if he/she can actually tolerate the unwieldy types all the time
otherwise, you track those cmd lines in some non-standard build configuration file, likely define some env var for this part of the cmdl options, to be joined together for a final full cmd line. IMHO this is even worse nightmare than go.work

You won't get workspace scoped replace directives, but it's not clear to me they're important enough to have a separate filetype for.

IMHO, you still get "workspace scoped replace" semantics, but in a bad way.

eliasnaur commented 3 years ago

@eliasnaur

A counter-proposal for avoiding go.work files altogether is to change the go command to accept multiple go.mod paths for the -modfile flag, where the files specified comprise the workspace as defined in this issue.

I'm against this (counter) proposal:
* if the multi-mod author is going to type out all those `-modfile` cmdl args every time manually, it goes out of track for principled software engineering workflows, and would increase incidental mistakes, if he/she can actually tolerate the unwieldy types all the time

* otherwise, you track those cmd lines in some non-standard build configuration file, likely define some env var for this part of the cmdl options, to be joined together for a final full cmd line. IMHO this is even worse nightmare than `go.work`

I'm not sure -modfile workspaces would be so bad in the typical case. For me, I'd just have

GOFLAGS=-modfile="go.mod ../gioui.org/go.mod" go run ...

in my shell history, ready to reuse.

Note that I'm not against go.work in itself, but against adding it to the go tool. With -modfile defined workspaces, there's nothing stopping external, even standard, tools to maintain and use a go.work-like configuration file.

complyue commented 3 years ago

GOFLAGS=-modfile="go.mod ../gioui/go.mod" go run ...

in my shell history, ready to reuse.

Personally I don't like this style of hacking, it would add complexity to coordination of team working or open source collaboration. I would do it temporarily for necessary hacks, but will fight it if it finds its way into a principled dev&ops workflow.

Note that I'm not against go.work in itself, but against adding it to the go tool. With -modfile defined workspaces, there's nothing stopping external, even standard, tools to maintain and use a go.work-like configuration file.

I'm afraid -modfile won't work well with gopls. I suggest it's best for all go tooling, including gopls (as #32394 is addressing), to support the workspace semantics in an integrated, coherent way, go.work hopefully will do.

DmitriyMV commented 3 years ago

Personally I don't like this style of hacking, it would add complexity to coordination of team working or open source collaboration. I would do it temporarily for necessary hacks, but will fight it if it finds its way into a principled dev&ops workflow.

We were making\using Makefile's for more than a half of century, so this is pretty much established practice by now. I don't see anything particularly new with such usage.

It seems to me go.work files provide one main feature: a way to group a set of modules into a workspace to be worked on simultaneously. A counter-proposal for avoiding go.work files altogether is to change the go command to accept multiple go.mod paths for the -modfile flag, where the files specified comprise the workspace as defined in this issue. You won't get workspace scoped replace directives, but it's not clear to me they're important enough to have a separate filetype for.

I agree. I think that, even tho gopls is important, it needs doesn't justify existence of additional separate semantic unit. If it can be achieved existing knobs I think it would be good to do so. I answered in more detail on reddit.

complyue commented 3 years ago

We were making\using Makefile's for more than a half of century, so this is pretty much established practice by now. I don't see anything particularly new with such usage.

I'm okay with Makefile, Makefile.in, configure etc. as they are usually properly tracked by SCM, what I'm against is ad-hoc settings of env vars, cmd lines etc. out of tracking by well instructed workflows.

And for my preference of go.work over Makefile or the likes, I'd defend it with how ridiculous it would need to go, in making gopls (with VSCode as the IDE e.g.) support the setup well enough.

DmitriyMV commented 3 years ago

I'm okay with Makefile, Makefile.in, configure etc. as they are usually properly tracked by SCM, what I'm against is ad-hoc settings of env vars, cmd lines etc. out of tracking by well instructed workflows.

We can set GOFLAGS globally using go env -w so I'm unsure about env vars being ad-hoc.

complyue commented 3 years ago

We can set GOFLAGS globally using go env -w

I regard this a perfect reason in many cases "It works on my machine"TM, when others scratch their heads, even after followed the instructions.

I mean anti-reproducibility by ad-hoc.

nikandfor commented 3 years ago

Maybe my experience is not complete enough but the only problem I have working with multiple modules is forgotten/committed/commented replace directives in my go.mod file.

I don't like the idea of some auto detected files, recursive module search in some directory or forcing me to keep my repos in GOPATH fashion.

I believe the simplest (and so better) solution would be

move all workspace replace directives to a separate file (say go.work)
specify all the needed replaces explicit by hands
make enabling that file explicit by hands via flag or env variable
all replaces in that file overrides these in go.mod

So the default behaviour is the same as now: go.mod is used.

If you want to make changes in more than one module you put replaces to go.work and build as go run -workfile=go.work ./cmd/qwe or GOWORK=go.work go run ./cmd/qwe After you done you commit changes in dependent repos, make go get dependency@version and test in reproducible mode as go test ./....

In the end you have clean and beautiful go.mod. You never commit forgotten replaces. You are able to switch between default (reproducible) and workspace modes quickly and easy. You may have multiple workspace configurations and keep them for the next time. It's minimal changes to the go command. And the same could work in editors and tools like gopls (but you need to specify which work file to use, so env var probably fits better).

earthboundkid commented 3 years ago

I don't think "go.world" is an outlandish hypothetical at all. The scenario is clear: a large company has a monorepo. Everyone using the monorepo uses a common go.work to get all of their modules working together. The large company acquires another large company with a monorepo (or the company is large some divisions are not communicating with each other, whatever). Now a worker needs to be able to use both monorepos because they're on the team that's trying to get the self-driving car to show banner ads. Suddenly, they need a go.world to bridge between monorepos.

Now, you can say, the company "shouldn't" have checked the go.work into the monorepos, but I don't really see why not or how you could stop them from doing it.

I much prefer a solution in which the only meta-package manager is go.mod, and if necessary, you can specify multiple go.mod files that stack in some way. You will still have problems when bridging monorepos, but at least the solution is clear (slap on another go.mod file in your build script) instead of coming back to the Go team and asking for a third level of indirection to be added to the go tool.