golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.46k stars 17.59k forks source link

cmd/go: do not allow the main module to replace (to or from) itself #34417

Open rselph-tibco opened 5 years ago

rselph-tibco commented 5 years ago

What version of Go are you using (go version)?

$ go version
go version go1.13 darwin/amd64

This problem is also seen on Linux using go1.12.3

$ go version
go version go1.12.3 linux/amd64

And on Mac with the tip branch

$ gotip version
go version devel +fe2ed50 Thu Sep 19 16:26:58 2019 +0000 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/rselph/Library/Caches/go-build"
GOENV="/Users/rselph/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/rselph/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/local/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/local/lib/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="/usr/bin/clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/4w/6dmz_gmd6fj_qy_nwdwclg8c0000gp/T/go-build675848548=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.13 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.13
uname -v: Darwin Kernel Version 18.7.0: Tue Aug 20 16:57:14 PDT 2019; root:xnu-4903.271.2~2/RELEASE_X86_64
ProductName:    Mac OS X
ProductVersion: 10.14.6
BuildVersion:   18G95
lldb --version: lldb-1001.0.13.3
  Swift-5.0

What did you do?

Option A:

$ git clone https://github.com/rselph-tibco/go-unstable-mods.git
$ cd go-unstable-mods
$ git checkout start_here
$ git switch -c new_branch
$ ./run.sh

The git repository will be updated with the results of each step. Optionally, comment out the git commands to simply produce the error without recording results along the way.

This is equivalent to:

  1. Start with the contents of the attached file go-unstable-mods-start_here.tar.gz
  2. Set GOPATH and GOCACHE to point at empty directories
  3. From the sample1 directory run go mod tidy
  4. From the sample2 directory run go mod tidy
  5. From the sample2 directory run go install ./...
  6. From the sample1 directory run go install ./...
  7. Repeat the last step indefinitely

At this point, sample1/go.mod will never stabilize.

What did you expect to see?

go.mod should stabilize when the build is given the same inputs over and over.

What did you see instead?

go.mod eventually oscillates between two states, preventing -mod readonly from ever working, and wreaking havoc with source control.

jayconrod commented 5 years ago

I took a quick look at this, need to investigate further though.

The first go install ./... command in sample1 removes a bunch of requirements from go.mod. The only reason that would happen is if the requirements are implied or redundant with some transitive requirement. Since each module is replacing itself and requiring the other module at an invalid version, I think sample1 is seeing its own go.mod file as a go.mod in a different version of sample1, so it sees its own requirements as redundant.

As I said, need to investigate further, but this seems like a strange edge case of replace. We should forbid modules from replacing themselves without specifying a version.

rselph-tibco commented 5 years ago

Thanks for taking a look. If I understand you correctly, a workaround might be to specify the versions of sample1 and sample2 in the replace statements. That's certainly something I can try. I'm worried about the implications, though, since what I'm modeling is two modules under active development locally. Would I need to increment the versions each time the source changes to deal with cache issues?

thepudds commented 5 years ago

I’m not quite following this example, but why are the modules replacing themselves?

Regarding:

what I'm modeling is two modules under active development locally. Would I need to increment the versions each time the source changes to deal with cache issues?

Again, I am not quite following the example, but I think you shouldn’t need to increment versions if you are doing local filesystem-based replaces of modules that are being locally developed.

This FAQ has an example:

https://github.com/golang/go/wiki/Modules#can-i-work-entirely-outside-of-vcs-on-my-local-filesystem

Sorry for the quick comment, and sorry if it is not helpful.

bcmills commented 5 years ago

@rselph-tibco, it should be very rare for a module to need to replace itself, since the main module itself is always considered higher-precedence than any explicit version of its own path.

The only reason I can think of for a module to replace itself would be if some other interdependent module depends on an invalid version of it, in which case it should suffice to replace the invalid version with a valid one.

In the rare case in which you are developing two entirely new interdependent modules from scratch, #33370 may be relevant.

rselph-tibco commented 5 years ago

Hi, thanks for all the comments. I arrived at the modules replacing themselves only because without it, I get this when building one module or the other (in this case, sample2):

go install ./...
++++++++++++++++++++++++++
go: foo.com/me/sample1@v0.0.0-00010101000000-000000000000 requires
    foo.com/me/sample2@v0.0.0-00010101000000-000000000000: unrecognized import path "foo.com/me/sample2" (https fetch: Get https://foo.com/me/sample2?go-get=1: dial tcp 23.23.86.44:443: connect: connection refused)
rselph-tibco commented 5 years ago

BTW, I just verified that adding version v0.0.0-00010101000000-000000000000 explicitly to the replace directives didn't change the behavior with 1.13 or tip. But I agree @bcmills, that #33370 sounds related.

bcmills commented 5 years ago

BTW, I just verified that adding version v0.0.0-00010101000000-000000000000 explicitly to the replace directives didn't change the behavior with 1.13 or tip.

Yeah, that's not surprising: it makes go mod tidy think that the dependency on the zero-version module (via sample2) provides all of the same requirements as the main module, and thus the main module's requirements are irrelevant.

So it removes those requirements from the main module, but because of the replace directive that also removes the transitive requirements upon which the removal relied.

The core of the problem is that the module system assumes that versions are immutable, but replace directives make them mutable — and, in this case, go mod tidy actually performs the mutation itself.

Probably that implies that we should also not allow the target of the replace directive to be the directory containing the main module.

bcmills commented 5 years ago

The corrected version of this example, I think, is to replace v0.0.0-[…] with a directory containing only a go.mod file that specifies no dependencies:

replace foo.com/me/sample1 v0.0.0-00010101000000-000000000000 => ./empty-sample1
bcmills commented 5 years ago

The two diagnostic fixes we can make for this are:

rselph-tibco commented 5 years ago

It looks like the empty directory trick works. I'll give this a try in our actual build environment. Thanks.

GrigoriyMikhalkin commented 4 years ago

@bcmills Hi, i want to take this task, if that's ok. Am i correct, that we want to make it impossible to replace main module with it current version? Or we just want to notify user about that and allow him to force that behavior?

gopherbot commented 4 years ago

Change https://golang.org/cl/213118 mentions this issue: cmd/go: do not allow main module to replace itself

segevfiner commented 4 years ago

If we no longer allow the main module to replace itself, how would the following scenario work: gomodtest.tar.gz

Here you have modules having circular dependency on one another, but not on the constituent packages so they build successfully. Without the replace directive, Go is trying to download a second copy of the main module to satisfy the dependency of the dependent module which seems like it will cause having multiple copies of the main module as part of the build which would lead to all sorts of unexpected consequences.

jayconrod commented 4 years ago

@segevfiner In your example, you have two go.mod files:

module github.com/segevfiner/a

go 1.12

require github.com/segevfiner/b v0.0.0-00010101000000-000000000000

//a => ./
replace github.com/segevfiner/b => ../b
module github.com/segevfiner/b

go 1.12

require github.com/segevfiner/a v0.0.0-20190919150336-4ef3d582f001

replace github.com/segevfiner/b => ./

replace github.com/segevfiner/a => ../a

It's fine for these modules to depend on each other. But it's not okay for the main module to depend on itself through replace directives. You would see the problem described in this issue.

Instead, you can replace specific versions of the main module like this:

module github.com/segevfiner/a

go 1.12

require github.com/segevfiner/b v0.0.0-00010101000000-000000000000

replace (
  github.com/segevfiner/b => ../b
  github.com/segevfiner/a v0.0.0-20190919150336-4ef3d582f001 => /tmp/null-a
)
module github.com/segevfiner/b

go 1.12

require github.com/segevfiner/a v0.0.0-20190919150336-4ef3d582f001

replace (
  github.com/segevfiner/a => ../a
  github.com/segevfiner/b v0.0.0-00010101000000-000000000000 => /tmp/null-b
)

I used null-a and null-b as the paths. These are probably directories that only contain go.mod files with no requirements. Probably not checked into source control.

I think you might want to do this if you wanted to tag a pair of modules that depend on each other at the newly tagged versions. This would be one way to test them. It's not an easy thing to do, but it can be done without replacing the main module and without the main module's directory serving as a replacement.

segevfiner commented 4 years ago

That doesn't look like something that can serve as a valid work flow... 😕 Why would I need such a strange replace directive pointing to an empty module? And having to specify the version probably breaking this every single time you will update the dependency... No?

It feels like in addition to not allowing the main module to replace itself, perhaps it should also be made to compile correctly without it needing to replace itself in the case of circular dependencies? Not sure, I don't yet fully understand why this doesn't work without it...

jayconrod commented 4 years ago

That doesn't look like something that can serve as a valid work flow... 😕 Why would I need such a strange replace directive pointing to an empty module? And having to specify the version probably breaking this every single time you will update the dependency... No?

This should not be a common workflow. It would only be needed when you need to simultaneously release multiple versions of modules that depend on each other. This workflow lets you transitively require a module version that doesn't exist yet. The replace directive pointing at an empty module effectively lets you ignore a requirement in the module graph.

In most cases, modules versions should be released independently. If a group of modules are routinely released together, it may be better to combine them into a single module.

It feels like in addition to not allowing the main module to replace itself, perhaps it should also be made to compile correctly without it needing to replace itself in the case of circular dependencies? Not sure, I don't yet fully understand why this doesn't work without it...

This doesn't work because when the main module serves as a replacement for any module, it appears that its requirements are redundant with another module (actually itself), so they are removed. You end up with an empty requirement list after any command that can modify go.mod.

segevfiner commented 4 years ago

This should not be a common workflow. It would only be needed when you need to simultaneously release multiple versions of modules that depend on each other. This workflow lets you transitively require a module version that doesn't exist yet. The replace directive pointing at an empty module effectively lets you ignore a requirement in the module graph.

Sadly it doesn't even work cleanly during development of such interdependent modules without a workaround like this. Of course, having such modules in the first place is not a good idea, sadly you are sometimes stuck with lazily designed code that has such dependencies, especially coming from the vanilla go get era.

In most cases, modules versions should be released independently. If a group of modules are routinely released together, it may be better to combine them into a single module.

This doesn't work because when the main module serves as a replacement for any module, it appears that its requirements are redundant with another module (actually itself), so they are removed. You end up with an empty requirement list after any command that can modify go.mod.

Sounds fun...

mholt commented 4 years ago

Was pointed to this issue and advised to add my data point here, in case it's related.

I'm concerned that it is practically impossible to develop Go modules in some cases without Internet access; or, in the best case with Internet access, without kind of a hacky commit to a remote server.

In developing Caddy 2 extensions, there arose a situation where two repos created a Go module cycle: https://gist.github.com/mholt/e0a14049026840e480f57c1f601f9d86 -- but not a package import cycle, so the program still compiled. Note that both modules use replace to pin local versions of each other.

There are two modules in play here: caddy/v2 and tls.dns. Both modules depended on each other, but tls.dns was still being developed only locally -- was not published yet. (No GOPATH; using only module mode.) The caddy/v2 module contained the main package which imported a library package in tls.dns.

However, all the go tool commands failed in the tls.dns workspace, including every time I tried to save in my editor (VS Code) because the save actions would never finish. Since I could not go build in tls.dns, I went to caddy/v2 and ran go build there, sure enough I was able to see the functionality from tls.dns that I had written (but which was not properly formatted or tested, because no go commands worked there).

Basically, it became impossible to be productive on my new Go module without another, already-published module to import and build it. There was no way for me to continue working on tls.dns in isolation.

@heschik advised me to push a commit to the tls.dns remote -- any commit, even one that just added a README. Which I did. That was enough to give me a commit hash which I could then do go get github.com/caddyserver/tls.dns@20832cb8a8269804599ed2ebb5918c04afb9c9c4 in the caddy/v2 repo. And magically, I was able to continue development in the tls.dns repo! The go command started working again, to my relief... but...

A few things strike me as odd/concerning/buggy/unintuitive in this situation:

IMO, no third-party go.mod should ever have the ability to DoS the go command in your own module. It should also not be required to go online to be able to run basic go commands like go build and go test successfully.

It would be nice if these problems could be addressed soon. This is a huge barrier to developing new Go modules.

bcmills commented 4 years ago
  • Pushing a dummy commit to a remote was a necessary prerequisite

To be clear, a dummy commit is not strictly necessary.

The root of the complexity here is that the main module must always be the selected version of itself. It cannot be replaced with anything else, and since it has no explicit version, its effective version must be interpreted to be higher than any other version of the same module path.

Since the main module cannot be replaced with anything else, the replace directive for an older version of the main module must name some explicit version: it cannot be versionless.

One solution, as you found, is to push an empty commit and name that version.

However, another is to decide on the version that names the initial tls.dns commit ahead of time (for example, v0.1.0) and name that explicitly in the caddy/v2 go.mod file, and have tls.dns replace that specific version of its own path instead of using a wildcard version: https://play.golang.org/p/TM8Q-jI2QXX

33370 is very relevant to this use-case, but as noted in https://github.com/golang/go/issues/33370#issuecomment-542377792, we probably shouldn't be writing out an explicit require at all if the latest version of an unresolved dependency is the subject of a replace directive.

ceejatec commented 4 years ago

I think there is still a need for this ability. It may be that #33370 also would meet the need, I'm not sure. But here's the situation I'm facing, which was the basis for the reproducer I created for #39570 :

Module "foo" exists, and has a dependency on module "bar". Those modules are being simultaneously developed, and in fact the only real reason they're not the same module is that "bar" is closed-source code, requiring it to be in a separate repository.

Because they're effectively the "same module", of course they have circular dependencies. And because they're both in development, the only correct way to build them is to have them both checked out into separate directories and use "replace" directive to point at each other.

Now, when building an executable in "foo", it finds "bar" correctly. However, "bar" imports other packages in "foo". Since only "foo"'s go.mod's "replace" directives are used in the build, this means foo/go.mod must have a "replace" directive for itself so that bar's code can compile.

This actually works today, except for the issue mentioned in #39570 that "go mod tidy" gets confused by the situation and erases some necessary entries in go.mod.

As a side note: "foo" is not a module in the sense that it is ever expected any other code in the world will import foo. It's a module because it wants to use go.mod to define its own dependencies. So there will likely never be any "releases" of foo, nor should Go require us to figure out how to "release" foo such that "bar" can depend on an explicit version.

As I said, I can't tell whether #33370 would fix this or not. If it allows us to build "foo" in such a way that "bar" resolves "foo" to the same directory containing the local "foo" code (and does so in such a way that "go mod tidy" doesn't break "go build -mod=readonly"), then that would be just fine.

jayconrod commented 4 years ago

I want to clarify a couple points of discussion.

The problem

This issue happens because it appears to the go command that the content main module's directory (in the directory where build commands are run) is also the content of a released version of the same module in the dependency graph.

This breaks some invariants. It appears to the go command that requirements in the main go.mod file are redundant since they're implied by another dependency (the module that the main module directory is replacing), so they get removed.

The workflow

This tends to come up when people are developing two or more modules that depend on each other, and it's not practical to release versions of those modules. There are several use cases for this: releasing new versions concurrently; developing offline; projects with separate open / closed parts.

The problem is triggered with a line like this (where example.com/a is the main module):

replace example.com/a => .

Instead, use this (where ../empty is a path to a directory containing a go.mod file with no requirements):

replace example.com/a => ../empty

To elaborate:

A full example (you can extract with txtar).

-- example.com/a/go.mod --
module example.com/a

go 1.15

// require example.com/b at a fake version. Any version will do.
// We need the requirement so 'go mod tidy' doesn't try to add it.
require example.com/b v0.0.0-fake

// replace all versions of b with the directory where b is developed.
replace example.com/b => ../b

// replace all versions of this module with an empty directory.
// example.com/b requires example.com/a at a fake version, so we need
// to replace that with *something* to prevent it from being looked up.
//
// This replacement is only used active when example.com/a is the main module,
// and the main module will be chosen over any other version of the same
// same module, so packages will be loaded from here, not ../../empty.
replace example.com/a => ../../empty

-- example.com/a/a1/a1.go --
package a1

import _ "example.com/b/b1"

-- example.com/a/a2/a2.go --
package a2

-- example.com/b/go.mod --
module example.com/b

go 1.15

require example.com/a v0.0.0-fake

replace example.com/a => ../a

replace example.com/b => ../../empty

-- example.com/b/b1/b1.go --
package b1

-- example.com/b/b2/b2.go --
package b2

import _ "example.com/a/a1"

-- empty/go.mod --
module empty

go 1.15

// This directory serves as a replacement for non-existent versions of the
// "main" module. It contains no packages, and only the go.mod file is loaded.
// It has no dependencies.

You can run go build ./... or go mod tidy in either module without changing its go.mod.

Changes needed

When the main module's directory serves as a replacement, the go command can't safely remove seemingly redundant requirements from go.mod. It should report an error with a clear message when this happens. It should probably hint that a directory with an empty go.mod file should be used instead.

This still doesn't give us a very intuitive workflow. Some improvements are planned already:

Some other ideas (brainstorming):

In replace, we could interpret the version none as an empty module with no requirements.

replace example.com/a => none

Or perhaps, we could just do that when all versions of the main module are replaced with the main module's directory.

replace example.com/a => .
bcmills commented 4 years ago

We could perhaps also just ignore the selected version of the main module when matching replace directives. (We never did precisely nail down the semantics of those — see #26344.)

bcmills commented 4 years ago

But that wouldn't solve the problem of replacing to the main module's directory, which we still can't allow (and which I can't justify silently ignoring).

ceejatec commented 4 years ago

I will try out the "empty" workaround, thank you for the suggestion.

I still don't understand why the (annoying but seemingly intuitive) solution of "replace example.com/a => ." must be disallowed. It means "use the content of . for all imports of all versions of example.com/a", which is the same semantics as any other replace statement. The "empty" workaround is much less obvious, since it means "use the content of ../empty for all imports of all versions example.com/a (except don't really do that because you already know example.com/a exists in the current directory)".

I'm sure the difficulty in implementing this comes down to needing to special-case that situation, since go's logic for resolving modules and transitive dependencies and saving them in go.mod is a little hairy. But from a usability perspective, from the outside looking in, it feels much more direct.