golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.72k stars 17.63k forks source link

cmd/go: go mod -vendor prunes non-package directories #26366

Closed ainar-g closed 6 years ago

ainar-g commented 6 years ago

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

go version devel +8a33045 Fri Jul 13 03:53:00 2018 +0000 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ainar/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/ainar/go"
GOPROXY=""
GORACE=""
GOROOT="/home/ainar/go/gotip"
GOTMPDIR=""
GOTOOLDIR="/home/ainar/go/gotip/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build865981061=/tmp/go-build -gno-record-gcc-switches"

What did you do?

#!/bin/sh
set -e
export GO111MODULE=on
mkdir foo
cd foo
cat <<EOF > main.go
package foo // import "foo"

import "github.com/gen2brain/aac-go/aacenc"

func f() { _ = aacenc.VoPidAacMdoule }
EOF
$GO mod -init
$GO mod -vendor
$GO build -getmode vendor

What did you expect to see?

Successful build.

What did you see instead?

# github.com/gen2brain/aac-go/aacenc
vendor/github.com/gen2brain/aac-go/aacenc/aacenc.go:4:19: fatal error: voAAC.h: No such file or directory
 //#include "voAAC.h"
                   ^
compilation terminated.

The problem is that go mod -vendor pruned the external/aacenc folder, which contains the C code needed to build this package.

bcmills commented 6 years ago

The cause here seems to be that we prune out directories that are not Go packages.

There is no voAAC.h in github.com/gen2brain/aac-go/aacenc/, but I do see one in github.com/gen2brain/aac-go/aacenc/external/aacenc/include/.

kardianos commented 6 years ago

Side note from govendor. By default govendor only picks up pkgs referenced, but I had to add an option to grab the whole sub tree for this very reason.

Mostly cgo deps, but sometimes other required resources.

...

Right now I'm considering making a tool that can be run after go mod -vendor to trim unused packages. But I can't leave what isn't there.

myitcv commented 6 years ago

I wonder whether in the world of modules what some/most people are actually intending to do in this situation (namely where they think they want/need to go mod -vendor) is snapshot the current module dependencies as opposed to the packages. Doing so would seem to address all of these points?

So then a go mod -modvendor (command/flags to be :bike:-shed-ed) would create a directory (say ./modvendor) that could then be used by GOPROXY=file:///path/to/modvendor.

That said, I don't use go mod -vendor and likely never will. But I do maintain an append-only cache of the modules that my CI then uses via GOPROXY=file://...

ainar-g commented 6 years ago

I don't think anyone actually wants to vendor everything. There are two problems here, that I think should not be mixed:

  1. Non-Go code dependencies vendoring. This is what this issue is about. Whether we like it or not, Go has CGo. And C dependencies are hard. Some approaches to solve this problem (excluding the painful choice of teaching the go tool about C dependencies... the horror):

    • Let module authors declare their non-Go code dependencies in their go.mod. Something like

      external "github.com/gen2brain/aac-go/aacenc/external/aacenc"

    • Define and document a convention for such dependencies (e.g. external directory, similar how we have conventions for internal and testdata).

    • Explicitly give up and document that CGo is fundamentally incompatible with go mod -vendor. After all, even if you do have the C code in your vendor, there are no guarantees that that C code doesn't have some dependencies of its own, so hoping that /vendor will save you from build failures is rather naïve.

  2. Binary tool dependency vendoring. This may or may not be a problem Go modules want to solve. @rsc said it in #24051.

    It is not a goal to expand into "linux distribution" territory.

    But then this, too, should be documented explicitly. It would be a shame though, as AFAIK other systems, like Ruby's Bundler and JS's NPM support some version of it.

kardianos commented 6 years ago

One way to solve the cgo problem is to require a dummy go file in the c folder that is then imported by another package that uses the c files with a "_" import.

I think this is only a problem because the c and h files live in a separate directory from the go files.

rsc commented 6 years ago

Sorry, but the model for go builds is that what's needed to build a package is in that directory, not subdirectories. The way gen2brain/aac-go is structured, modifying the C source code in the subdirectories will not trigger a rebuild. Instead your build will use stale cache entries. The C code needs to be moved up to the package directory to make it build properly, not to mention vendor properly.

If anyone wants to build a tool that adds to the vendor directory after go mod -vendor, you could do that by reading the modules.txt file and using commands like go list -f '{{.Dir}}' to find out where each package was copied from.

ainar-g commented 6 years ago

@rsc I see two opportunities for improvement here. Firstly, should we explicitly clarify in the cgo documentation that the go tool notices only changes in Go packages? Right now the wording doesn't mention other directories and local includes at all. Something like

<...> Any .h, .hh, .hpp, or .hxx files will not be compiled separately, but, if these header files are changed, the C and C++ files will be recompiled. Changes in files included from other directories will not trigger a rebuild. <...>

(Emphasis = added by me.)

Secondly, should such local includes be a vet warning? There is already -cgocall, why not -cgoinclude?

gopherbot commented 6 years ago

Change https://golang.org/cl/125297 mentions this issue: cmd/cgo: document that #including source files in subdirectories is a bad idea

pkieltyka commented 6 years ago

this tool might be helpful to others: https://github.com/goware/modvendor — I wrote it to easily copy .c, .h, .s, .proto files from module pkgs into my local ./vendor. Feel free to improve :)

im-kulikov commented 6 years ago

go mod vendor drops github.com/ethereum/go-ethereum/crypto/secp256k1/libsecp256k1 files 😔

myitcv commented 6 years ago

I've just added a "proposal" under https://github.com/golang/go/issues/27618 (not marked as a proposal because the modules stuff is experimental, so more a discussion issue than anything). Apologies for the modvendor name clash @pkieltyka, I just spotted that.

nomad-software commented 6 years ago

I can't believe the proposed solution to this issue is to write another tool to finish what go mod vendor starts. Either write a tool to replace it or fix it so it copies everything. What is the reason for only vendoring half of the repository?

myitcv commented 6 years ago

@nomad-software Russ outlined a response to this in https://github.com/golang/go/issues/26366#issuecomment-405683150.

If instead you want to "vendor" the module that will give you everything. That's the subject of https://github.com/golang/go/issues/27618 and a proposed approach is outlined at https://github.com/myitcv/go-modules-by-example/blob/master/012_modvendor/README.md.

thepudds commented 6 years ago

@nomad-software this issue here was initially reported when the original reporter was using cgo.

Are also hitting this with cgo, vs. are you hitting this without using cgo?

If you are hitting this with cgo, and if we set aside vendoring for a moment, do you think you might be hitting problems even just during non-vendor-based building/recompiling due to the behavior described here:

https://golang.org/cmd/cgo/

When the Go tool sees that one or more Go files use the special import "C", it will look for other non-Go files in the directory and compile them as part of the Go package. ... Note that changes to files in other directories do not cause the package to be recompiled, so all non-Go source code for the package should be stored in the package directory, not in subdirectories.

(It is worth reading the full description at that link, but wanted to include at least a snippet for convenience)

nomad-software commented 6 years ago

go mod vendor drops github.com/ethereum/go-ethereum/crypto/secp256k1/libsecp256k1 files

@im-kulikov I've written a tool to fully copy the entire directory of all dependencies into the vendor folder. You can find it here:

https://github.com/nomad-software/vend

Any suggestions to improve it are welcome, just open an issue there.

shauryaramakrishnan commented 5 years ago

https://github.com/nomad-software/vend

Yo @nomad-software this should be for of Dep tool itself, awesome work! solved my issue brilliantly 👍

Cheers!

thepudds commented 5 years ago

@nomad-software I wanted to briefly double-check on the approach for https://github.com/nomad-software/vend. It seems to be intended for use with CGO (if I am not mis-understanding the README there) including when C/C++ headers and files are outside of directories with .go files.

It seems that approach of having C/C++ headers and files outside of directories with .go files might be dangerous for build issues? (Forgetting about vendoring for a second; just wanted to ask about building).

Comments above from Russ seemed to indicate:

The C code needs to be moved up to the package directory to make it build properly, not to mention vendor properly.

And (with emphasis added):

modifying the C source code in the subdirectories will not trigger a rebuild. Instead your build will use stale cache entries.

Also, the CGO documentation now includes (also with emphasis added):

Note that changes to files in other directories do not cause the package to be recompiled, so all non-Go source code for the package should be stored in the package directory, not in subdirectories.

Sorry if I am mis-understanding...

nomad-software commented 5 years ago

@thepudds It depends entirely on your build process. The goal of vend was just to copy everything to the vendor folder, just in case the remote repository goes offline. It's up to you to manage any stale CGO cache entries, etc.

vcaesar commented 5 years ago

go mod vendor command can not copy c and other files to the vendor directory, this makes a lot of packages have problems, maybe can add go mod vendor -a.

radu-munteanu commented 5 years ago

I just wonder if all these vendoring problems will disappear one day... Don't get me wrong, I like the vendoring concept way more than downloading dependencies over and over. Offline builds are great. I am just sick of changing tools with even worse tools than before. We had a bunch of tools in a pretty short period of time: Godeps, govendor, dep, go mod. When it will all end?!

adewes commented 5 years ago

We also run into problems with this as we have non-Go files in a repository that contains e.g. SQL scripts (for database migrations) or config files (for testing). Before trying to switch to go modules we used vendoring via git submodules, which worked well and allowed us to produce deterministic builds. Now we would like to switch to go modules (mainly to be able to easily pin versions of external dependencies as well). We tried to use go mod vendor to install the selected versions of our dependencies into the vendor folder and then use the test config from there, but since it only copies Go files all the config files are missing.

It would be great if there was a flag that one could set to copy the entire repository to the vendor folder. This should not be difficult to implement as go mod already stores the entire repositories in $GOPATH/pkg/mod. The alternative would be to manually check out the repository and initialize the required config files, which is cumbersome as we'd have to find a way to fetch the commit corresponding to the one referenced in the go.mod file to obtain the files belonging to the specific commit.

Software doesn't only consist of source code, config files are just as important and it should be possible to manage the two using the same tools whenever possible.

bcmills commented 5 years ago

@adewes, why is it important that the SQL scripts and config files be in the vendor directory in particular? I don't know of any reason you couldn't combine Git submodules for those non-Go dependencies and Go modules for everything else.

adewes commented 5 years ago

Well the SQL scripts belong to a specific version of the Go code (they define the database tables for the models defined in Go), so they should be versioned together. Of course we could use submodules for that but this would introduce another type of versioning (go module version and git submodule version), so I really don't like it. The files don't need to be in the vendor directory but the go mod vendor would provide a convenient and predictable way to check out the repositories belonging to the Go code that's used (and before learning that only Go files are checked out I thought this could solve our problem). Most other "modern" languages (e.g. Python, Javascript, C#) support bundling non-code files when building packages, so I really don't understand why Go shouldn't.

How are package maintainers and package providers supposed to deal with config or resource files when using the new module mechanism? I couldn't find anything in the documentation. If I want to publish my SQL scripts together with my Go code how would I do it such that people can use go get to fetch my package and my package knows where it should look for config files?

radu-munteanu commented 5 years ago

@bcmills how about this case, I have a lib with a few packages but no root go file. The mod won't bring anything from the root, that includes the README, LICENSE and so on. Maybe I want to read the README without having to open my browser to go to the lib page on git, or check the lib license.

bcmills commented 5 years ago

@adewes, the key thing to understand is that go mod vendor operates on packages, not on modules. Any files that are colocated with a package (transitively) imported by the main module will be included in go mod vendor.

If you want to unpack entire modules instead, try go mod download -json.

adewes commented 5 years ago

@bcmills thanks for that info, I understand that go mod vendor checks out packages, but (I think) that's what we need, as we want to check out the files belonging to a package with a specific version used by the module. So a -a flag that would allow us to check out all files (not only Go files) would be really great. Using go mod download -json seems quite cumbersome again, as we would need to parse out the exact file locations from the JSON, which I think is unnecessary: Go already downloads a specific version of a package, so why not provide the files of that package in an easily predictable path (so we could e.g. fetch config files from vendor/<package name>/.../config)? That would make working with non-code files so much easier.

Right now we solved the problem by merging our two dependent packages together (so that we don't need to go get the second one), which works but is not a viable strategy for third-party dependencies.

coolaj86 commented 5 years ago

I don't see anyone mentioning text files intended for use with the template package.

How will that work?

myitcv commented 5 years ago

@coolaj86 - ~see https://github.com/golang/go/issues/26366#issuecomment-405683150; Russ' response about C files applies to template txt files too.~

Another (in my view preferable) alternative to this is to use something like https://github.com/jteeuwen/go-bindata in order to make the contents of those text files effectively a first class dependency citizen.

Also, just to note that this issue is closed so I only saw this by chance; others will likely not see this exchange. Please create a new issue if you think something has been missed in this thread.

myitcv commented 5 years ago

Russ' response about C files applies to template txt files too

Sorry, I replied too hastily there. Russ' comment about C files does not apply to other non-Go or C files. The second part of my response (using go-bindata or similar) should however be correct and considered the right way to fix this.

jackielii commented 5 years ago

Sorry, but the model for go builds is that what's needed to build a package is in that directory, not subdirectories. The way gen2brain/aac-go is structured, modifying the C source code in the subdirectories will not trigger a rebuild. Instead your build will use stale cache entries. The C code needs to be moved up to the package directory to make it build properly, not to mention vendor properly.

If anyone wants to build a tool that adds to the vendor directory after go mod -vendor, you could do that by reading the modules.txt file and using commands like go list -f '{{.Dir}}' to find out where each package was copied from.

following @rsc's hint, I managed to make the following "hack" in my Makefile:

versionPath=$(shell go list -f {{.Dir}} github.com/gogo/protobuf/version)
gogoprotos=$(versionPath:version=protobuf)
gogoprotosTarget:=./vendor/github.com/gogo/protobuf/

.PHONY: vendor
vendor:
    @go mod tidy -v
    @go mod vendor -v
    @go mod verify
    # work around to save proto files in gogoproto
    @mkdir -p $(gogoprotosTarget)
    @cp -rf --no-preserve=mode $(gogoprotos) $(gogoprotosTarget)
karalabe commented 5 years ago

Sorry, but the model for go builds is that what's needed to build a package is in that directory, not subdirectories.

Cute idea. I challenge you to use Tor inside a Go program without pre-building it and dumping the shared library inside pkg for Go to pick it up. This would be a nice example to what devs out in the wild might actually need.

siennathesane commented 5 years ago

I found this issue after strugging with https://github.com/go-goracle/goracle/issues/155.

@bcmills the use case is github.com/go-goracle/goracle needs the odpi directory, which contains cgo header and implementation files, but it's not referenced as a package outside of the cgo comments requiring those header files. I'm not going to use git submodules + go modules, because then I have to keep multiple versions of things in sync, or use a custom build script, which I'm not interested in creating or maintaining.

The frustration on my end is that I feel like I'm being punished for using go mod -vendor. How am I supposed to keep my dependency's dependency in sync with my project when there's no clear, simple, concise way for that to happen, or even understand that it was missing in the first place?

boreq commented 4 years ago

The same problem exists if one tries to use Go modules for pulling in a dependency which specifies its interface as a set of *.proto files. There is no way to use those in your project as they will not be included in the vendor directory. Right now the solution seems to go back to writing bash scripts.

bcmills commented 4 years ago

@boreq, the Go source files generated from .proto files are supposed to be committed to version control.

Per https://golang.org/s/go1.4-generate:

once things are settled, the author commits the generated files to the source repository, so that they are available to clients that use go get

boreq commented 4 years ago

@bcmills Compiling .proto files can depend on .proto files from other sources though.

metalogical commented 4 years ago

Comments above from Russ seemed to indicate:

The C code needs to be moved up to the package directory to make it build properly, not to mention vendor properly.

And (with emphasis added):

modifying the C source code in the subdirectories will not trigger a rebuild. Instead your build will use stale cache entries.

Also, the CGO documentation now includes (also with emphasis added):

Note that changes to files in other directories do not cause the package to be recompiled, so all non-Go source code for the package should be stored in the package directory, not in subdirectories.

This is notably in conflict with how TensorFlow includes C headers from its Go bindings: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/go/attrs.go#L20

Since the C headers are shared, the only way to put them in the Go package directory would be to duplicate the header files, which seems like a poor solution. Is there a better way? I'm happy to fork TensorFlow to make it vendor properly if necessary.

veqryn commented 4 years ago

@mxplusb @fedepaol and anyone else coming here from the future: I've solved this problem for goracle / godror by adding dummy golang files to the directories containing external C source code. Then to ensure that go mod tidy doesn't remove them, you require each of those packages from the root package. Lastly, you put build tags on the dummy golang files so that you don't encounter the error C source files not allowed when not using cgo

You can see it done here: https://github.com/godror/godror/pull/38/files

siennathesane commented 4 years ago

@veqryn thanks! I do believe it still doesn't address the root of the problem but it does help. :)

donmiguelenriquez commented 3 years ago

I can't believe the proposed solution to this issue is to write another tool to finish what go mod vendor starts. Either write a tool to replace it or fix it so it copies everything. What is the reason for only vendoring half of the repository?

any news here ? this issue is blocking ...

nomad-software commented 3 years ago

This issue is now two and a half years old and it affects a lot of people. Has there been any movement whatsoever on this problem?

myitcv commented 3 years ago

This issue is closed, and has been for almost 2.5 years. People don't generally follow closed issues, so please open a new issue if you feel there is more information available here. Thanks

donmiguelenriquez commented 3 years ago

The issue the way it is currently seems to be very well explained and valid. @rsc closed it back then asking for someone from the community to fix it ? but its quite bizarre, because go mod was supposed to fix dep, but this feature actually works well on dep, but there is no clear way to make it work with go mod. I dont understand what is the point to make everyone move to go mod, when it is actually not feature complete. For now, I managed to solve it with the work around of using (https://github.com/nomad-software/vend) but in reality, this is something that should be fixed here.

radu-munteanu commented 3 years ago

Wow, two years have passed and still no proper vendoring with the go mod. At this point, I think it's a political decision from Google. People want full repo dependencies copy into the vendor directory, I think it's trivial, yet we still don't have it.

nomad-software commented 3 years ago

@radu-munteanu Try this: https://github.com/nomad-software/vend

zscole commented 3 years ago

lol this is a 2.5 year old issue. why does everything have to suck?

thomasmodeneis commented 3 years ago

this still an issue, its crazy how we still need to use vend to solve the package issue....

hrissan commented 3 years ago

We use go mod vendor with a CGo module that references .C and .H files stored in its subdirectory, and they are not copied by the tool.

Perhaps, tool could support an option in go.mod file to list modules that should be fully copied, instead of stripped?

thepudds commented 3 years ago

@hrissan Partially re-using an older comment:

If you are hitting this with cgo, and if we set aside vendoring for a moment, there is a decent chance you are also having a more fundamental problem even just during non-vendor-based building/recompiling due to the behavior described here:

https://golang.org/cmd/cgo/

When the Go tool sees that one or more Go files use the special import "C", it will look for other non-Go files in the directory and compile them as part of the Go package. ... Note that changes to files in other directories do not cause the package to be recompiled, so all non-Go source code for the package should be stored in the package directory, not in subdirectories.

Does that seem like it applies to you?

rezaalavi commented 3 years ago

We have a legal requirement to add the license file of some packages to their vendor directory. Unfortunately just copying go files and not having an option for other files can become a serious issue.

jcchavezs commented 2 years ago

Any chance we can restart this discussion? A flag to declare assets for a package would make sense? I guess something similar happens when using the embed annotation.