Open bcmills opened 5 years ago
A way to explicitly populate the module cache from source might also help in cases where the original source path is blocked or unavailable but the code is available from a trusted mirror (as in #28652).
We have `go mod download` to add to the module cache. We have `go clean -modcache` to clear it. Do we really need more fine-grained control? I fear that will make people manage it more.
Never mind, I didn't understand the problem being solved.
Change https://golang.org/cl/153819 mentions this issue: cmd/go/internal/modfetch: skip symlinks in (*coderepo).Zip
Ping @bcmills to summarize our discussion from 2 weeks ago about alternatives to meet the need you are trying to address here.
Change https://golang.org/cl/153822 mentions this issue: cmd/go/internal/modfetch: skip symlinks in (*coderepo).Zip
On the Linux distribution side, you need almost the same thing, with a little tweak: deployment and indexing need to be separated. The whole process is:

- `go mod pack --version version [PROJECT_DIR] STAGING_DIR` (of course it is unfortunate that you need a separate version argument and it is not already present in the go.mod file; `PROJECT_DIR` defaults to `.` and must contain a `go.mod` file).
- `go mod reindex PROXYDIR` (though since `PROXYDIR` will usually be standardised, it should be read from a system configuration file or an environment variable, not specified every time a reindexing needs to take place).

There is no concept of cleaning up the module cache, since all files are supposed to be associated with a single system component, so the system manager knows how to clean them up without help. I suspect this part won't map well onto the proxy protocol as defined today, since some files are shared between different versions of the same module (but `.so` file symlinks are pretty much the same mess, so that should be manageable with a few hacks).

Lots of Linux subsystems, from python to fontconfig, behave this way today; it's a proven deployment design pattern that is easy to integrate system-side.
@nim-nim, there is no “indexing” step in the module cache. Either the requested version is there, or it isn't.
@bcmills Then how is `$GOPROXY/<module>/@v/list` supposed to be generated?
You can `go mod pack` mymodule version x.y.z in system component golang-mymodule-x.y.z, which will contain:

```
$GOPROXY/mymodule/@v/x.y.z.mod
$GOPROXY/mymodule/@v/x.y.z.info
$GOPROXY/mymodule/@v/x.y.z.zip
```

and then you can `go mod pack` version a.b.c in another system component golang-mymodule-a.b.c, which will contain:

```
$GOPROXY/mymodule/@v/a.b.c.mod
$GOPROXY/mymodule/@v/a.b.c.info
$GOPROXY/mymodule/@v/a.b.c.zip
```

So far so good: every file is nicely accounted for and the system components' on-disk representations do not clash (even though having to manage a separate info file, just because the module file does not contain the version, is annoying).

But depending on whether the user installs only golang-mymodule-x.y.z, only golang-mymodule-a.b.c, or both, `$GOPROXY/mymodule/@v/list` is not supposed to have the same content, is it? So you need to reindex `$GOPROXY/mymodule/@v/list` on installation/uninstallation of anything in `$GOPROXY/mymodule/@v/`.

In rpm terms, that would mean adding a `%transfiletriggerin` and a `%transfiletriggerpostun` on the `$GOPROXY` directory that call a go subsystem command to reindex everything inside `$GOPROXY` every time the system component manager adds or removes things in it (rpm documentation).
The module cache is a cache. I really do not want the module download cache to have manual maintenance. That was the big problem with $GOPATH/pkg and `go install`: `go install` was manual maintenance of $GOPATH/pkg. The new build cache has no maintenance, which simplifies everything and eliminates a lot of awful failure modes. We'd really like the same for the module cache.
The operation being created above is really "pretend this module version has been published, so I can build and test other modules that depend on it". It's not clear to me that that should be scoped to a whole machine (a whole $GOPATH). At the very least it seems like we need two commands:
A build should never default to using the fake-published stuff. Then you can't do two logically separate things in a single GOPATH and we're back to manual cache maintenance a la go install. That is, if I'm in the middle of testing one fake-published module 1 against another module 2 and I get an interrupt and context switch to something completely different module 3 that happens to also depend on module 1, I don't want to have no way to get back to the real world where there isn't a fake module 1 floating around. That should be the default world I'm in. Otherwise the mental load of managing this automatically-used staging area is much like $GOPATH/pkg and go install.
I can't remember exactly what @bcmills and I discussed in late Nov 2018 but I think it was some other mechanism that wasn't "the module cache" for fake-publishing. You could imagine saying "fake publish to configuration foo" and then "build with configuration foo" and even "list configuration foo". Or maybe there's just one fake-published-world per $GOPATH.
@rsc It's not fake-publish, it's using your own code, only without forcing people to put GitHub or Artifactory in the middle. In the actual real world there are lots of situations where round-tripping to GitHub just to use your own code is not acceptable. So please make this use case work cleanly without an artificial fake-publish degradation, or people will just reverse-engineer how go mod works and write their own tools you won't be happy with (they are already starting to, because modules were pushed before the tooling was finished and ready).

When you don't own your cloud like Google, when you don't have fat network pipes, when you have restricted networks because of $expensive and $dangerous factories plugged in, you don't round-trip to the Internet all the time just because it's cool at home to watch YouTube videos.
As written in the module FAQ:

> Rather, the go tooling in 1.11 has added optional proxy support via GOPROXY to enable more enterprise use cases (such as greater control)
Greater control means greater control, and people doing the stuff they want with their code without opaque cloud intermediaries.
Besides, making access to some remote VCS mandatory just to make use of some code would make Go instantly incompatible with every single free-software license out there.
@nim-nim I don't understand your response. I completely sympathize with the use case here and I spelled out a path forward that avoids the network. My use of "fake-publish" was not derogatory. I am referring to the operation of making it look locally like the module has been published even though it has not, hence "fake publish".
I am not sure about the other 2 commands in this proposal, but I think `go mod pack` is something that is going to be really needed by many developers. I know these comparisons are really frowned upon here, but in many long-established tools/ecosystems this functionality is deemed a must-have. First to mind is Maven, where you can publish an artifact to the local cache from local code.

Consider a project A that depends on library B. Often developers want to develop and publish v1.2 of both A and B at the same time. How can I import module B v1.2, which I am working on locally, into project A, which I am also working on locally? As of now (1.13b1) there does not seem to be any mechanism to achieve this without manually hacking into go.mod with `replace` and subsequently removing it from go.mod (again manually, I presume) before publishing both.
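For reference, the manual workaround being described looks something like this in A's go.mod (the module paths and the relative path are hypothetical placeholders):

```
module example.com/a

go 1.13

require example.com/b v1.2.0

// Temporary: point at the local checkout of B.
// Must be removed by hand before publishing A.
replace example.com/b => ../b
```

The proposal in this issue would make the `replace` line, and the manual cleanup step, unnecessary.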
The concept of pre-filling the module cache with a local (unpublished) module, or with a new revision not yet published, can be implemented with an external command.

Here is an implementation: https://github.com/perillo/gomod-pack. It calls `go mod download -json` with a custom environment, where `git` is configured with URL rewriting and `go` is configured with direct access and a disabled checksum database.

`gomod-pack` can only be called inside a module, and the user can only specify the version to pack. It prints to stdout the versioned module path, which the user can then use in a `go.mod` `require` directive.

The only drawback is that it only works with git.
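The mechanism described above can be sketched roughly as follows (the paths and module path are hypothetical, and the real tool assembles this environment programmatically rather than touching your global git configuration):

```
# Point git at the local checkout instead of the remote
# (URL rewriting via git's insteadOf mechanism):
git config --global url."file:///home/me/src/mymodule".insteadOf \
    "https://example.com/mymodule"

# Fetch "directly" (which now reads the local repository), with the
# checksum database disabled so the unpublished content is accepted:
GOPROXY=direct GOSUMDB=off go mod download -json example.com/mymodule@v1.2.3
```

After this, the module cache contains `example.com/mymodule@v1.2.3` as if it had been published.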
> That was the big problem with $GOPATH/pkg and `go install`: `go install` was manual maintenance of $GOPATH/pkg. The new build cache has no maintenance, which simplifies everything and eliminates a lot of awful failure modes. We'd really like the same for the module cache.
Hi, I'm coming from issue #37554 and have just read this. I had no idea that `go install` was going to become deprecated! Maybe this needs clarification in the community?
In my issue, I suggested that "go install" do the same as "go mod pack" in this proposal (and I prefer that way of expressing the command as it's the same as previous go-versions). I agree with @nim-nim as we both seem to want a fairly simple use case (local code using modules, not hitting the network), but the current implementation of modules makes this tricky to say the least.
I just finished reading this whole thread because I hit into this same issue while developing a new app for an enterprise product. I'm still very new to Go, but no provision to import a separate module that I'm developing in parallel seems like a huge oversight.
Let me try to summarize my use-case:

- The `myapp` module, which relies on the `mylib` module. `mylib` is also still under development and not published anywhere.
- A `futureapp` that would want to use `mylib`.
- All 3 are delivered as part of the same ISO for enterprise customers, so having a single repo makes version management much simpler.

As of now, there's no way for `myapp` to import `mylib` without adding the `replace` directive in the `go.mod` for `myapp`, which feels very hacky. I have to publish `mylib` separately without `myapp` and then update the `myapp` `go.mod` file to remove the `replace` directive.
Another use-case: when I'm developing a library that's used by multiple modules, I need to run integration tests for the dependent modules to make sure I'm not introducing any regressions. So now I need to change all the dependent modules' `go.mod` files and add a `replace` directive pointing to the local module.

@marystern's idea about `go install` installing the unpublished module locally in the cache sounds like a really good idea. That's how many build management systems work as well. Maven lets you build and install a jar/war file to the local Maven repo for other Maven projects to import.
@ronakg, this is not really on the topic of this exact issue but the fact that you are using multiple modules in the same repository for your use-case seems to be an anti-pattern. In general multi-module repositories are not a recommended workflow.
In your specific case (based on the information you have provided) there should not be any reason for having multiple modules. Simply put your library and your app in the same module which should be rooted at the root of your repo. And if a new app using your library needs to be created it can live in the same module & repo as well.
Modules are a dependency-management & versioning abstraction, not a feature-level abstraction. Hence if everything (the library and the binaries) are part of the same product and will be shipped and versioned in a common fashion then they can all be part of the same module without any negative side-effects. Using multiple modules would actually make achieving your goals much harder and your day-to-day development workflows much more complex.
Based on discussion with @bcmills, @jayconrod, @matloob, putting this on hold because we need to think about the higher-level issue of publishing modules at all first. This issue was primarily intended to address publishing a collection of modules that depend on each other, perhaps in a cycle or perhaps not. That's the problem to solve; reusing the module cache is probably not the right solution.
Placing on hold to come back with a different solution.
May I add that #44989 and #32976 have been marked as duplicates of this current one, but for cgo the impact of not yet having a command to "clean only one module" from the cache is heavy.

Indeed, when modifying a C/C++ source file used by cgo outside of the compiled package, there isn't an easy way to force a rebuild of the compiled package besides cleaning the ENTIRE module cache, which has a heavy recompile-time cost, especially when it's not the only cgo module in the whole project...

If you would want to argue that keeping those C/C++ source files outside of the specific compiled package directory is not good practice, please keep in mind that keeping those C source files in their own directory allows them to live in a dedicated `git-subtree` folder, and makes it easy to follow C++ upstreams and emit diff-to-upstream patches.
For a number of use-cases, it would be helpful to be able to upload modules to the module cache from source code (not just zip files!) in a local directory or repository.
Some examples:
`replace` directives).

To support those use-cases, I propose the following subcommands:
- `go mod pack [MODULE[@VERSION]] DIR`: construct a module in the module cache from the module source code rooted at `DIR` (at version `VERSION`). If the `MODULE` is omitted, it is inferred from `DIR/go.mod`. If `@VERSION` is provided, it must be a valid semantic version, and `go mod pack` fails if that version already exists with different contents. If `@VERSION` is omitted, `DIR` must be within a supported version control repository, and `go mod pack` will attempt to infer the version from the repo state (commits and tags).
- `go mod unpack MODULE[@VERSION] DIR`: download the contents of `MODULE` to `DIR`. If `@VERSION` is omitted, use the active version from the main module (if any), or `latest` if no version is active. (In contrast to `go mod vendor`, `go mod unpack` would unpack the entire contents of the module, not just the packages in the import graph of the main module.)
- `go clean -m MODULE[@VERSION]`: remove `MODULE@VERSION` from the module cache. If run within a module, also remove the corresponding entry from its `go.sum` file. If `@VERSION` is omitted, remove all versions of `MODULE` from the module cache (and `go.sum` file).

CC @hyangah @jadekler @rsc @myitcv @thepudds @rasky @rogpeppe @FiloSottile
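If these subcommands were adopted, the A-depends-on-B workflow discussed earlier in this thread might look like the following transcript. Note that the module paths, versions, and directories are hypothetical, and the subcommands themselves are only proposed here, not implemented:

```
# Fake-publish the local checkout of B into the module cache:
cd ~/src/b
go mod pack example.com/b@v1.2.0 .

# A can now require it like any published version:
cd ~/src/a
go get example.com/b@v1.2.0

# When done testing, return to the real (published) world:
go clean -m example.com/b@v1.2.0
```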