keks opened 5 years ago
So, this helps fix the issue of gx paths in the source code but, unfortunately, doesn't fix the problem of bubbling gx updates. Please take a look at my motivations here: https://github.com/whyrusleeping/gx/issues/179.
Basically, for me at least, the biggest time-sink is the lack of a dependency resolution system.
Using go modules out of the box will fix some of our biggest issues with `go get`: it'll work even if we make breaking changes (as long as we version properly).
The large missing piece is security and reproducibility. That's where https://github.com/whyrusleeping/gx/issues/179#issuecomment-408243162 comes in. Basically, for our builds, we can use a special `gx build` tool to make sure we build with the correct, audited dependencies.
Thanks for pointing me to that issue; I hadn't seen that one.
Regarding your main motivation:
> Basically, for me at least, the biggest time-sink is the lack of a dependency resolution system.
I think if we are able to use version strings like `v2.7.3-gx-ipfs-Qmb3GBFCHMuzmi9EpH3pxpYBiviyc3tEPyDQHxZJQJSxj9` and use ipfs to resolve them and fetch the code, we can have the dependency resolution features of go and the security features of ipfs.
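As a sketch, a `go.mod` pinning a dependency via such a gx version string might look like the following (the module path is hypothetical, and note that Go's semantic import versioning would normally also require a `/v2` path suffix for a v2.x version, which is glossed over here):

```
module github.com/ipfs/go-ipfs

require github.com/ipfs/go-ipfs-cmds v2.7.3-gx-ipfs-Qmb3GBFCHMuzmi9EpH3pxpYBiviyc3tEPyDQHxZJQJSxj9
```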
I believe it also fulfills the reproducibility requirement, though I'm not exactly sure what you mean by that. For example, when building go-ipfs with dependencies carrying gx versions, the set of module version candidates is fixed, so minimal version selection will always return the same build list. On the other hand, it might happen that go-ipfs overrides the version of a transitive dependency, so library modules might be used with different dependency versions than they were developed with. But as far as I can tell, this is required to avoid having to publish a new release for every transitive dependency just to bubble up an update of a dependency leaf, so that's a feature.
I am not 100% sure how go mod behaves when all build candidates are prereleases (i.e. `v1.2.3-xyz`), and I'm not sure we can abuse semver in this way. But I'd be up for building a GOPROXY-compatible gx module server to try that out.
One caveat here: `go get` will prefer non-prerelease versions, so if we let the server resolve non-gx modules, we'll need to disable that for modules that we have gx candidates for. Otherwise go might use those.
ping @Stebalien
> I am not 100% sure how go mod behaves when all build candidates are prereleases (i.e. `v1.2.3-xyz`), and I'm not sure we can abuse semver in this way. But I'd be up for building a GOPROXY-compatible gx module server to try that out.
Yeah, my worry here is that go will treat this kind of version as an "opaque" version and not do semver-based dependency resolution. However, if that still works, this would be awesome.
My only remaining concern is that we can't really force everyone that depends on our stuff to use gx. We may be able to get them to at least install it (as long as they never have to deal with it directly or use it directly in their projects) but we'll have to be careful about that.
The nice thing about the "lockfile" approach is that it's entirely independent. It allows us to specify dependencies in the language's native dependency system while still allowing users to "opt-in" to the guarantees that gx provides.
@keks I've been discussing this with @travisperson and @whyrusleeping and we're currently planning on going with the lockfile for the independence reason. For us, this means:
@Stebalien I'm not 100% clear on what you mean by the lockfile approach. I know npm's package-lock.json and assume you want to generate a file like that for the concrete package manager in use. I don't see how we can use that in a straightforward manner, so I searched the gx repo for issues mentioning lockfiles, but that wasn't very fruitful. I have some rough ideas on how that could work, but I'm not sure. I'm especially curious whether we'd still have gx hashes in import paths. Anyway, let me respond to the five points you brought up.
Basically, do what `gx install` did until now, except we have to fetch the modules to `$GOPATH/src/mod/...` instead of `$GOPATH/src/gx/...`. Also, we would be able to let the user use a public gateway for fetching the sources, so we would even let users install gx dependencies without them having any gx or ipfs on their computer.

I mean https://github.com/whyrusleeping/gx/issues/179#issuecomment-408243162. That is, we somehow put all the package's dependencies (including the transitive ones) in a single file. The format would be something like:
```
{
  "dependencies": {
    "github.com/ipfs/go-ipfs-cmds": {
      "path": "/ipfs/QmId.../go-ipfs-cmds"
      // Space left in case we need additional fields
    }
  }
}
```
We can then install (without rewriting) for development by creating a vendor directory and symlinking. E.g., symlink `./vendor/github.com/ipfs/go-ipfs-cmds/` to `$GX_CACHE/ipfs/Qm.../go-ipfs-cmds` (or literally to `/ipfs` if it's mounted).
To build with rewriting (to get gx paths in stack traces), we'd copy everything, including the current package, into a temporary directory, rewrite the import paths according to the lockfile (it's literally a rewrite map), and then build.
As you pointed out, the "somehow put all the dependencies into the lockfile" part is the interesting one. From the user's standpoint, they'll run some command (`gx sync` or `gx lock`) to build the lockfile. My plan is to use vgo eventually, but for now: `go get` (i.e., go-get into an empty gopath, import everything into gx, build the lockfile). This should be very simple.

On the other hand, we won't remove support for basic `go get` and gx's current `package.json`.
Basically, we're separating dependency resolution from building and making dependency resolution pluggable. We'll (carefully) do dependency resolution once when updating, lock these dependencies into the "lockfile", and then use gx to actually install dependencies and build. In most cases, users will be free to entirely ignore gx and use their chosen dependency resolution tool when building.
Note: The plan is to never check-in gx-re-written paths. That's just annoying for everyone.
I have a counter-proposal for integration with Go modules.
Another feature introduced with modules was `GOPROXY`. Maybe changing `gx` to be a go modules proxy that reads/writes to IPFS would be a better integration point?

I imagine that you run `gx proxy /some/ipfs/prefix` and it serves all dependencies from that prefix; if a file is missing, it fetches it from the network and writes it to IPFS under the same prefix. Details may vary, but this would allow `go-ipfs` to use the standard way of managing Go packages (modules) while giving the ability to store all of the dependencies in IPFS.
@dennwc I like that direction. I think it's worth quickly sketching out how that might work.
Using `GOPROXY` is a really cool idea. It's unfortunate that the download API doesn't include the tree/commit hash; that might have allowed a completely transparent IPFS proxy to be built. Maybe it's worth trying to convince the go team to add this (if it's not already too late)?
It's definitely not too late, since Go 1.12 will be released in February with modules enabled by default. Until then there is some time to discuss small changes to modules.
Can someone create an issue in the Go repository and describe what exactly is needed?
The 'transparent proxy' idea is also based on the assumption that the hashes in the `go.sum` file are usable to look up data on IPFS. Sadly, I'm not deep enough into either project to be able to answer that right now.

If this assumption doesn't hold (which seems likely), one way of building @dennwc's original idea could be to build a storage backend or fetcher for gomods/athens. This offloads the work of building and maintaining the proxy itself. However, it would still have to store the mapping of package versions to IPFS hashes somewhere.
Yes, a storage backend might be an even better integration point.

For hashes, it seems like the simplest way would be to maintain an IPFS root for the dependencies of a project, for example `/project-org-hash/github.com/whyrusleeping/gx/v1.0.0/...` or whatever works for Athens.
@keks @Stebalien Do you have any comments on this? Will a proxy solve mentioned issues for you?
Hey @whyrusleeping, here is my extended sketch of how gx could benefit from go modules. Also pinging @Stebalien because you wanted him to also look at this. I hope you like it!
Go 1.11 will introduce go modules, which were proposed in Russ Cox's Go & Versioning series and prototyped in the `vgo` fork. Modules impact the development workflow in a number of ways, and open new possibilities to streamline that of `gx`. I want to build `gx-gomod` and teach `gx` modules. Specifically, I want to get gx hashes out of .go files, which also simplifies things like automated rewriting.

The most obvious change is the new per-module `go.mod` file, which is used to track dependencies and their versions. It allows three types of declarations: `require`, `exclude` and `replace`.

The structure of this proposal is as follows. First, we will look into what we can and can't do with the `go.mod` file. Then, we will look into how `go get` fetches modules, and how we can make it fetch them from gx. Finally, we will look into how to build gx-gomod projects when the user has neither ipfs nor gx installed.

go.mod
The go.mod file is used to specify the versions of the dependencies of a project. It allows three types of declarations:

- `require import/path v2.0.2`: get that import at version 2.0.2. Note that the minimal version selection algorithm will try to find a version that works for all packages that import this path, so maybe we end up with 2.0.0.
- `exclude import/path v2.0.0`: make sure that this build will not use version v2.0.0 (maybe it contained a critical security vulnerability).
- `replace import/path v2.0.2 => some/other/import/path`: let files that import "import/path" use the code from "some/other/import/path". Note that the right-hand side path may also carry a version specification.

Note that each of these carries a version. go.mod actually has a detailed specification of what a version string is. Go basically uses semver, but introduces pseudoversions. These look like `v0.0.0-<committime>-<commithash>` and are also valid semver strings.

This approach can be adopted for gx by using version strings like `v0.0.0-gx-ipfs-Qm...`. This allows us to treat a gx hash not as an import path but as a version, which is a much better way to think about it. Unfortunately, go's automatic version finding algorithm will not work with these, because they are all pseudoversions. One way to tackle this is to have a list of versions and where to find the code behind an ipns hash. That way we have an import path that remains the same, and new versions can be amended.
However, I'm not sure if this kind of mutability is wanted or not.
Download Protocol
Currently, most go modules are hosted on github, and the developers of the go tools have hardcoded a good way to get the data they need from there. However, they also provide an interface that can be implemented by others. This allows building module caches or distributing private go modules inside a company's network. It also allows us to resolve the pseudoversions and deliver the modules from gx.
There are two ways to make `go get` use that protocol. The first way is like vanity imports for modules and requires including a go-import meta tag in the HTML returned when querying `https://mydomain.com/import/path?go-get=1`. The second one is used when the `GOPROXY` environment variable is set. For example, we could set it to `http://localhost:8060/` and run a cache proxy on that port. Instead of a cache proxy, we could however also resolve packages through a non-standard channel, such as gx/ipfs.

The download protocol itself is simple and HTTP-based. It is described in the section "Download Protocol" of part six of the G&V series (sorry, no fragment link):
Well, that seems simple enough to implement! If we run a server like that locally to resolve our gx modules, all we need to do is
But what do we do about traditional imports? It would be nice to fetch those the traditional way and forward them to the client. This sounds simple, but there are some problems. Let's look at the obvious solution: if it's a gx hash, deliver the code from ipfs, else do what `go get` would usually do and deliver that. The problem here is that "what `go get` usually does" lives somewhere in `cmd/go/internal/...`, so we can't just import it but have to fork it, which means a lot of maintenance effort. An alternative is to redirect the request to a well-known proxy or registry. For efforts in this direction, look at gomods.io. The people at gomods also plan to fetch source code from github et al., so maybe we can reuse their code for that. However, I haven't looked at it long enough to say whether that would work. They have community calls, so if we decide to go down this road, we can just chat with them and see whether they have a good idea.

Bootstrap is a PITA

What I find especially interesting about this solution is that we can run a public gx module proxy. That way it is dead simple to download, build and use gx modules (like ipfs) without even having gx or ipfs installed, by using `GOPROXY=https://goproxy.ipfs.io go get`. Once everything is installed, the user can switch to a local `GOPROXY`.

Furthermore, it might be possible to apply as a backend for gomods. That way, we wouldn't have to host that public proxy ourselves; gomods would use ipfs to fetch the code and then host it for us. I'm not sure they are interested in something like this, though: considering that go modules are all about semver and we only use pseudoversions, I can imagine that they are not very excited about it.