whyrusleeping / gx-go

gx subtool for golang
MIT License

make paths compatible with 'go get' #2

Open hackergrrl opened 8 years ago

hackergrrl commented 8 years ago

roadmap

It'd be a huge win for gx if the paths it generated were compatible with vanilla go get: this would keep gx projects working for everyone outside of the ecosystem, and not break downstream vanilla dependents.

how?

go get hard codes nice-looking path support for special domains (github, bitbucket, googlecode), but any others need to use a slightly more awkward syntax:

import "example.org/repo.git"

The suffix (.git) denotes the VCS used.

public gateway

go get requires a centralized domain for retrieval: the http://ipfs.io gateway is a simple and reliable candidate for distribution.

gx-go path rewriting

We can rewrite paths to take the form

import "ipfs.io/ipfs/QmdQbpEKwuiZj796Per1eD4AJmXRPSoWH5CMHL64HkREw1/go-multihash.git"

Which is roughly as readable as the current paths, and still puts all gx repos inside a consistent location ($GOPATH/src/ipfs.io/ipfs/) that strongly suggests it's ipfs-related.

If we wanted stronger namespacing, we could add gx to the path (so, ipfs.io/ipfs/Qmfoobar/gx/go-multihash.git).
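A minimal sketch of the path rewriting described above, assuming a helper that joins the gateway host, the package hash, and the package name. The function name gxImportPath is hypothetical and not part of gx-go.

```go
package main

import (
	"fmt"
	"path"
)

// gxImportPath builds a go get-compatible import path for a package that was
// published under an IPFS hash. The ".git" suffix tells go get which VCS to
// use. The function name is hypothetical; it only illustrates the proposal.
func gxImportPath(gateway, hash, pkg string) string {
	return path.Join(gateway, "ipfs", hash, pkg+".git")
}

func main() {
	fmt.Println(gxImportPath(
		"ipfs.io",
		"QmdQbpEKwuiZj796Per1eD4AJmXRPSoWH5CMHL64HkREw1",
		"go-multihash",
	))
}
```

Run as-is, this prints the same path as the import statement shown above.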

gx-go publish

To be compatible with both a) git HTTPS cloning and b) go get retrieval, an existing git-based go repo would need to be published with the following steps:

$ git clone --bare /home/noffle/go/src/github.com/jbenet/go-multihash go-multihash
Cloning into bare repository 'go-multihash'...
done.

$ (cd go-multihash && git update-server-info)

$ ipfs add -rw go-multihash
added QmWeKwYTKwBwVd7AKioXTmtpLFxTk3MBBk2ef2JtwCccAi go-multihash/HEAD
added QmTVv1kEVVoKUokZgjpXvWu6JtUhi9RVumnspttDDMCmh1 go-multihash/config
added Qmdy135ZFG4kUALkaMhr6Cy3VhhkxyAh264kyg3725x8be go-multihash/description
...
added QmdEX26zCq4jpZ5MyyKGrG3gkQYWaK5Eo644fUkkELyU9f go-multihash
added QmPsaz1Try2NRVEF37B37qmAq88NyiBooevbGk6j2G35FY

From here we should be 100% go get compatible:

go get ipfs.io/ipfs/QmPsaz1Try2NRVEF37B37qmAq88NyiBooevbGk6j2G35FY/go-multihash.git

blocking problems

Before cloning, go get has a series of fallbacks for resolving an import path. For git, it issues git ls-remote against the import path in this order:

  1. git://
  2. https://
  3. http:// (if -insecure is present)
  4. git+ssh://
  5. ssh://

Since git:// comes first, go get (via git ls-remote) will send a single request line like

006agit-upload-pack /ipfs/QmdQbpEKwuiZj796Per1eD4AJmXRPSoWH5CMHL64HkREw1/go-multihash\0host=127.0.0.1:4002\0

to the gateway, where \0 stands for a NUL byte and the leading 006a is the length of the whole line in hex. This isn't HTTP, so behaviour is undefined. Some web servers will terminate the connection once they see it; others will hang, keeping the connection open indefinitely. Unfortunately, the ipfs gateway does the latter. As a result, the https fallback is never reached.
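The request line quoted above is a git pkt-line: four hex digits with the total length, then the command and repository path, a NUL byte, the host parameter, and a closing NUL byte (the NULs are invisible in plain text). A small sketch that rebuilds it, assuming the same path and host as in the example; gitProtoRequest is a hypothetical helper name.

```go
package main

import "fmt"

// gitProtoRequest encodes the first message a git client sends over git://.
// The four-digit hex prefix counts the whole line, including the prefix
// itself and both NUL bytes.
func gitProtoRequest(repoPath, host string) string {
	payload := "git-upload-pack " + repoPath + "\x00host=" + host + "\x00"
	return fmt.Sprintf("%04x%s", len(payload)+4, payload)
}

func main() {
	req := gitProtoRequest(
		"/ipfs/QmdQbpEKwuiZj796Per1eD4AJmXRPSoWH5CMHL64HkREw1/go-multihash",
		"127.0.0.1:4002",
	)
	// %q makes the NUL bytes visible as \x00.
	fmt.Printf("%q\n", req)
}
```

The computed prefix comes out as 006a (106 bytes), which matches the line captured above. An HTTP server receiving these bytes sees no valid request line, which is why its reaction is implementation-specific.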

backwards compatibility

These changes shouldn't break old users of gx-go -- the deps they have installed form valid go import paths, so they'll keep on working locally. When they republish though they'll be able to share new go get-compatible paths.


cc @whyrusleeping @jbenet @Kubuxu

jbenet commented 8 years ago

@noffle good stuff!

ghost commented 8 years ago

Yeah good stuff! :)

I'm concerned about adding git repos:

Does go get use git for anything apart from cloning? If there's nothing else, we could create a vanilla repo with only one commit containing exactly the files we want.

Another idea, how about git-remote-ipfs? :) https://github.com/cryptix/git-remote-ipfs This endeavour would be a great opportunity to nail first-class git repos on IPFS.

modify ipfs gateway to terminate the connection when receiving a non-HTTP header (GET / HTTP/1.1)

Can't reproduce with printf | nc or telnet -- could you file an issue in infrastructure.git with info? I'll get everything fixed that's in the way of a cool solution.

hackergrrl commented 8 years ago

Thanks for the comments!

We might lose one of gx's advantages: depending on an exact version of the dependency: https://stackoverflow.com/a/6311945 -- the solution to this might be git update-ref HEAD . We'll have to check how exactly go get fetches the repo, e.g. if it takes HEAD into account, or just always fetches master, etc.

We'll be publishing the repo at a specific version (the git repo will be frozen from time of publish), so all users will get the same git repo with the same HEAD as long as they use that hash.

go get might have to fetch much more data. Again check how go get fetches the repo, maybe it's smart enough to only fetch the git objects which it needs.

Agreed, that could hurt the experience. If it becomes a problem we can make the gx-go publish step just git init a new repo and dump the full repo's contents in as a single root commit. The important part is that it's a git repo (to satisfy go get) -- it doesn't need to maintain any other history / utility.

Another idea, how about git-remote-ipfs? :) https://github.com/cryptix/git-remote-ipfs

Really excited for git-remote-ipfs! (Though unless we get a patch into go core it won't try and use this method on get, right?)

could you file an issue in infrastructure.git with info? I'll get everything fixed that's in the way of a cool solution.

Yes, definitely. Thank you @lgierth!

Kubuxu commented 8 years ago

One issue I can see is that many people will take the path of least resistance, meaning they'll skip gx install, which removes the distribution aspect of gx.

Instead of using local nodes, or even ephemeral nodes, people will just use ipfs.io, which isn't optimal.

hackergrrl commented 8 years ago

@Kubuxu: that's right. However, it's a much better experience than receiving a page of errors. It'd be nice if we could have go get be distributed transparently, but falling back on a centralized solution and letting users "opt in" to gx's superpowers is I think the best we can do here.

Kubuxu commented 8 years ago

In my opinion the best solution would be to have a way to tell users what is wrong, why, and how they can fix it, but for lack of a better alternative we will have to roll with this.

(Just knowing people, an opt-in feature that requires an extra step and doesn't give a visible improvement won't be used).

hackergrrl commented 8 years ago

In my opinion the best solution would be to have a way to tell users what is wrong, why and how can they fix

That'd be really nice to have: it'd be great if you could investigate it.

hackergrrl commented 8 years ago

Update: @lgierth and I looked at the go get public gateway blocker above and had some findings:

It looks like the git:// protocol will intentionally ask for certain files that it knows may/may not exist in the bare git repo, as part of its determination of how to proceed vs fallback to another protocol. However, we think that the gateway is returning 404s for these probe requests but multireq may be discarding the responses: it doesn't treat 404 as a valid response at the moment. Once this is in we'll see what terrible breakage we hit next. :)

Kubuxu commented 8 years ago

The best thing I can come up with is abusing the tokeniser by inserting a quoted string before the package declaration. gx-go would remove it while installing the package.

It makes go build and other go tools spit out only:

1d [kubuxu@vs1:~/go-ipfs/cmd/ipfs] master(+2/-0,16) 1 ± go build
can't load package: package github.com/ipfs/go-ipfs/cmd/ipfs:
../../../go/src/github.com/ipfs/go-ipfs/cmd/ipfs/daemon.go:1:1: expected 'package', found 'STRING' "Using pure go get is no longer supported. See ipfs.io/.... for correct installation method or if you already cloned the repo use `make workspace`"
../../../go/src/github.com/ipfs/go-ipfs/cmd/ipfs/daemon.go:3:1: expected ';', found 'package'
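
The effect can be reproduced with the standard go/parser package. This is only a sketch: the guard message is a shortened placeholder, and the exact wording of the parse error may differ between Go versions.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// guardedSource mimics a source file with a string literal placed before the
// package clause. The message text is a shortened placeholder.
const guardedSource = `"Using pure go get is no longer supported."
package main
`

// parseErr returns the parse error for src, or "" if the package clause
// parses cleanly.
func parseErr(src string) string {
	fset := token.NewFileSet()
	_, err := parser.ParseFile(fset, "daemon.go", src, parser.PackageClauseOnly)
	if err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// The parser rejects the file at 1:1 and reports the string it found,
	// so the guard message ends up in the tool's error output.
	fmt.Println(parseErr(guardedSource))
}
```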

whyrusleeping commented 8 years ago

might be worthwhile @noffle to take a look here: https://github.com/whyrusleeping/git-ipfs-rehost

whyrusleeping commented 8 years ago

@Kubuxu haha, i like that.

sheerun commented 8 years ago

Couldn't you just skip re-writing import paths and just put appropriate files in vendor, being compatible with GO15VENDOREXPERIMENT? (it goes live in go 1.7)

So you can still put the following in source files:

import "example.org/repo"

And then issue gx install to discover all packages and lock them in package.json.

Now, if someone installs with go get, everything works; and if someone wants to install the locked packages, she can issue gx install, which will download the pkgs locked in package.json into vendor.

Kubuxu commented 8 years ago

It is like that in most other repos, though not in go-ipfs, but it means that using go get will install the newest dependencies.

We don't want that: in many cases a dependency update will break some behaviour, and a user who installed go-ipfs might have different dependencies than the ones we know of and might experience different issues.

sheerun commented 8 years ago

Then maybe let ipfs daemon serve go-getable packages locally:

import ds "ipfs.local/ipfs/QmZ6A6P6AMo8SR3jXAwzTuSU6B9R2Y4eqW2yW9VvfUayDN/go-datastore"

The package will be only go-gettable if ipfs daemon is already running, and ipfs.local domain is pointing to correct local proxy endpoint that behaves the same as ipfs.io.

jbenet commented 8 years ago

why not just ipfs.io/ then, as was suggested by @noffle at the beginning?

ghost commented 8 years ago

I've been thinking about this again with whyrusleeping/gx#100 in mind, and I think it wouldn't be too hard to add a shallow bare git repo at /ipfs/Qmfoobar/mypkg.git, so that each package looks like this:

> tree $GOPATH/src/gx/Qmfoobar
Qmfoobar/
├── mypkg
│   └── [...]
├── mypkg.git
│   └── [...]
└── index.html

The index.html page could give you clone instructions :) git clone https://ipfs.io/ipfs/Qmfoobar/mypkg.git. It might even be okay to add the whole repo at the specific head, all the git stuff will be deduplicated.

Bonus points: the bare repo could have an index.html too, which just redirects to ... That way the import path in go directly translates to a browsable web page.

hackergrrl commented 8 years ago

Aw man, @lgierth -- I dig this! This way the current gx-go setup keeps working (just refer to Qmfoobar/mypkg), and go get importing works for free*!

whyrusleeping commented 8 years ago

Not quite, the ipfs gateway will need to respond to the go-get query that the go tool makes.

ghost commented 8 years ago

what does the go-get query look like? Maybe it even follows redirects and we can omit the .git suffix by redirecting there in case of the go-get query?

hackergrrl commented 8 years ago

@whyrusleeping sorry, could you give more context? What part of what I wrote are you responding to?

@lgierth a redirect would be great; we can dodge building any Go awareness into the gateway.

I saw https://github.com/golang/go/commit/932c8ddba158a91056eba87045bb6d5ddbeb39f7 but haven't dug into it enough yet to see if it's entirely relevant.

ghost commented 8 years ago

interesting, this sounds like we might just need /ipfs/Qmfoobar/mypkg/index.html with the respective meta tag?

ghost commented 8 years ago

and i figure this means we could also make go get ipfs.io work :P

paultag commented 8 years ago

Since ipfs.io isn't blessed like GitHub, users would need to use a ".git" suffix on their import path for it to work. Could we gx-go publish with mypkg.git paths and use gx-go rewrite to fix them on gx install?

You can write a meta tag go will read like:

<meta name="go-import" content="{{ .Path }} git {{ .Repo }}">

See, for example - https://pault.ag/go/debian (import name is "pault.ag/go/debian")
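
A sketch of rendering that tag with Go's text/template. The field names Path and Repo come from the template above; the goImport type, the renderGoImport helper, and the Qmfoobar values are placeholders for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// goImport holds the two values the go-import meta tag needs.
type goImport struct {
	Path string // import path prefix that go get should match
	Repo string // URL the repository can be cloned from
}

var metaTmpl = template.Must(template.New("meta").Parse(
	`<meta name="go-import" content="{{ .Path }} git {{ .Repo }}">`))

// renderGoImport fills the template and returns the finished tag.
func renderGoImport(gi goImport) (string, error) {
	var sb strings.Builder
	if err := metaTmpl.Execute(&sb, gi); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	tag, err := renderGoImport(goImport{
		Path: "ipfs.io/ipfs/Qmfoobar/mypkg",               // placeholder hash
		Repo: "https://ipfs.io/ipfs/Qmfoobar/mypkg.git", // placeholder hash
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(tag)
}
```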

hackergrrl commented 8 years ago

Thanks @paultag! Between your example and the go docs I was able to put together a script that prepares the gx-go package in such a way:

#!/bin/bash

rm -rf /tmp/gx
mkdir -p /tmp/gx

# 1. do a 'gx publish' to get the hash
OUTPUT=$(gx publish -f)
HASH=$(echo $OUTPUT | cut -d ' ' -f 6)
PACKAGE=$(echo $OUTPUT | cut -d ' ' -f 2)

# 2. 'ipfs get' it into /tmp/gx as the package name
ipfs get $HASH -o /tmp/gx/$PACKAGE 2> /dev/null > /dev/null

# 3. do a shallow bare git clone
git clone file://$(pwd) --bare --depth 1 /tmp/gx/$PACKAGE/${PACKAGE}.git 2> /dev/null

# 4. add the git repo to ipfs to get its hash
GIT_HASH=$(ipfs add -qrw /tmp/gx/$PACKAGE/${PACKAGE}.git | tail -n 1)

# 5. generate index.html that has <meta> tag
cat << EOF > /tmp/gx/$PACKAGE/index.html
<!DOCTYPE html><html>
    <head>
        <meta charset="utf-8">
        <meta name="go-import" content="gx/ipfs/${GIT_HASH}/${PACKAGE} git https://ipfs.io/ipfs/${GIT_HASH}/${PACKAGE}">
    </head>
</html>
EOF

# 6. 'ipfs add -r' the whole thing
ipfs add -rq /tmp/gx/$PACKAGE | tail -n 1

This is 99% there, but with one unfortunate caveat: go get does a GET against the path https://ipfs.io/ipfs/HASH?go-get=1. Without a trailing slash / after HASH, the IPFS gateway will return 302 Found and not output the contents of index.html. I think we'd need to modify the gateway to get around this.
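
For reference, the discovery URL mentioned above can be derived from an import path roughly like this. goGetURL is a hypothetical helper, HASH is the same placeholder as in the text, and the sketch only builds the URL without contacting the gateway.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// goGetURL turns an import path such as "ipfs.io/ipfs/HASH" into the HTTPS
// URL that go get requests when looking for a go-import meta tag.
func goGetURL(importPath string) string {
	host, rest := importPath, ""
	if i := strings.Index(importPath, "/"); i >= 0 {
		host, rest = importPath[:i], importPath[i:]
	}
	u := url.URL{Scheme: "https", Host: host, Path: rest, RawQuery: "go-get=1"}
	return u.String()
}

func main() {
	// Note that the path has no trailing slash, which is the case the
	// gateway answers with a redirect instead of index.html.
	fmt.Println(goGetURL("ipfs.io/ipfs/HASH"))
}
```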

paultag commented 8 years ago

Looking forward to being able to go get ipfs again :) :+1:

whyrusleeping commented 8 years ago

lets figure out the gateway changes we need to make for this. cc @lgierth

paultag commented 7 years ago

@whyrusleeping @lgierth: Did anyone chip away at this? I'd love to subscribe to the gateway bug if there's one

ghost commented 7 years ago

ipfs/go-ipfs#3963 is a tiny step in the right direction, go get can now correctly parse the meta tag.

Next blocker: the index.html file can never know its own hash :( @noffle's script above includes the original package hash, but the tag and import will never match: https://github.com/golang/go/blob/32d42fb6ec5421f0c64fe7f7ffec0b9e7956e1ea/src/cmd/go/internal/get/vcs.go#L650-L722
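
A simplified sketch of why the mismatch matters. It assumes the check boils down to "the import path must start with the prefix declared in the tag", which is a reduction of the logic in the linked vcs.go, not a copy of it. QmOuter and QmGit are made-up hashes standing for the published directory and the inner git repo.

```go
package main

import (
	"fmt"
	"strings"
)

// importMatchesPrefix reports whether importPath is covered by the prefix
// declared in a go-import meta tag. This is a simplification of go get's
// real matching rules.
func importMatchesPrefix(importPath, prefix string) bool {
	return importPath == prefix || strings.HasPrefix(importPath, prefix+"/")
}

func main() {
	// index.html is served from the outer directory hash, but it can only
	// embed a hash that was known before that directory was hashed.
	requested := "ipfs.io/ipfs/QmOuter/mypkg"
	declared := "ipfs.io/ipfs/QmGit/mypkg"

	fmt.Println(importMatchesPrefix(requested, declared))  // false
	fmt.Println(importMatchesPrefix(requested, requested)) // true
}
```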

@whyrusleeping did you ever ask whether someone in the go team would consider accepting a patch adding ipfs support? Maybe it'd be enough to stretch the matching rules a little bit.

ghost commented 7 years ago

ipfs/go-ipfs#4143 brings with it new origin-ized gateway URLs a la https://somehashinbase32.ipfs.link -- we should re-evaluate this index.html blocker in light of this.

ghost commented 7 years ago

Next blocker: the index.html file can never know its own hash :( @noffle's script above includes the original package hash, but the tag and import will never match:

We just chatted about this on IRC, actually it's not a blocker and pretty easily solved: if the request has ?go-get=1 set, we'll inject the requested hash into the index.html content. The gateway altering response content is problematic, but in this case it's okay because it's essentially hidden behind a feature flag that's only ever set by go's package tooling.
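
A rough sketch of that idea as a plain net/http handler. Everything here is an assumption for illustration: the {{HASH}} placeholder, the stored index content, and the handler itself are hypothetical and do not reflect the actual go-ipfs gateway code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// hashPlaceholder is a hypothetical marker the publish step would write into
// index.html in place of the hash it cannot know yet.
const hashPlaceholder = "{{HASH}}"

// storedIndex stands in for the index.html content stored in IPFS.
const storedIndex = `<meta name="go-import" content="ipfs.io/ipfs/{{HASH}}/mypkg git https://ipfs.io/ipfs/{{HASH}}/mypkg.git">`

// serveIndex returns the stored page unchanged, except when the request
// carries ?go-get=1; then it substitutes the hash taken from the request path.
func serveIndex(w http.ResponseWriter, r *http.Request) {
	body := storedIndex
	if r.URL.Query().Get("go-get") == "1" {
		// Request paths are assumed to look like /ipfs/<hash>/...
		parts := strings.Split(strings.TrimPrefix(r.URL.Path, "/ipfs/"), "/")
		body = strings.ReplaceAll(body, hashPlaceholder, parts[0])
	}
	fmt.Fprint(w, body)
}

// fetch runs one in-memory request against the handler and returns the body.
func fetch(target string) string {
	rec := httptest.NewRecorder()
	serveIndex(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(fetch("/ipfs/QmFoo/mypkg/?go-get=1"))
}
```

Ordinary requests without the query parameter still receive the stored bytes untouched, so the content-altering behaviour stays limited to go's package tooling.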

ghost commented 7 years ago

Here's what we should try:

For /ipfs/QmFoo/mypkg, the tag would look like this:

<meta name="go-import" content="dweb.link/ipfs/QmFoo/mypkg git https://dweb.link/ipfs/QmFoo/mypkg.git">

paultag commented 6 years ago

Any updates on this? It's been quite a while without go get working, can we implement a short-term fix while longer-term features are landed to make it better? Requiring users to install a custom tool to build software is hard to work around.