Closed cboddy closed 5 years ago
FYI this link appears to be broken https://github.com/ipfs/community/blob/master/contributing.md#contribution-guidelines
Thanks! You should be able to add a sharness test which hijacks the request the request using netcat as is done in https://github.com/ipfs/go-ipfs/blob/master/test/sharness/t0235-cli-request.sh#L14.
@magik6k thanks for the swift feedback; have started and will munge that script into an sharness test as suggested.
So, I believe this should actually be using the ReverseProxy helper. That will copy and modify all the right headers and handle all the edge-cases. However, that may just make it more complicated so take this as a suggestion. (You won't be able to use the NewSingleHostReverseProxy
helper, you'll have to implement a custom RoundTripper
.)
I believe we also need to read while writing. That is, the current version expects to write the entire request before it reads any response. We should start the request and then read the response as some worker finishes writing the request.
Note: We can probably fix these later if they turn out to be difficult to get right.
@cboddy @Stebalien this might be a little late in the PR cycle to suggest but, if I understand correctly, @hsanjuan has built go-libp2p-http
for this use case. It should simplify the implementation.
Hello, yes, https://github.com/hsanjuan/go-libp2p-http will probably simplify this (along with httputil.ReverseProxy
). go-libp2p-http provides the libp2p transport (RoundTripper
implementation) that you are looking for the ReverseProxy
. If I'm correct you just need to initialize it with libp2p://peerid
as target URL and manually set the transport. At least, it would not add another http RoundTripper to the libp2p world.
The RoundTrip
in go-libp2p-http is simplified if you compare it to all the default transport's implementation (https://golang.org/src/net/http/transport.go?s=3628:10127#L385) but it's well tested and I think it should be solid enough (used by cluster for a while). It supports reading responses while sending the requests (for streaming requests).
The only problem would be to attach a custom transport tag to every Roundtripper. We can easily enable this upstream (currently a single tag is used for all), if you are willing to go down this path.
@Stebalien @lanzafame thanks for the feedback.
Will look into using ReverseProxy
as suggested (most of the go stdlib still a bit new to me) and try to push something today/ in the next few days.
fyi, protocol tags per Roundtripper coming about: https://github.com/hsanjuan/go-libp2p-http/pull/12
One more thing we'll need to do before merging this: put it behind an experimental feature-gate. Take a look at how the ipfs p2p
command does this (using config.Experimental.Libp2pStreamMounting
).
I don't really see this needing a long stabilization period, but we should probably leave it experimental at last until we stabilize the ipfs p2p
feature...
I've tested out all our api calls for Peergos through this, which includes gets, posts, and multiparts. And everything works!
@hsanjuan We can't use the RoundTripper in go-libp2p-http because it assumes a different incoming request format, and we don't want to do an extra socket+http round trip. We used your technique to close the stream though.
@Stebalien Does that mean we need a PR to go-ipfs-config to add a new config option and then gx publish that before this can be merged, or can we use the existing Libp2pStreamMounting option?
@hsanjuan We can't use the RoundTripper in go-libp2p-http because it assumes a different incoming request format, and we don't want to do an extra socket+http round trip. We used your technique to close the stream though.
That's not true. You just need to rewrite the path, which you are already doing anyway.
I can plug in libp2p-http for you if that's the only problem.
@hsanjuan If we used go-libp2p-http would that not imply an extra http round trip? Here we receive the incoming request and directly write it to the p2p stream. But my understanding (correct me if I'm wrong - I'm not a Gopher) is that if we rewrote it to use go-libp2p-http we would then be passing through another socket and http parser before getting to the p2p stream?
@hsanjuan If we used go-libp2p-http would that not imply an extra http round trip? Here we receive the incoming request and directly write it to the p2p stream. But my understanding (correct me if I'm wrong - I'm not a Gopher) is that if we rewrote it to use go-libp2p-http we would then be passing through another socket and http parser before getting to the p2p stream?
@ianopolous no, what go-libp2p-http does is exactly what you are doing here. The only difference is that it works with a standard go http client by expecting urls in a way that the standard http library can handle so that you can use a the normal Go's http.Client
with both http and http-under-libp2p.
@Stebalien Does that mean we need a PR to go-ipfs-config to add a new config option and then gx publish that before this can be merged, or can we use the existing Libp2pStreamMounting option?
Let's add a new experiment, it's a pretty orthogonal feature (only using stream mounting on the client). I'll handle the gx update (yeah, I know it's a bit of a pain).
@hsanjuan mind making a PR against this PR? That'll make it easier to see what the change entails.
@Stebalien here: https://github.com/cboddy/go-ipfs/pull/4/files
By the way, let's not forget to document this. A section in the https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md doc. I don't know where it should be documented once it's not experimental though (here probably: https://docs.ipfs.io/reference/api/http/ ?).
This is waiting for a gx bubble up from https://github.com/ipfs/go-ipfs-config now.
@hsanjuan Do you know if this build issue is a gx import issue or something else? (It builds fine locally) https://ci.ipfs.team/blue/organizations/jenkins/IPFS%2Fgo-ipfs/detail/PR-5526/3/pipeline/
core/corehttp/proxy.go:36:43: cannot use ipfsNode.P2P.PeerHost (type "gx/ipfs/QmQKod7iLxQK6X3aFYvhDDnFdXo3QjxKeL2F7UrPtdKQR2/go-libp2p-host".Host) as type "gx/ipfs/QmeA5hsqgLryvkeyqeQdvGDqurLkYi3XEPLZP3pzuBJXh2/go-libp2p-host".Host in argument to p2phttp.NewTransport:
Check out: https://ci.ipfs.team/blue/organizations/jenkins/IPFS%2Fgo-ipfs/detail/PR-5526/7/pipeline/16
You have some duplicate gx dependencies imported which would explain the problem you are having.
@ianopolous you'll need to rebase this to go-ipfs/master and then gx update go-libp2p-http github.com/hsanjuan/go-libp2p-http
.
rebased this onto master..
@hsanjuan I've tried running "gx update go-libp2p-http github.com/hsanjuan/go-libp2p-http" after rebasing, but get: updating go-libp2p-http to version 1.1.4 (Qmcb1uGjGVgU6riTwg9HvaSq3TtpDdF95Md3TogbcTRyag) ERROR: update sanity check: package client_golang (QmYYv3QFnfQbiwmi1tpkgKF8o4xFnZoBrvpupTiGJwL9nH) not found
maybe you needed to do gx install --global
first? @ianopolous
@Stebalien How do we fix the dupes: https://ci.ipfs.team/blue/organizations/jenkins/IPFS%2Fgo-ipfs/detail/PR-5526/14/pipeline Is it just that the latest versions of things need to be gx bubbled up through all the things that use them?
@ianopolous I think you need to rebase in master and re-update go-libp2p-http to the latest (which uses libp2p 6.0.19 in line with the rest of go-ipfs/master)
Ok I've rebased and gx updated for the third time. Hopefully we can merge this soon.
@Kubuxu or anyone: Any chance you could give me a hand fixing gx et al to get this merge-able? Many thanks.
@ianopolous If still doesn't work, if you can give me write access to cboddy
I can try to fix the branch.
Note: Don't bother with rebasing when you're done, I'll handle that (I'm about to pull though a massive gx update...).
Thanks @Stebalien @hsanjuan , yes I noticed last night that the multi-part test seems to fail at a rate of about 1 in 10 attempts.
So far it appears that when it fails it is because the nc
server receives a truncated request (everything after the first mutipart-boundary is missing). Not 100% sure what is going on yet; looking into it now.
I need to bubble libp2p-6.0.20 to go-libp2p-http. Will do this tomorrow.
So far it appears that when it fails it is because the nc server receives a truncated request (everything after the first mutipart-boundary is missing). Not 100% sure what is going on yet; looking into it now.
Oh. that issue. Yeah, we hit that in our other tests as well. I believe we just need to implement a custom tool for testing HTTP things in go.
I have a PR here to cleanup the tests a bit: https://github.com/cboddy/go-ipfs/pull/5. This uses IPTB and ensures configures the test nodes to not connect to the rest of the network (i.e., don't enable discovery, don't bootstrap).
So, we have a bit of an issue. Go's ServeMux automatically "cleans up" URLs so it'll helpfully:
Even if we choose a different escaping scheme for protocols, go will still end up rewriting the rest of the URL for us and there's no way to avoid that (as far as I can tell).
Options:
/http/proxy?peer=xyz&proto=/libp2p-http&path=/a/b/c
. This breaks browsing web pages but, really, we're using this to proxy requests.I'm pretty keen not to break browsing a website. That is our second most important use case after a single request. I also hope to use it personally to host arbitrary websites without requiring dns or indeed a public server.
@Stebalien Can you give some concrete examples where the encoding is a problem? I'd still fallback to restricting the character set for protocol names, seems like you're set on allowing "/"'s so maybe ban "-"'s (or some other non slash chartacter) and then we can encode them that way?
Hmm it's worse than I thought, if I type the following into a browser it undecodes, before even sending the request! So this: localhost:5001/proxy/http/QmaCZqbmBP5u2435pmBT8x5Ks6ZiWNg66ET4ctZNKmw78K/%2fx%2ftest/ becomes localhost:5001/proxy/http/QmaCZqbmBP5u2435pmBT8x5Ks6ZiWNg66ET4ctZNKmw78K/x/test/
which clearly doesn't work for protocols with "/"'s in them
@Stebalien thanks for refactoring the tests. Which tag of iptb
are you using/is expected? The flags used don't seem compatible with iptb
's HEAD.
@Stebalien I've reverted the url decoding because it isn't going to work. Can we merge this and move the protocol name discussion to another pr?
Which tag of iptb are you using/is expected?
See: https://github.com/ipfs/go-ipfs/issues/5521. I'm using 1.3.19
I've reverted the url decoding because it isn't going to work. Can we merge this and move the protocol name discussion to another pr?
It's worse than that, unfortunately, as we'll also redirect/rewrite the http path. We're in a release freeze at the moment so we can think this over for a few days. I'm find merging something that gets us most of the way there but I prefer not merging API changes that we intend to break in the immediate future.
@Stebalien How long until the next release? I was super hoping to make this one. :-(
I'm find merging something that gets us most of the way there but I prefer not merging API changes that we intend to break in the immediate future.
This version only works for protocol names without /'s. Unless we remove the protocol from the path entirely, then this shouldn't change for such protocol names. All we would be doing is expanding support to include more protocol names.
So, thinking about hosting websites from this API for a minute, I realized something rather important: We can't expose this from the API for security reasons (XSS).
That means we can either:
Personally, I think exposing it on the gateway will be the easiest.
WRT encoding, names, etc. I'd like to propose a solution that I think will make everyone happy:
/p2p/$ID[/x/$PROTOCOL]/(http|ws)[$PATH]
/x/$PROTOCOL/(http|ws)
./libp2p-ws
or /libp2p-http
(@hsanjuan any chance we can change these to /http
and /ws
? That is, make cluster also listen on /http
?).Examples:
Rational:
libp2p://Qm...
and libp2p://$PROTOCOL.Qm....
to preserve origins. Really, slashes are always going to be an issue for us if we want to support cases like this.@cboddy @hsanjuan ~@hacdias~ (sorry) @ianopolous opinions?
Regardless of how it may seem, I really don't want to draw this out indefinitely. I'm fine merging this as-is (on a custom port or as a part of the gateway, behind a feature flag) as long as we agree to revisit this and work under the understanding that the interface is unstable and will likely change.
@Stebalien Thanks for your feedback.
I was planning to expose it on the gateway as well, in a separate pr. So moving it there is fine by me.
I like the proposed path redesign. We don't care about the specific protocol name and will only ever use one so having the option to omit it is nice. To be clear though, doesn't this imply you are banning using /http in the (explicit) protocol name, as you are using that to determine the end of the name in the path?
WRT encoding, names, etc. I'd like to propose a solution that I think will make everyone happy:
/p2p/$ID[/x/$PROTOCOL]/(http|ws)[$PATH]
This means we need an additional libp2p-ws
for websocket proxying?
@hsanjuan any chance we can change these to
/http
and/ws
? That is, make cluster also listen on/http
?).
That should be no problem.
LGTM. Good catch with the XSS, it was rather obvious and we almost missed it :/
To be clear though, doesn't this imply you are banning using /http in the (explicit) protocol name, as you are using that to determine the end of the name in the path?
Actually, I'm banning slashes entirely. I wonder if that's a bit confusing... Basically, I'm saying that protocols must be of the form /x/CUSTOM/http
or just /http
. The idea is that custom applications would listen on /x/my-app/http
(or /x/my-app/ws
).
Eventually I expect this'll be done through a libp2p daemon and applications will (a) register sub-peer and then (b) listen on /ws
or /http
. But we're not there yet (except for custom libp2p apps like cluster).
This means we need an additional libp2p-ws for websocket proxying?
Someday.
Actually, I'm banning slashes entirely
It's fine. This is just for applications that want to be accessible by default using a default reverse proxy exposed in the gateway endpoint of the go-ipfs daemon. It's a very specific usecase for which an extra protocol handler that satisfies the limitations can be just added easily. You can always manually mount these streams with whatever custom protocol in the worst case (now and in a future libp2p daemon).
@Stebalien I've made those changes to the path spec, and moved the end point to the gateway.
This implements an http-proxy over p2p-streams, for context see https://github.com/ipfs/go-ipfs/issues/5341.
This script is a useful test of the functionality. In case it causes portability issues I've not included it as a sharness test (since it uses python to serve HTTP content, although happy to add it since python be ~ as available as bash).
(inline since GH doesn't support
*.sh
as an attachment)