libp2p / specs

Technical specifications for the libp2p networking stack
https://libp2p.io

IP over libp2p #626

Open vyzo opened 3 months ago

vyzo commented 3 months ago

I want to gauge interest in developing protocols/specs for implementing IP over libp2p tunneling.

Motivation

There are use cases where private overlay networks are implemented using libp2p, yet there is a need to provide an IP network abstraction (e.g. using a tun interface) to processes running in the overlay network's userland.

For example, in the nunet network the substrate is built using libp2p, but the applications running on top of this substrate are not aware of libp2p and instead rely on the (usual) TCP/IP interface abstractions. Unfortunately, with the current suite of libp2p protocols it is quite hard to get acceptable performance, because IP packets have to go over a reliable transport with multiplexing. See https://gitlab.com/nunet/device-management-service/-/merge_requests/362#note_2040757665 for a performance analysis of the PoC in nunet. In short, it is, to put it politely, not great: at best "only" 2x slower, and it could well be 10x slower in many cases.

Call For Action

So what do we need to do? I want to specifically implement and specify protocol integrations for IP over libp2p. I am pretty sure other projects are interested in this too, but let's get the discussion started.

I think the main difficulty is the lack of packet transports in libp2p; we end up sending IP packets over a reliable transport, with multiplexing, which completely defeats the purpose, as the protocols running on top in userspace are prepared to handle packet loss and expect an unreliable network mechanism underneath.

So in order to solve the problem of "IP over libp2p" we also need to develop appropriate packet transports that let you send unreliable datagrams over a connection. QUIC already supports such functionality, so in a sense it is a matter of exposing it. This would also allow us to use a plain UDP-based protocol without the QUIC complexity, but we would still need to resolve encryption and authentication, possibly with DTLS.

vyzo commented 3 months ago

Related Work: RFC 9484: Proxying IP in HTTP

marten-seemann commented 3 months ago

> Related Work: RFC 9484: Proxying IP in HTTP

To add a little bit more context, CONNECT-IP is (conceptually and implementation-wise) very similar to CONNECT-UDP (RFC 9298), which is deployed on a massive scale by iCloud Private Relay (for example). I have an implementation of CONNECT-UDP in masque-go.

> This would also allow us to use a plain UDP-based protocol without the QUIC complexity, but we would still need to resolve encryption and authentication, possibly with DTLS.

The problem is that in addition to an unreliable way of sending data (for UDP or IP packets), you usually also want to have a (reliable) control channel. CONNECT-(IP/UDP) uses (reliable) HTTP streams to communicate where and how packets should be proxied, and then uses (unreliable) HTTP DATAGRAMs for the actual data transfer. If you use DTLS, you'd probably have to build some kind of reliability mechanism yourself.

derrandz commented 3 months ago

There is kcp, which offers reliability over UDP using some form of ARQ protocol, plus packet-level anonymization through encryption.

I attempted an initial implementation of kcp for libp2p in https://github.com/libp2p/go-libp2p/pull/2672/files if we want to revisit that.

vyzo commented 3 months ago

Well, reliability is actually a misfeature here; userspace expects datagrams to be delivered unreliably, as best effort, and has its own mechanisms for end-to-end reliability where appropriate.

MarcoPolo commented 3 months ago

Thanks for opening this vyzo. I too am interested in seeing how much interest there is. I've never been opposed to providing a packet transport, but I've been hesitant to do so without sufficient interest and actual use cases.

Linking RFC 9484 is a good call. There are some subtleties here (e.g. IPv6 requires a minimum MTU of 1280 bytes, but we'll have some overhead in our encapsulation...), and it's nice to reference a well thought out document that's already walked this path.

I wonder if we should just use RFC 9484. We can easily run an h3 server anywhere we have QUIC deployed. If needed, I think we could even support HTTP/2 with our TCP+TLS stack; however, I'd push back on that since it doesn't avoid the nested congestion control problem you get with doing IP over a reliable transport. Are there any use cases that wouldn't work if we did this per RFC 9484?
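To make the MTU subtlety concrete, here is a rough back-of-the-envelope sketch in Go. Every overhead value in it is an assumption chosen for illustration (an outer IPv6 header, a UDP header, a QUIC short header with an 8-byte connection ID and a 4-byte packet number, a 16-byte AEAD tag, a DATAGRAM frame type byte, and one byte each for the HTTP/3 Quarter Stream ID and the Context ID). Real numbers depend on the QUIC implementation and connection parameters, and the names `overhead`, `assumedOverhead` and `tunnelMTU` are hypothetical.

```go
package main

import "fmt"

// minIPv6MTU is the minimum link MTU that IPv6 requires.
const minIPv6MTU = 1280

// overhead lists the bytes wrapped around one tunnelled IP packet.
// All values used below are illustrative assumptions, not measurements.
type overhead struct {
	OuterIP         int // outer IP header (40 for IPv6)
	UDP             int // UDP header
	QUICHeader      int // short header: flags + connection ID + packet number
	AEADTag         int // authentication tag added by packet protection
	FrameType       int // DATAGRAM frame type byte (no length field)
	QuarterStreamID int // HTTP/3 datagram prefix, 1 byte for small stream IDs
	ContextID       int // context ID varint, 1 byte when it is zero
}

// total sums all encapsulation bytes.
func (o overhead) total() int {
	return o.OuterIP + o.UDP + o.QUICHeader + o.AEADTag +
		o.FrameType + o.QuarterStreamID + o.ContextID
}

// assumedOverhead is a hypothetical example configuration (80 bytes in total).
var assumedOverhead = overhead{
	OuterIP:         40,
	UDP:             8,
	QUICHeader:      13,
	AEADTag:         16,
	FrameType:       1,
	QuarterStreamID: 1,
	ContextID:       1,
}

// tunnelMTU returns the largest inner IP packet that fits in a single
// unreliable datagram for the given path MTU.
func tunnelMTU(pathMTU int, o overhead) int {
	return pathMTU - o.total()
}

func main() {
	for _, pathMTU := range []int{1500, 1280} {
		mtu := tunnelMTU(pathMTU, assumedOverhead)
		fmt.Printf("path MTU %d -> tunnel MTU %d (meets IPv6 minimum: %t)\n",
			pathMTU, mtu, mtu >= minIPv6MTU)
	}
}
```

Under these assumptions a 1500-byte path leaves 1420 bytes for the inner packet, while a 1280-byte path leaves only 1200, which is below the IPv6 minimum. That is the kind of case where the tunnel would need some other way to carry full-size packets.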

> This would also allow us to use a plain UDP-based protocol without the QUIC complexity, but we would still need to resolve encryption and authentication, possibly with DTLS.

I disagree with the premise that QUIC introduces needless complexity. We should not reinvent the wheel here. To echo Marten's statement, QUIC gives us exactly what we need here: unreliable datagrams and reliable streams.

vyzo commented 3 months ago

Agreed.

After consideration, I think the right way forward is QUIC unreliable datagrams + the suite of protocols from RFC 9484.

I am not opposed to HTTP/3 all that much, but I would prefer to do it purely with QUIC.

marten-seemann commented 3 months ago

For datagrams, HTTP/3 just provides a super thin (one integer thin, that is) wrapper around QUIC datagram, see https://www.rfc-editor.org/rfc/rfc9297.html#name-http-3-datagrams. This allows multiplexing multiple datagram flows (i.e. multiple proxied connections, in our case) in a single QUIC connection.
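As a sketch of how thin that wrapper is: the integer in question is the Quarter Stream ID, i.e. the ID of the client-initiated bidirectional request stream divided by four, encoded as a QUIC variable-length integer (RFC 9000, Section 16). The Go code below is only an illustration of the wire format; the function names are hypothetical and it is not taken from masque-go or any other implementation.

```go
package main

import (
	"errors"
	"fmt"
)

var errShort = errors.New("buffer too short")

// appendVarint appends v in QUIC variable-length integer encoding:
// the top two bits of the first byte select a 1, 2, 4 or 8 byte form.
func appendVarint(b []byte, v uint64) ([]byte, error) {
	switch {
	case v < 1<<6:
		return append(b, byte(v)), nil
	case v < 1<<14:
		return append(b, byte(v>>8)|0x40, byte(v)), nil
	case v < 1<<30:
		return append(b, byte(v>>24)|0x80, byte(v>>16), byte(v>>8), byte(v)), nil
	case v < 1<<62:
		return append(b, byte(v>>56)|0xc0, byte(v>>48), byte(v>>40), byte(v>>32),
			byte(v>>24), byte(v>>16), byte(v>>8), byte(v)), nil
	}
	return nil, errors.New("value too large for a QUIC varint")
}

// readVarint decodes one varint and reports how many bytes it used.
func readVarint(b []byte) (v uint64, n int, err error) {
	if len(b) == 0 {
		return 0, 0, errShort
	}
	n = 1 << (b[0] >> 6) // encoded length: 1, 2, 4 or 8 bytes
	if len(b) < n {
		return 0, 0, errShort
	}
	v = uint64(b[0] & 0x3f)
	for _, c := range b[1:n] {
		v = v<<8 | uint64(c)
	}
	return v, n, nil
}

// wrapDatagram prefixes payload with the Quarter Stream ID of the request
// stream it belongs to. Client-initiated bidirectional stream IDs are
// multiples of 4, so the division loses no information.
func wrapDatagram(streamID uint64, payload []byte) ([]byte, error) {
	if streamID%4 != 0 {
		return nil, errors.New("not a client-initiated bidirectional stream ID")
	}
	out, err := appendVarint(make([]byte, 0, 8+len(payload)), streamID/4)
	if err != nil {
		return nil, err
	}
	return append(out, payload...), nil
}

// unwrapDatagram recovers the stream ID and payload, which is how a
// receiver demultiplexes several proxied flows on one QUIC connection.
func unwrapDatagram(d []byte) (streamID uint64, payload []byte, err error) {
	q, n, err := readVarint(d)
	if err != nil {
		return 0, nil, err
	}
	return q * 4, d[n:], nil
}

func main() {
	d, err := wrapDatagram(8, []byte{0x45, 0x00})
	if err != nil {
		panic(err)
	}
	id, payload, err := unwrapDatagram(d)
	if err != nil {
		panic(err)
	}
	fmt.Printf("wire: % x, stream %d, payload: % x\n", d, id, payload)
}
```

For small stream IDs the prefix is a single byte, so a datagram on request stream 8 starts with `0x02` followed directly by the payload.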

Having HTTP/3 for streams is convenient, since it provides an easy way for the client to send a proxy request: it's just an Extended CONNECT HTTP request, and for the server to respond to that request using HTTP status codes (and HTTP header fields). Really, there isn't any more HTTP than this going on.

If you wanted, you could of course use Protobufs for the request-response part of the protocol, and define your own datagram demultiplexing wire format, and that would work equally well. But you'd basically just reinvent the wheel.

vyzo commented 3 months ago

The concern I have with HTTP/3 is that we expand the dependency set to include an HTTP/3 server.

I would like to avoid this if possible.

vyzo commented 3 months ago

But maybe it's not a big deal.

MarcoPolo commented 3 months ago

I think leveraging h3 would be very useful for this for a couple of reasons:

vyzo commented 3 months ago

ok, fair enough.