Minimal networking requirements for browser environments

Browsers have become powerful, full-featured runtimes, but for security reasons they don't allow direct access to low-level networking primitives such as sockets. This restriction makes common operations involving UDP or TCP impossible. For example, programmatic domain name resolution is not possible, nor are NAT traversal techniques that rely on UDP, such as hole punching, or standard UDP-based protocols such as NAT-PMP and UPnP. It also prevents listening for direct connections, with the only exception being WebRTC, and even then a direct connection is not guaranteed.

Nevertheless, this should not prevent browsers from directly accessing devp2p protocols such as eth/*, les, pip and possibly others, and it should be possible for non-browser clients to communicate directly with browser clients, so that browser clients are first-class citizens in the underlying p2p network.

In addition, enabling browser runtimes would lower the barrier to entry for other less capable devices, such as IoT devices. In this regard, this issue isn't constrained to browsers only, but should serve as a good baseline for targeting other runtimes.

For more context, please take a look at https://github.com/ethereum/devp2p/issues/71 and in particular at https://github.com/ethereum/devp2p/issues/71#issuecomment-482796738.

The high level requirements for such a stack would be:

Allow multiple transport protocols. For example, it should be possible to use WS or WebRTC alongside TCP and UDP, to establish communications.
Allow a single client to communicate over multiple transport channels concurrently. A single client would be able to listen for WS, WebRTC and TCP connections at the same time.
Enable fallback mechanisms such as relays when a direct connection is not possible.
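
To make the three requirements above a bit more concrete, here is a minimal sketch of how a dialer could order its connection attempts. Everything in it is an assumption made for illustration: the `Endpoint` type, the `dial_plan` function, the transport names and the `.invalid` addresses are hypothetical and not part of devp2p or libp2p. It only shows the idea that a peer advertises several transports, a dialer tries the ones its runtime can use, and relays come last as a fallback.

```python
from dataclasses import dataclass

# Transports each kind of runtime can dial. These sets are assumptions for
# illustration: a browser cannot open raw TCP or UDP sockets.
BROWSER_DIALABLE = frozenset({"ws", "webrtc"})
NATIVE_DIALABLE = frozenset({"tcp", "udp", "ws", "webrtc"})


@dataclass(frozen=True)
class Endpoint:
    """One (transport, address) pair a peer listens on (hypothetical type)."""
    transport: str
    address: str


def dial_plan(advertised, dialable, relays=()):
    """Return dial attempts in order: direct endpoints first, relays last."""
    direct = [("direct", ep) for ep in advertised if ep.transport in dialable]
    relayed = [("relay", ep) for ep in relays if ep.transport in dialable]
    return direct + relayed


# A peer listening on TCP and WS at the same time, plus one known relay.
peer = [Endpoint("tcp", "peer.invalid:30303"), Endpoint("ws", "peer.invalid:443")]
relays = [Endpoint("ws", "relay.invalid:443")]

browser_plan = dial_plan(peer, BROWSER_DIALABLE, relays)
native_plan = dial_plan(peer, NATIVE_DIALABLE, relays)
```

With these inputs a browser would skip the TCP endpoint, try WS directly, and only then go through the relay, while a native client would try both direct endpoints first.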

Some additional considerations brought up by @FrankSzendzielarz:

Drawing the line between minimum protocol requirements on Ethereum implementations and the toolsets (eg libp2p)
ENRs at the discovery level and how they drive multiaddr
Discv5 aes-gcm channel re-use for the eth protocol
Micropayment incentivization and relay-only nodes or implementation wrappers to make standard implementations support multiple transports

Was thinking about this today.

Given the following....

Some types of light client want to be able to participate in the discovery protocol (even though it might be suboptimal) and to do so the discovery protocol will need to operate over different transports.
There are other full nodes under development that operate under unusual network constraints, eg: UDP is blocked, TOR etc.
Discv5 also recommends certain implementation strategies for mitigating risks around validation of node information. When a Kademlia 'find neighbours' request results in a Neighbours message containing new information, those new ENRs need validation, with the how and when of that validation being a tricky task (see link).

….we need to consider these impacts:

If nodes heterogeneously implemented/enabled discovery transports, we'd have a situation where some nodes were unable to validate node information (because the newly discovered peer would be available on an unsupported transport).

In order for the node to prevent attacks and protect its own 'reputation' (reputation in quotes because the recommendation in the spec is more about limits on node IDs per IP and per source they were learned from), the node would have to make sure that it only accepts nodes it can validate.

However, this would lead to network partitioning. Let's say we have TransportX and TransportY (eg: UDP and WebRTC). Some nodes can only communicate over TransportX. Some nodes only over TransportY. Some nodes communicate over both. Over time the network would evolve to resemble a Venn diagram, where the smaller number of bridge nodes would be in the intersection.
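
The partitioning argument can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not a measurement: the five node names and their transport sets are made up, and two nodes are treated as linked whenever they share at least one transport. It shows that the network stays in one piece only as long as a bridge node exists.

```python
# Toy network: node name -> set of transports it supports (made-up data).
nodes = {
    "a": {"X"},
    "b": {"X"},
    "c": {"X", "Y"},  # the only node speaking both transports
    "d": {"Y"},
    "e": {"Y"},
}


def bridge_nodes(nodes):
    """Nodes that support more than one transport."""
    return sorted(name for name, transports in nodes.items() if len(transports) > 1)


def components(nodes):
    """Group nodes that can reach each other through shared transports."""
    remaining = set(nodes)
    groups = []
    while remaining:
        start = remaining.pop()
        group, frontier = {start}, [start]
        while frontier:
            current = frontier.pop()
            # Any unvisited node sharing a transport with `current` is linked.
            linked = {n for n in remaining if nodes[n] & nodes[current]}
            remaining -= linked
            group |= linked
            frontier.extend(linked)
        groups.append(group)
    return groups


without_bridge = {n: t for n, t in nodes.items() if n != "c"}
```

Here `components(nodes)` yields a single group, while `components(without_bridge)` splits into an X-only group and a Y-only group.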

This immediately raises some questions:

If the ENR is from a light client, should it be redistributed at all? For light clients, should the light ENR be omitted from the Kad DHT?
What happens if an intersection node passes an ENR containing id information for TransportX and Y to a node that only supports TransportY, for example? Does the recipient declare that the endpoint information cannot be validated for TransportX and reject the ENR? In this case, the subnetworks cannot be bridged.
If the recipient validates TransportX, cannot validate TransportY, but then redistributes the ENR back to a TransportY node, it is itself potentially involved in an attack where TransportX credibility is used to fake TransportY nodes to DoS a TransportY victim.
How does an intersection node handle a lookup from a TransportY node? The incoming FindNode would need to filter results to only return neighbours with TransportY ability, which could lead to the lookup process stalling if the numbers are insufficient. In effect, we would have multiple parallel DHTs, but with the k-bucket sizes being restricted to a single k. It would perhaps be more appropriate to treat each transport as its own Kademlia network, with nodes maintaining a per-transport DHT.
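
The stalling concern in the last question can be sketched as follows. This is a simplified, classic-Kademlia style lookup over a flat list, ordered by XOR distance to a target; it ignores buckets and the actual discv5 wire format. The `Record` type, the `find_node` function, the tiny integer node IDs and `K = 3` are all assumptions chosen to keep the example readable.

```python
from dataclasses import dataclass

K = 3  # illustrative result size, deliberately small


@dataclass(frozen=True)
class Record:
    """Stand-in for an ENR: a node ID plus the transports it advertises."""
    node_id: int
    transports: frozenset


def find_node(table, target, requester_transports, k=K):
    """Return up to k records closest to target that the requester can reach."""
    usable = [r for r in table if r.transports & requester_transports]
    usable.sort(key=lambda r: r.node_id ^ target)  # XOR distance metric
    return usable[:k]


# Routing table of an intersection node: mostly TransportX peers.
table = [
    Record(1, frozenset({"X"})),
    Record(2, frozenset({"X", "Y"})),
    Record(5, frozenset({"Y"})),
    Record(6, frozenset({"X"})),
    Record(7, frozenset({"X"})),
]

for_x = find_node(table, target=4, requester_transports=frozenset({"X"}))
for_y = find_node(table, target=4, requester_transports=frozenset({"Y"}))
```

The TransportX requester gets a full set of `K` results, while the TransportY requester gets fewer than `K` from the same table, which is the situation in which a lookup could stall.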

So, it begins to look as though, if heterogeneous transports are implemented, there should probably be per-transport DHTs, with ENRs being transport/DHT-specific. (By DHT I mean a Kademlia system instance or routing table.)

In that case though, implementations that support multiple transports would need to populate multiple DHTs. Assuming a common discovery protocol, a mechanism for doing so would be an optimal re-use of keys/handshake established over one transport, with a FindNeighbours request over a common (eg: UDP) transport resulting in Neighbours responses for multiple DHTs, i.e. a total of 2k neighbours received over UDP, with k being for UDP and k for WebRTC, for example. This approach would perhaps have an added benefit compared to parallel bootstrapping, as it would allow WebRTC nodes to be discovered indirectly via the UDP network.
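
As a rough sketch of the "2k neighbours over one transport" idea, assume a node keeps one routing table per transport and answers a single request with the k closest entries from each. The `neighbours_by_transport` function, the flat-list tables and the small integer IDs are hypothetical simplifications, not an existing discv5 message.

```python
def neighbours_by_transport(tables, target, k):
    """For each per-transport table, return the k node IDs closest to target."""
    return {
        transport: sorted(ids, key=lambda node_id: node_id ^ target)[:k]
        for transport, ids in tables.items()
    }


# One routing table per transport, held by a node that speaks both (made-up IDs).
tables = {
    "udp": [0b0001, 0b0110, 0b1011, 0b1100],
    "webrtc": [0b0010, 0b0111, 0b1110],
}

# A single request arriving over UDP is answered from both tables.
response = neighbours_by_transport(tables, target=0b0100, k=2)
total = sum(len(ids) for ids in response.values())
```

With two transports and k = 2, `total` is 4, i.e. 2k, and the WebRTC entries reach the requester even though the request itself travelled over UDP.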

The alternative to all the above would be to guarantee that implementations agree to maintain a commonly available set of transports.

Thoughts?

If nodes heterogeneously implemented/enabled discovery transports, we'd have a situation where some nodes were unable to validate node information (because the newly discovered peer would be available on an unsupported transport).

However, this would lead to network partitioning. Let's say we have TransportX and TransportY (eg: UDP and WebRTC). Some nodes can only communicate over TransportX. Some nodes only over TransportY. Some nodes communicate over both. Over time the network would evolve to resemble a Venn diagram, where the smaller number of bridge nodes would be in the intersection.

These are valid concerns and can be mitigated with something like circuit relaying (https://github.com/libp2p/specs/blob/master/relay/README.md). This was the main requirement for libp2p when it was originally implemented.

As a separate question (I'll create the issue, but want to bring it up in this context as well), are there any real numbers/metrics on the current state of the network? Something that allows us to visualize its current state: detect partitions, number of unreachable peers (i.e. behind NATs), average response times, etc.? I would say this is extremely valuable data that should be factored into the overall discv5 design.

This immediately raises some questions: ...

I'll try to elaborate on circuit relaying a bit here.

The relay is simply a node that relays traffic for other peers. This mitigates the issue of incompatible transports and also helps in negotiating a better channel overall. For example, if the ENR record for peer B only advertised TransportX, when peer A on TransportY tries to connect to it, it would fall back to a relayed connection, but if peer B now happens to be able to speak TransportY as well, they can both upgrade the channel and establish a direct connection.
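
A minimal sketch of that fallback-then-upgrade flow, under stated assumptions: transports are plain strings, a channel is a `(kind, transport)` tuple, and the two peers learn each other's actual transports once the relayed channel is up. The function names are hypothetical and this is not the libp2p circuit relay API.

```python
def open_channel(a_transports, b_advertised):
    """Dial direct if a shared transport is advertised, otherwise use a relay."""
    shared = a_transports & b_advertised
    if shared:
        return ("direct", min(shared))  # min() only makes the choice deterministic
    return ("relay", None)


def try_upgrade(channel, a_transports, b_actual):
    """Over a relayed channel, switch to direct if the peers turn out to share a transport."""
    kind, _ = channel
    if kind == "relay":
        shared = a_transports & b_actual
        if shared:
            return ("direct", min(shared))
    return channel


# Peer B's record only advertises TransportX, but B can now speak TransportY too.
a_transports = {"TransportY"}
b_advertised = {"TransportX"}
b_actual = {"TransportX", "TransportY"}

first = open_channel(a_transports, b_advertised)
upgraded = try_upgrade(first, a_transports, b_actual)
```

Peer A starts on a relayed channel because the advertised record offers nothing it can dial, and ends on a direct TransportY channel after the upgrade step.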

This is even more pervasive with NATed connections, where a peer might advertise an erroneous address because of multiple levels of NAT. In this case other peers would fail to dial it, and a relay could provide a temporary fallback channel while a dedicated channel is negotiated. This is the intent behind the proposed dial-me protocol (https://github.com/libp2p/specs/pull/64).

I want to emphasize that a relay does not have to be a centralized service. In the libp2p stack, any peer can be a relay.

So, it begins to look as though, if heterogeneous transports are implemented, there should probably be per-transport DHTs, with ENRs being transport/DHT-specific. (By DHT I mean a Kademlia system instance or routing table.)

I want to emphasize that this problem is not limited to different transports. It will exist no matter how heterogeneous the transports are, simply because clients, and in particular clients that run on consumer hardware, are going to be exposed to widely different network topologies. In this context the only viable solution is relays.

Also, transport locking, IMHO, is not very different from the arguments I've heard with regard to storage, where one of the main requirements is to keep the client friendly to consumer hardware so as to avoid lockout and centralization issues. In this context, the network is not much different.

In that case though, implementations that support multiple transports would need to populate multiple DHTs. Assuming a common discovery protocol, a mechanism for doing so would be an optimal re-use of keys/handshake established over one transport, with a FindNeighbours request over a common (eg: UDP) transport resulting in Neighbours responses for multiple DHTs, i.e. a total of 2k neighbours received over UDP, with k being for UDP and k for WebRTC, for example. This approach would perhaps have an added benefit compared to parallel bootstrapping, as it would allow WebRTC nodes to be discovered indirectly via the UDP network.

I'm not too keen on the idea of multiple DHTs. This would for sure lead to de facto network fragmentation, IMO.

What I would propose is that each layer of the stack, discovery included, is agnostic of the others, and that discovery (the DHT specifically) runs on top of any existing channel. This is possible with a properly multiplexed connection.
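
To illustrate what "multiplexed" could mean here, the sketch below tags each message with a protocol ID and a length so that discovery and other protocols can share one byte stream, whatever transport carries it. The frame layout (1-byte protocol ID, 2-byte big-endian length) and the protocol numbers are invented for this example. It is not the RLPx framing nor any libp2p stream muxer.

```python
import struct

# Hypothetical frame header: 1-byte protocol ID, 2-byte big-endian payload length.
HEADER = struct.Struct(">BH")

DISCOVERY = 1  # made-up protocol IDs
ETH = 2


def encode_frame(proto_id, payload):
    """Prefix a payload with its protocol ID and length."""
    return HEADER.pack(proto_id, len(payload)) + payload


def decode_frames(buf):
    """Split a byte string of concatenated frames into (proto_id, payload) pairs."""
    frames = []
    offset = 0
    while offset < len(buf):
        proto_id, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        frames.append((proto_id, buf[offset:offset + length]))
        offset += length
    return frames


# Two protocols interleaved on the same channel (WS, WebRTC, TCP, a relay, ...).
wire = encode_frame(DISCOVERY, b"findnode") + encode_frame(ETH, b"status")
frames = decode_frames(wire)
```

The receiver can hand each decoded payload to the handler for its protocol ID, so the discovery layer never needs to know which transport the bytes arrived on.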

I'm not sure what this issue is really about because there are so many suggestions about potential directions. It would be nice to have a solution for browser access to LES for example, but this issue goes way beyond that and proposes that every protocol we define should somehow be accessible from the browser. We won't have that, and this is why I'm closing this issue now.

See #166 for a narrower issue which is specifically about LES in the browser.

ethereum / devp2p

Minimal networking requirements for browser environments #87