peer-to-peer communication between equal nodes

Logan007 commented 4 years ago

A more peer-to-peer-like approach was discussed at several occasions. I have had some thoughts about it I want to share as a discussion starter. I am aware that there are flaws and still problems to solve.

peer-to-peer would require that all nodes are equal, not just treated equally but also provide equal functionalities. Hence, supernode and edge would need to merge to just a node.
each node can support more communities (as the supernodes do now). However, each node can only have one (optional) IP outlet (tun/tap) which is connected to one of the communities. Only this makes the node accept PACKETs.

The following thoughts mostly are per-community (each community requires their own REG/ACK/node-list/anchor nodes):

nodes register to each other using REGISTER and ACK, there are no special _REGISTERSUPER anymore. The content of ACK must allow the receiver to detect if it also indicates a successfully established peer-to-peer connection.
some nodes are considered anchor nodes. They usually have a fixed public IP address. Their IP addresses are provided to each node at start-up (just like the supernode list now). Even anchor nodes themselves can be provided with IP addresses of other anchor nodes. I call them anchor nodes because they serve as anchor to get back to the network in case all other connections get lost and cannot be re-established.
anchor nodes are not special in their functionality. The other nodes just mark them internally as anchor nodes and never purge them from their internal node list.
a node must keep a list of the other nodes known. On the one hand, for easier scaling to bigger networks, this list must be purged from time to time. On the other hand, information maybe required later-on for sending packets is lost by purging. So, there must be a good compromise. Perhaps, a certain percentage of max-seen number of nodes just does not get purged (keep some in addition to the non-purgable anchor nodes).
each node does not only keep track of other nodes known, but also about the number of other nodes (that's easy, just count through the list). Also, a node should announce the number of known nodes (maybe within REG and ACK), so each node knows about how many nodes its neighbor knows (an additional field in the nodes list).
The number of known nodes can be a good basis to decide which nodes to purge (purge those with less known nodes and keep the ones that seem to have more connections).
Each node (A) registering with another node (B) by a REG should be introduced to another locally known node (C). This information (including the sock?) gets sent back (to A) along with the ACK and could be considered (by A) as a request to register with that node (C). If (A) successfully registers with (C) then, another introduction happens and so forth. To break this chain, the forwarded REGs need a TTL. Or generally speaking, the initial REG could already carry a TTL which gets counted down following the sketched chain. Optimum TTL would depend on maximum number of nodes seen so far. Which node to introduce? The one with the most or least connections? A randomly chosen one but taking the number of connections into account as probability?
The anchor nodes will have a high number of neighbors (because every node will try to initially REGister with one of them) so they will still play a central role. If they however become defective, the other nodes will compensate. This happens automatically (even if they do no get purged) by including the "last_seen" field into decision where to send an unknown packet / query to. This field gets updated by regular re-REGisters (with the non-purged).
All timers for re-REGisters and purging could get a small +/- random part e.g. 17 +/- 3 seconds for re-REGisters to distribute the load a bit over time.

So far, so good, But… here is where it gets beyond my network knowledge and imagination:

When forwarding/routing packets or broadcasting to unknown or all destinations, it will get a bit uncertain. If no node has full overview of the network, there must be some clever forwarding and routing strategies to

make sure a node can certainly be found if present
make sure that all nodes can be reached for a broadcast without double-sends
ensure that network does not get segmented, i.e. breaks in distinct parts.

Item no. 3 cannot happen as long as at least one anchor node is available. If all anchor nodes get lost, there should still be a good chance for the network to survive (resilience) as long as a minimum number of nodes remains.

Items no. 1 and 2 need to be addressed with some more thoughts. That might include automatic MAC-address assignment by the help of the first node registered to (for some clever routing strategy somehow related to the MAC-addresses – key-based routing?).

P. S. I found Pastry, Kademlia and Batman somewhat inspirational.

Logan007 commented 4 years ago

Actually, I the ideas of Kademlia's routing based on the XORed node ID-metrics appeal to me and I would mark them as "quite interesting". It surely will work with 48-bit MAC addresses as well. Not so sure about the purge concept in case of full k-buckets.

On the other hand, the Kademlia approach would probably produce some overhead in terms of network-traffic to find nodes and has higher memory requirements for node lists / buckets. Nothing comes for free.

Especially broadcasts need to be addressed. The approach outlined in Efficent Broadcast in Structured P2P Networks as well as section 4 of P2P based intrusion detection might help.

fengdaolong commented 4 years ago

Thank you for putting forward the point-to-point realization theory. I am busy with my work and haven't been here to follow n2n for a long time. However, no one has followed up and participated in the discussion. Maybe it is because it is too esoteric or because it has few followers. But I believe that the direction of implementation is correct, it may take a huge amount of work to implement it, and a test environment with multiple nodes must be provided to test it.

Kademlia is a very clever protocol. Using the Kademlia protocol to achieve decentralization will be great, but there are several problems that cannot be avoided:

As you said, when there are a lot of nodes, the network overhead generated by Kademlia in finding nodes cannot be ignored, and it will even affect the throughput of network transmission;
Due to the complexity of the network and the network restriction methods of IDC service providers, after using Kademlia to decentralize, the success rate of NAT point-to-point penetration may be reduced or become unstable;
Since many IDC service providers have banned popular P2P protocols, encryption or masquerading may be necessary at the protocol level to prevent blocked ports or network traffic.

Logan007 commented 4 years ago

I am not sure if you are able to follow the discussion on n2n's Discord channel. It seems it will take a step-wise approach. First (already huge) step is to add a resilient multi-supernode support. Very promising.

I am well aware that the opening post's elaborations are more of a vision rather than tomorrow's implementation…

By the way, I thought more of Kademlia's routing concept than the complete protocol. Just using the metrics inside n2n to know where to send the packets to in case of broadcast and look-ups. The rest would remain pretty much n2n. So, it would still be the hole punching by the help of some publicly accessible hole-punching-helper.

fengdaolong commented 4 years ago

I fully support your idea and let us work together to realize it.

Logan007 commented 3 years ago

I looked a bit deeper into Kademlia and want to put down some more or less related thoughts before I forget again:

With its native node_lookup, it seems well suited for, well, node lookup.

But how about other functionalities of the supernode, which need to be taken over by the p2p network?

Concerning the hole punching, the supernode serves as "ping partner" to keep the edge's NATed port open (re-registration every 20 sec). That can easily be taken over by some other peer.
In case of restricted NAT, the supernodes serve as "receiver proxy" because restricted NAT does not allow incoming traffic from other nodes if it was not initiated by edge side. In a full p2p scenario, a node or the p2p-network must be able to detect this situation, chose a receiver proxy from the other nodes, and propagate this information along with its own data. I have no idea yet how this could look like. Maybe the first other node adds the socket and others not able to connect fall back to the next-node's socket?
In case p2p fails, the pSp path of packets can be replaced by a p2p based forwarding by just following the Kademlia routing table. It will work (not with nodes under restricted NAT), but it will be far from optimal path and could even involve nodes that are considered to have a more edge/endpoint character. So, some nodes should be marked as "preferred for routing", maybe a special command line parameter. They would probably correspond to the supernodes in current architecture. But how does traditional routing go along with the Kademlia approach? How can we define preferred routes and still be able to fall back to the pure p2p way if other routes fail?

ntop / n2n

peer-to-peer communication between equal nodes #405