yggdrasil-network / yggdrasil-go

An experiment in scalable routing as an encrypted IPv6 overlay network
https://yggdrasil-network.github.io

Long-term goals/ideas #540

Open Arceliar opened 5 years ago

Arceliar commented 5 years ago

Sorry, this text is a rambling mess. I just wanted to make a public note of some things while I'm thinking of them, and I'll reorganize my thoughts into a cleaner format later after I've had time to think through them better. The tl;dr version is I'm trying to figure out what we need to change, and how to change it, if we wanted to make the imagery of Yggdrasil linking worlds together technically feasible in a very literal sense, even when subject to interplanetary latency. More practically, these would address basically the same sets of problems that make mobility an issue.

The current routing approach in ygg requires us to exchange coords end-to-end before traffic can flow, which depends on things being low latency. This also applies to keeping the DHT in a working state, which is needed for the searches (to learn coords for sessions) to work in the first place. Mobility breaks every part of this. So basically, ygg works in a "dynamic" network, but only if the rate of change is very low compared to some hard-coded update intervals and end-to-end delay.

This is brainstorming on possible things we could change to make things less latency sensitive (some backwards incompatible, so it'd be tedious but maybe possible to provide an upgrade path). Ideally, if half of the network was on Mars, things should still Just Work end-to-end. I haven't thought through all of these, so I'm sure there are complications that would show up or flaws in my reasoning.

  1. It's really important that we be able to prove that the DHT always converges. Currently, it seems to, but I don't know that it does. Fixing it could theoretically be as simple as "everybody includes the root in their DHT", but that could spam the root, so I'd like to use something a little more local.

  2. Addressing / NodeIDs. We could get rid of the split between NodeID and TreeID, and make everything based on what we currently call TreeID. Nodes could then sign their coords (the final message from the long chain of signatures) and include a public key which is the default key to use when contacting them. That key should still tend to be permanent, but it's not how we identify nodes now, and a paranoid node could restart with a random key each time, at the expense of being harder to contact from very high latency nodes. Now we have signed proofs of where a node was in the network at some time (we maybe want to add a sequence number / timestamp to every hop of the signed tree messages too, not just the root's, for other reasons explained later). So now we can exchange these signed proofs instead of directly contacting nodes in the DHT etc. This would mean the DHT needs no keep-alive traffic, except for periodically updating the proofs and timing nodes out if no new update arrives for too long. That's much better than the round trip ping/pong approach if we have very high latency. We could technically also sign with curve keys instead, and people would get to keep their current addresses as a result, but I think it's better long term to use something optimized for signatures. However, for long-term survival, we need to introduce support for multiple key types (and hash functions) anyway, so we can upgrade things if/when we need to switch to post-quantum encryption. So it would make sense to do that first, then add support for the signature-based scheme, then drop support for the encryption-based scheme after a while, to give people time to migrate things to the new addresses.

  3. DHT subtrees. For the DHT to work, we need to keep information updated for our immediate predecessor and successor in the ring (and ideally maybe 1 extra hop in either direction, so we have a backup when 1 person leaves). But we can consider a subtree rooted at any particular node, and do the same. When searching, instead of immediately jumping to the closest node in keyspace, start in the smallest possible subtree (rooted at yourself) and walk around the ring to the closest node in the current tree, then move up to the parent tree when that reaches a dead end. If half the network is on Mars, and all of Mars is in its own subtree, then lookups by someone on Mars for someone else on Mars will stay on Mars, instead of ping-ponging back and forth via Earth, which is what would currently happen. We could probably do most of this already, but it probably wouldn't actually benefit anything on its own, and most of the logic would need to be rewritten anyway if/when we switch to a signature-based approach, so it's probably best to leave it for after.
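A toy sketch of the subtree-first lookup, under invented types: each node has a keyspace ID and tree coords, and the "subtree" containing two nodes is identified by their shared coord prefix. The lookup prefers the smallest subtree around us that makes keyspace progress, only widening when the current subtree is a dead end, so a Mars-to-Mars lookup never leaves Mars even when an Earth node is closer in keyspace.

```go
package main

import "fmt"

// node pairs a keyspace ID with tree coords (path from the root). Illustrative.
type node struct {
	key    uint64
	coords []int
}

// prefixLen counts how many leading hops two coord paths share; a longer
// shared prefix means a smaller common subtree.
func prefixLen(a, b []int) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

// dist is ring distance in keyspace from a up to b (wrapping on overflow).
func dist(a, b uint64) uint64 { return b - a }

// lookupStep picks the next hop toward target: among known nodes, prefer the
// smallest subtree (longest shared coord prefix with self) that still contains
// someone closer to the target than we are, and only widen when it's a dead end.
func lookupStep(self node, known []node, target uint64) (node, bool) {
	for pfx := len(self.coords); pfx >= 0; pfx-- {
		best, found := self, false
		for _, n := range known {
			if prefixLen(self.coords, n.coords) < pfx {
				continue // outside the current subtree
			}
			if dist(n.key, target) < dist(best.key, target) {
				best, found = n, true
			}
		}
		if found {
			return best, true // progress inside this subtree; stay local
		}
	}
	return self, false // we are the closest node we know of
}

func main() {
	// The Mars subtree hangs off root port 7; the Earth node is elsewhere.
	self := node{key: 10, coords: []int{7, 1}}
	known := []node{
		{key: 90, coords: []int{7, 3}}, // Mars
		{key: 95, coords: []int{2, 4}}, // Earth: closer in keyspace, but remote
	}
	next, _ := lookupStep(self, known, 96)
	fmt.Println(next.coords) // the Mars node wins despite being further in keyspace
}
```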

  4. Sessions: use something Noise-based or at least more OTR-like. Allow things to start working with permanent keys, and then rotate keys via session pings in the background. Start advertising your intended next key to send from, and then switch to it when you receive a pong acknowledging the change (and possibly updating which remote pubkey you should use when sending traffic too). This allows traffic to start flowing without needing to wait for an initial key exchange, as long as you know their key and where to send it. Speaking of where to send it...
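The rotation handshake above can be sketched as a tiny state machine. Everything here is hypothetical (opaque string keys instead of real Curve25519/Noise keys, invented method names): traffic flows under the current keys immediately, pings advertise the next local key, and only a pong acknowledging that exact key triggers the switch, so stale or replayed pongs are harmless.

```go
package main

import "fmt"

// key is an opaque public key; real code would use Noise/Curve25519 keys.
type key string

// session sketches background key rotation for an already-working session.
type session struct {
	localSend  key // key we currently use for outgoing traffic
	localNext  key // key we've advertised but not yet had acknowledged
	remoteRecv key // key we currently use when addressing the peer
}

// pingPayload is what a session ping carries: our intended next key.
func (s *session) pingPayload() key { return s.localNext }

// onPong switches to the new key only when the peer acks the advertised one.
func (s *session) onPong(acked key) {
	if acked == s.localNext && s.localNext != "" {
		s.localSend, s.localNext = s.localNext, ""
	}
}

// onPing records the peer's advertised next key so our traffic follows it.
func (s *session) onPing(remoteNext key) { s.remoteRecv = remoteNext }

func main() {
	s := &session{localSend: "A1", localNext: "A2", remoteRecv: "B1"}
	s.onPong("stale") // ack of some old key: ignored
	fmt.Println(s.localSend) // A1
	s.onPong(s.pingPayload()) // peer acks the advertised key: rotate
	fmt.Println(s.localSend) // A2
}
```

The point of requiring an exact match on the advertised key is that there's never a window where the two sides disagree on which key outgoing traffic uses: the sender keeps the old key until the ack proves the receiver can handle the new one.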

  5. Forwarding: It would be nice if we could start forwarding immediately, when we only know a node's key, by routing it through the DHT. Or more generally, we specify the destination and then set a "relay point" of (key?/)ID+coords, of whatever node we know that's closest to the destination (according to the usual subtree rules?). When routing the traffic (including if it reaches the relay point), any node that sees a better relay point can update the point and forward further. This would cause traffic to wander each subtree a little bit until it figures out where it needs to go, but it means packets could start flowing immediately when you figure out the destination's key. Then you could do a DHT lookup in the background to figure out where the destination is.
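A minimal sketch of the mutable relay point, with invented types and a plain `uint64` keyspace: the packet carries the destination's key plus the best known (key, coords) relay, and any forwarding node that knows someone closer in keyspace rewrites it in place.

```go
package main

import "fmt"

// packet carries the destination's permanent key plus a mutable relay point:
// the closest (key, coords) hop toward the destination known so far.
// Layout is hypothetical.
type packet struct {
	destKey    uint64
	relayKey   uint64
	relayCoord []int
}

// ringDist is keyspace distance from a up to b, wrapping on overflow.
func ringDist(a, b uint64) uint64 { return b - a }

// maybeImprove lets any forwarding node that knows a node closer (in
// keyspace) to the destination rewrite the relay point, so the packet homes
// in on the destination even though the sender only knew its key.
func maybeImprove(p *packet, knownKey uint64, knownCoord []int) bool {
	if ringDist(knownKey, p.destKey) < ringDist(p.relayKey, p.destKey) {
		p.relayKey, p.relayCoord = knownKey, knownCoord
		return true
	}
	return false
}

func main() {
	p := &packet{destKey: 100, relayKey: 10, relayCoord: []int{1}}
	fmt.Println(maybeImprove(p, 90, []int{4, 2})) // true: 90 is closer to 100
	fmt.Println(maybeImprove(p, 50, []int{3}))    // false: worse than current
	fmt.Println(p.relayCoord)                     // [4 2]
}
```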

  6. Mobility: It would be nice if, when a node moves, traffic directed to its old location would still reach it eventually. Something related to the above Forwarding step is maybe enough: when we reach a dead end as far as the tree is concerned, we start forwarding again like the above (by subtree?). So, in the absence of congestion, disconnects, or packet loss on the wire, a packet drops when it reaches a DHT dead-end, not a tree dead-end. If the local subtree DHTs can be updated much more quickly than the end-to-end session, then this means packets will still reach their destination in most cases. If a node goes offline, packets get redirected to the node's keyspace neighbor, and may wander around a bit (back to the node's old subtree) before the neighbor realizes that the node has timed out and removes it from its DHT.

  7. Concerns: if the DHT is based on messages timing out, then we need to save the highest-sequence-number message we've ever seen from nodes that have disconnected, so we don't pay attention to old locations if they're re-sent. Like with the root timestamps, we could drop these if we ever find a better node (within the same subtree? need to figure out when exactly it's safe to drop, since we could move subtrees too).
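The replay-rejection part of this concern is simple enough to sketch: a map from node ID to the highest sequence number ever seen, consulted even after the node's live DHT entry is gone. (The hard part the text raises, when it's safe to drop entries, isn't addressed here; names are hypothetical.)

```go
package main

import "fmt"

// seqCache remembers the highest sequence number ever seen from each node, so
// a re-sent old location announcement is ignored even after the node has
// disconnected and its entry has left the live DHT. Illustrative only.
type seqCache map[string]uint64

// accept returns true only for strictly newer announcements, and records them.
func (c seqCache) accept(nodeID string, seq uint64) bool {
	if seq <= c[nodeID] {
		return false
	}
	c[nodeID] = seq
	return true
}

func main() {
	c := seqCache{}
	fmt.Println(c.accept("mars-node", 7)) // true: first sighting
	fmt.Println(c.accept("mars-node", 7)) // false: replay
	fmt.Println(c.accept("mars-node", 6)) // false: stale old location
	fmt.Println(c.accept("mars-node", 8)) // true: genuinely newer
}
```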

  8. disruption-tolerant networking: It's possible that no end-to-end path exists at all if nodes are very mobile and connectivity is sparse. In which case, the forwarding/mobility logic may be able to cache packets at some point, and resume forwarding when connectivity is restored. Note that when I say "packets" I'm thinking about current ygg, but in principle it could be arbitrary (possibly much larger) messages (if things are store-and-forward).

  9. link-level message multiplexing: currently we send all the bytes of 1 message (an encrypted IPv6 packet w/ ygg headers) consecutively over the (tcp) connection. It would make sense to start sending chunks of a packet, so we can allow nodes to send larger messages without tying up the line indefinitely. We just need to be careful how we coordinate this with when we notify the switch that we're ready for more traffic. We need to make sure that we don't keep asking the switch for more messages while making trivial progress on any message in particular, but we also ideally don't want to just attach a hard limit on the number of concurrently sending messages (that's DoSable), so we need some kind of feedback mechanism. Possible idea: add the new packet to the head of a list, then notify ourself (actor message) to send. Each send notification has us traverse the list, starting at the head, send 1 chunk from each message on the list (removing items from the list if they're empty), notify the switch if the switch hasn't already been notified that we're ready for a new packet, and then notify ourself to keep sending if the list is still non-empty. We would probably need to write chunks into some buffer, and then flush the buffer if the list is empty and the switch has notified us that it's put us into an idle state (to wait for more traffic -- the switch currently tracks who is idle, but doesn't notify them that they're idle, so we'd need to add that).
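The core of the idea, stripped of the actor notifications and switch feedback, is a round-robin chunker: each send pass writes one chunk from every queued message, so a large message can't monopolize the link. A sketch, with invented names (`mux`, `sendPass`) and the wire modeled as a byte slice:

```go
package main

import "fmt"

// mux sketches the per-link multiplexer: messages queue up, and each send
// pass emits one chunk from every queued message. In the real design each
// pass would be driven by a self-notification (actor message), and "wire"
// would be a buffered TCP connection.
type mux struct {
	queue     [][]byte
	chunkSize int
	wire      []byte // bytes in the order they'd hit the link, for illustration
}

func (m *mux) enqueue(msg []byte) { m.queue = append(m.queue, msg) }

// sendPass sends one chunk from each message, drops finished messages, and
// reports whether another pass is needed (the list is still non-empty).
func (m *mux) sendPass() bool {
	next := m.queue[:0] // filter in place
	for _, msg := range m.queue {
		n := m.chunkSize
		if n > len(msg) {
			n = len(msg)
		}
		m.wire = append(m.wire, msg[:n]...)
		if rest := msg[n:]; len(rest) > 0 {
			next = append(next, rest)
		}
	}
	m.queue = next
	return len(m.queue) > 0
}

func main() {
	m := &mux{chunkSize: 2}
	m.enqueue([]byte("AAAAAA")) // large message
	m.enqueue([]byte("bb"))     // small message slips in between its chunks
	for m.sendPass() {
	}
	fmt.Println(string(m.wire)) // AAbbAAAA
}
```

Note how the small message goes out after only one chunk of the large one, which is exactly the head-of-line-blocking relief the item is after.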

  10. #353 already describes this, but for mobility to work, we need to rewrite coords while packets are en route in some cases. Specifically, a node's coords are a source route from the root to the node. When we route a packet, if we see that the sender is an ancestor of the destination, and the next hop after the sender is our sender's port to us, but we're not an ancestor of the destination, then we should rewrite the coords on the packet so the sender's next hop is replaced by our current coords. That just updates the earlier parts of the source route (from the root to ourself) to match our current preferred path, so that our descendants can handle the packet correctly. This is something we can add support for and benefit from immediately, with no changes to the protocol, just by rewriting the switch logic (which we need to do anyway, since it's only half-migrated to the actor model as of writing). So this is low-hanging fruit as far as the ideas listed here are concerned.
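A much-simplified sketch of the splice, treating coords as plain port lists and checking ancestry by prefix (the real check involves the sender's port to us, not just prefixes; all names here are made up): if the destination's source route runs through where we used to be but not through where we are now, we replace the stale root-to-us prefix with our current coords and keep the tail.

```go
package main

import "fmt"

// coords are a source route: the list of switch ports from the root down.
type coords []int

// isAncestorOf reports whether a is a prefix of b, i.e. a lies on the
// root-to-b path in the tree.
func (a coords) isAncestorOf(b coords) bool {
	if len(a) > len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// repair splices our current coords in for a stale prefix of the
// destination's source route, so our descendants can still route the tail of
// the path correctly after we've moved in the tree.
func repair(staleSelf, currentSelf, dest coords) coords {
	if staleSelf.isAncestorOf(dest) && !currentSelf.isAncestorOf(dest) {
		out := append(coords{}, currentSelf...)
		return append(out, dest[len(staleSelf):]...)
	}
	return dest // route already goes through our current location
}

func main() {
	stale := coords{1, 2}   // where we were when the sender learned the route
	current := coords{3, 5} // our current preferred path from the root
	dest := coords{1, 2, 9} // destination below us, addressed via the old path
	fmt.Println(repair(stale, current, dest)) // [3 5 9]
}
```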

zhoreeq commented 4 years ago

Can the Noise protocol be used as a transport for Yggdrasil?

It has something for UDP (relevant to issue #345): https://noiseprotocol.org/noise.html#out-of-order-transport-messages