yggdrasil-network / yggdrasil-go

An experiment in scalable routing as an encrypted IPv6 overlay network
https://yggdrasil-network.github.io

Decide what to do about various DHT optimizations #200

Closed Arceliar closed 5 years ago

Arceliar commented 6 years ago

A few different things we could try.

  1. Chord-like DHTs have a nice property that kad-like DHTs are missing: your successor (and/or predecessor) is a valid next hop for any lookup. This means we can focus on maintaining good info about a very small (potentially constant) number of non-peer nodes, and then be comparatively lazy about checking for other nodes in the network (since they're just a performance optimization). In kad, by contrast, you could have a bucket with no working nodes, and nothing guarantees that a node in any other bucket is still a valid next hop for that region of keyspace. I have chord code working already, more or less, in a branch of my repo, but it needs some cleanup / testing / optimizing, assuming we decide to go that route at all.

  2. Hypothetically, we could store some kind of connection info about public peers in the DHT. The idea is that, if a node is configured to auto-connect to public peers, and manages to connect to the network at all, then it should be able to find and connect to a couple of "useful" public peers somewhere in the network. The goal would be to help "gateway" nodes (local mesh nodes that peer to other parts of the network via another network, such as the internet) get more peers and avoid needlessly routing DHT lookups back and forth across the planet. In the short term, I'm more concerned with figuring out if this is a practical / good idea than with deciding exactly what "useful" means, or how we store/lookup/communicate public peer info. I'm just assuming that the DHT is likely to be involved in any scalable and decentralized approach, so I'm including it in this issue.

  3. Currently, the DHT only stores info about nodes it needs to know about. The same is true for the chord-like DHT. Hypothetically, we could also add in info about every node we get a response from when doing a search, and/or every node we have an open session with. We wouldn't add DHT pings for these nodes, unless we decide that they're important for the DHT independent of any use in searches or sessions. This means they'd get timed out if the session ends or we stop using those nodes as parts of new searches. This is info we come across anyway, so I'm just wondering if we can find something better to do with it than nothing at all. This could hypothetically cause a popular node to become more popular, leading to an imbalance of DHT traffic, so it needs some further study before we commit to doing or not doing anything with that information.
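
To make the property in item 1 concrete, here is a minimal sketch of a chord-like next-hop choice. It is not the code from the branch mentioned above. The `ID`, `Node`, `Shortcuts`, and `NextHop` names are hypothetical, the key ring is shrunk to 64 bits, and a key is assumed to be owned by its closest clockwise predecessor. Those are illustration choices, not details taken from yggdrasil-go.

```go
package main

// ID is a position on a hypothetical 64-bit key ring. Real node IDs are
// much longer; a uint64 just keeps the arithmetic easy to follow.
type ID uint64

// distance returns the clockwise distance from a to b on the ring.
// Unsigned subtraction wraps around, which is exactly ring arithmetic.
func distance(a, b ID) ID { return b - a }

// Node holds the routing state kept in this sketch.
type Node struct {
	Self      ID
	Successor ID   // must be kept fresh: it alone guarantees progress
	Shortcuts []ID // optional extra nodes: a performance optimization only
}

// NextHop returns the known node that gets closest to key without passing
// it, moving clockwise. ok is false when no known node is closer to key
// than this node itself, meaning the lookup ends here.
func (n Node) NextHop(key ID) (next ID, ok bool) {
	best := n.Self
	bestDist := distance(n.Self, key)
	candidates := append([]ID{n.Successor}, n.Shortcuts...)
	for _, c := range candidates {
		if d := distance(c, key); d < bestDist {
			best, bestDist = c, d
		}
	}
	return best, best != n.Self
}
```

With an empty `Shortcuts` list, `Node{Self: 10, Successor: 20}.NextHop(100)` still returns node 20, so the lookup makes progress using the successor alone. Adding shortcuts such as 60 only shortens the path. A kad-style bucket lookup has no equivalent fallback when the bucket covering the key is empty or stale.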

Arceliar commented 5 years ago

Decided to do 1. Decided not to do 3, at least for now, since it may tend to cause info about a few nodes to propagate widely through the network, which could put unnecessary load on them. The more I think about it, the more I'm worried that 2 would cause other problems, so closing this issue for now.