zerotier / ZeroTierOne

A Smart Ethernet Switch for Earth
https://zerotier.com

Allow hard-coded routes for certain hosts #168

Closed alandekok closed 7 years ago

alandekok commented 9 years ago

I have a number of machines with static IPs. It would be nice to tell ZT1 that "hash X is at IP Y". And that routing can be done directly, without bouncing through the supernodes. And if hash X shows up at another IP, it's a problem.

adamierymenko commented 9 years ago

That wouldn't be hard to do -- it could be a local configuration option. Added to backlog.

jade-42 commented 8 years ago

This would be really useful for me. I have two members in a network that can't directly talk to each other, but a third member can talk to both. This would allow them to have a relatively low latency connection, without going through super-nodes.

heri16 commented 8 years ago

:+1: super-nodes should be last-resort, not the default fallback.

adamierymenko commented 8 years ago

Question is how to do this idiomatically and in a user-friendly way. The path of host-specific configuration leads down the road to things that are as awful to deal with as IPSec. Features are bugs.

We're already ahead since the system as-is works without such routes needing to be specified anywhere. The question is how to allow them to be specified.

What do you think about this?

  1. Pick a TLD that the DNS resolver allows but that will never be used in the wild. We'll have to look into this because we don't want to introduce a denial of service vector where someone can spoof DNS and force things to try paths that don't work. Of course if they only try paths and still follow the current path selection logic that would not be an issue.
  2. Allow you to define hosts locally (/etc/hosts or a local DNS server) in that faux-domain that are named according to the device ID. Example: f33df00d01.ztpath.SOMETLD. These could be DNS TXT records, which would allow them to contain both an IP and a port.
  3. The service looks for these. If it finds them, it tells the core to try the endpoint specified. If that path works it gets used.
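
A minimal sketch of the lookup in the three steps above, in Python for brevity. The function names and the injectable `resolver` argument are illustrative assumptions, not ZeroTier code, and the TLD is passed in as a parameter because the thread has not picked one.

```python
import socket
from typing import Callable, List


def hint_hostname(device_id: str, tld: str) -> str:
    """Build the hint name for a device, e.g. f33df00d01.ztpath.<tld>."""
    device_id = device_id.lower()
    if len(device_id) != 10 or any(c not in "0123456789abcdef" for c in device_id):
        raise ValueError("device ID must be 10 hex digits")
    return f"{device_id}.ztpath.{tld}"


def system_resolver(name: str) -> List[str]:
    """Resolve IPv4 addresses via the OS resolver (/etc/hosts or DNS)."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_DGRAM)
    except socket.gaierror:
        return []  # no hint defined: fall back to normal path discovery
    return sorted({info[4][0] for info in infos})


def candidate_addresses(
    device_id: str,
    tld: str,
    resolver: Callable[[str], List[str]] = system_resolver,
) -> List[str]:
    """Return addresses to *try*; they are hints, not trusted routes."""
    return resolver(hint_hostname(device_id, tld))
```

The result is only a list of candidates. Whether a path is actually used would still be decided by the existing path selection logic, which is what keeps a spoofed record from doing more than wasting a probe.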

What do you think?

adamierymenko commented 8 years ago

Actually TXT is problematic for several reasons. We could do this: 9993 is the default port, but an alternative can be specified by adding an additional false A record for 0.0.(alt port). If this is present that port will be used instead of 9993.

TXT is hard to query portably without bundling a third party DNS library as an add-on, and we want to limit that. Also, not all DNS servers and DNS UIs support it.
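
A rough sketch of how the "false A record" port convention might be decoded. The comment only says `0.0.(alt port)`, so the exact encoding here is an assumption: the last two octets are read as a big-endian 16-bit port. The function name is hypothetical.

```python
import ipaddress
from typing import Iterable, List, Tuple

DEFAULT_PORT = 9993  # default port named in the comment above


def split_hint_records(a_records: Iterable[str]) -> Tuple[List[str], int]:
    """Separate real addresses from a 0.0.x.y port marker.

    Assumption: 0.0.HI.LO encodes port HI*256 + LO.
    """
    port = DEFAULT_PORT
    addresses: List[str] = []
    for text in a_records:
        octets = ipaddress.IPv4Address(text).packed
        if octets[0] == 0 and octets[1] == 0:
            alt = (octets[2] << 8) | octets[3]
            if alt:  # ignore 0.0.0.0, which cannot name a port
                port = alt
        else:
            addresses.append(text)
    return addresses, port
```

Under that assumed encoding, a record set of `10.0.0.1` plus `0.0.39.16` would mean "try 10.0.0.1 on port 10000".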

adamierymenko commented 8 years ago

Also wanted to mention: we intentionally designed ZeroTier not to depend on DNS, since that adds more fragility and introduces a circular dependency if you are running DNS over ZeroTier itself. But it's okay to use DNS to gather hints about connectivity.

Fundamentally I think the current mode of operation should be the bread-and-butter but that it's okay to augment it with hints to make things more robust, faster, etc. That's tuning vs. mandatory manual configuration. A user-friendly system ships with acceptable default behavior but can allow users to tweak it for maximum performance.

alandekok commented 8 years ago

I think DNS is useful for the "positive" use-case. i.e. map DNS record to machine. The negative use-case is problematic. i.e. for systems which don't use DNS mappings. Is ZT1 going to do a DNS lookup for every device ID? That would be problematic.

The better way, I think, would be to allow ZT1 to save / load associations it's learned.

e.g. after it bounces through a supernode, it (at some point) determines that device ID X is at public IP Y. So why not save that? If it's in a simple format, users can edit it to their heart's content.

The saved configuration can be loaded on startup, and then treated normally. i.e. if a device ID isn't found at IP Y, then that "learned" association is discarded, and the system walks through the supernodes again.

This means that the only change is to the startup code. The "live" network code has no changes.

To be safe, you would probably want two files: one auto-saved by the system, and never touched by the user. The other one would be user-editable, and never changed by the system. Then, load the files at startup, and proceed as normal.
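
A minimal sketch of the two-file idea, in Python for brevity. The one-association-per-line format (`device_id ip/port`, with `#` comments) and all function names are assumptions made for illustration; nothing here is an existing ZT1 file format.

```python
from pathlib import Path
from typing import Dict, Set, Tuple

Endpoint = Tuple[str, int]
Associations = Dict[str, Set[Endpoint]]


def load_associations(path: Path) -> Associations:
    """Read 'device_id ip/port' lines; malformed lines are skipped."""
    table: Associations = {}
    if not path.exists():
        return table
    for line in path.read_text().splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        try:
            device_id, endpoint = line.split()
            ip, port = endpoint.rsplit("/", 1)
            table.setdefault(device_id.lower(), set()).add((ip, int(port)))
        except ValueError:
            continue  # a bad entry must never break startup
    return table


def load_startup_candidates(learned_file: Path, user_file: Path) -> Associations:
    """Merge both files into one set of candidates to try at startup."""
    merged = load_associations(learned_file)
    for device_id, endpoints in load_associations(user_file).items():
        merged.setdefault(device_id, set()).update(endpoints)
    return merged


def save_learned(learned_file: Path, learned: Associations) -> None:
    """Write only the system-owned file; the user file is never touched."""
    lines = [
        f"{dev} {ip}/{port}"
        for dev in sorted(learned)
        for ip, port in sorted(learned[dev])
    ]
    learned_file.write_text("\n".join(lines) + ("\n" if lines else ""))
```

Entries from both files are treated the same after loading: each is a candidate that gets discarded if the device is not actually found there.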

Even though it's 2015 (almost 2016!) editing a configuration file is still a normal thing to do.

janjaapbos commented 8 years ago

It would be nice if such secondary info could be applied to a ZT service without depending on root/system access rights on the OS. That would not work for /etc/hosts, and probably not for DNS either. This is only relevant for network containers though.

adamierymenko commented 8 years ago

That's a great point. I've considered introducing greater persistence in peer path information and re-trying old paths. Old paths are likely to be valid if the device hasn't moved and is behind either no NAT at all or a full-cone NAT. Symmetric and port-restricted NATs require connection re-setup every time from both ends.

Hmm... will think on this a bit more.

adamierymenko commented 8 years ago

But the nice thing about a convention-based manual specification feature is that it covers insane things like #225 -- crazy LANs where a valid path exists but finding it almost converges with the "hard AI problem." (Requires something as smart as a human.)

In those cases a user could just bite the bullet and set a config option.

Maybe I'm over-thinking it. Maybe it would be okay to just have a file in ZeroTier's home folder that if present is consulted for hints about potential routes. These hints are tried and used if they work, so if the file falls out of repair it doesn't break anything.

alandekok commented 8 years ago

The home directory / hints seems like the simplest approach. And yes, users will put terrible things in there. So relying on it will cause problems. Starting with it, and then discarding it if wrong, is the better approach.

janjaapbos commented 8 years ago

Yes, local config file is fine, and there are enough options to administrate them centrally or per datacenter. At the node management level you can determine whether local refinements are called for.

adamierymenko commented 8 years ago

The existing path selection logic covers that. ZeroTier can at any time attempt to use a path. If the path is confirmed via cryptographically-authenticated exchange then the path is considered valid. If a path is valid it's added to the sorted list of paths and used if it scores highest. Scoring is based on several things: latency, recentness of communication, and the class of address. Private addresses like 10.0.0.1 score higher than global ones since they are likely to be over a LAN instead of a WAN (faster).
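
To make the description above concrete, here is a toy scoring function in Python. The weights, the `PeerPath` type, and the restriction to RFC 1918 ranges are invented for illustration; the comment above names the inputs (latency, recentness, address class) but not how ZeroTier actually combines them.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional

# RFC 1918 ranges, checked explicitly so the example is deterministic.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
PRIVATE_BONUS = 100.0  # arbitrary illustrative weight


@dataclass
class PeerPath:
    address: str
    latency_ms: float
    seconds_since_last_receive: float
    confirmed: bool  # True once the authenticated exchange succeeded


def score(path: PeerPath) -> float:
    """Higher is better: low latency, recent traffic, private address."""
    s = -path.latency_ms - path.seconds_since_last_receive
    addr = ipaddress.ip_address(path.address)
    if any(addr in net for net in PRIVATE_NETS):
        s += PRIVATE_BONUS
    return s


def best_path(paths: Iterable[PeerPath]) -> Optional[PeerPath]:
    """Only confirmed paths are eligible; return the top scorer or None."""
    valid = [p for p in paths if p.confirmed]
    return max(valid, key=score, default=None)
```

With these made-up weights, a confirmed path to 10.0.0.1 at 20 ms beats a confirmed global path at 15 ms, while an unconfirmed path is never selected regardless of its address.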

heri16 commented 8 years ago

I think it is important to realize that private addresses do not always have better latency. In our case, we have an IPsec hub-and-spoke model on an Amazon VPN hub. Each location has nodes that can contact nodes in other locations via private IP, by going out of the IPsec gateway into the Internet, via an IPsec router in Amazon in another country, and back again via an IPsec tunnel into the other location. Naturally, latency could be worse than going over the Internet. Network controllers and humans should be able to define and tune the priority, just like how software like Connectify Dispatch allows users to classify each connection into 3 classes: Primary, Secondary, Backup.

  1. Primary - Free and large bandwidth but possible uplink congestion and connectivity problems like the internet.
  2. Secondary - Low-latency but limited bandwidth links like Metronet and MPLS.
  3. Backup - Metered Bandwidth links like Cellular 3G/4G.

Backup means don't use unless absolutely needed (likely Metered 3G Cellular Bandwidth). Secondary means use if it helps lower latency while all primaries are congested. There could be multiple primary (round-robin), and multiple secondary connections.
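
One possible reading of these three classes, sketched in Python. The `Link` type, the `congested` flag, and the exact fallback order are assumptions; this is not how Connectify Dispatch or ZeroTier is implemented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class LinkClass(Enum):
    PRIMARY = 1
    SECONDARY = 2
    BACKUP = 3


@dataclass(frozen=True)
class Link:
    name: str
    link_class: LinkClass
    up: bool = True
    congested: bool = False


def eligible_links(links: List[Link]) -> List[Link]:
    """Pick the set of links to spread traffic over right now."""
    up = [l for l in links if l.up]

    def in_class(c: LinkClass) -> List[Link]:
        return [l for l in up if l.link_class is c]

    primaries = in_class(LinkClass.PRIMARY)
    free_primaries = [l for l in primaries if not l.congested]
    if free_primaries:
        return free_primaries  # round-robin over these
    secondaries = in_class(LinkClass.SECONDARY)
    if secondaries:
        return secondaries  # all primaries are congested or down
    if primaries:
        return primaries  # congested, but still better than metered
    return in_class(LinkClass.BACKUP)  # last resort only
```

A caller could round-robin over the returned list, for example with `itertools.cycle(eligible_links(links))`, and recompute the list whenever link state changes.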

heri16 commented 8 years ago

The interesting thing for me is that even though 2 paths appear in zerotier-cli listpeers, the private connection is always chosen. And when the private connection is down, ZeroTier One just throws its hands in the air and blackholes packets, instead of using the Internet route which it obviously knows about.

adamierymenko commented 8 years ago

How long does it "throw its hands in the air?" It should technically start using the other link after ~60-120 seconds in current code unless something is wrong.

I'm thinking about improved dead path detection. One possibility I've been considering (and partly inspired by things in BGP BPD) is to add a feature where ZT confirms the link on cessation of traffic.

If we are having a conversation, if I send you a packet I should generally get a reply within at least a few seconds. If I don't, maybe I'll send you a confirmation just to ask "hey, are you still there?" If you're still there great... maybe our conversation is over. But if you don't answer the confirmation it probably means the link is down and I should reset and re-establish.

If the confirm on cessation timeout were, say, 2s, this would drop dead path failover from 60-120 seconds down to 2-4 seconds at the cost of two extra tiny packets at the end of every burst of traffic.
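
The confirm-on-cessation idea could be modelled as a small state machine. This is a sketch under assumptions: the class and method names are hypothetical, time is passed in explicitly so the logic is easy to test, and the 2 s timeout is the example value from the paragraph above.

```python
from typing import Optional


class CessationMonitor:
    """Tracks one path; the caller feeds in send/receive events and polls."""

    def __init__(self, timeout: float = 2.0) -> None:
        self.timeout = timeout
        self.awaiting_since: Optional[float] = None  # oldest unanswered send
        self.probe_sent_at: Optional[float] = None

    def on_send(self, now: float) -> None:
        if self.awaiting_since is None:
            self.awaiting_since = now

    def on_receive(self, now: float) -> None:
        # Any authenticated packet from the peer proves the path is alive.
        self.awaiting_since = None
        self.probe_sent_at = None

    def poll(self, now: float) -> str:
        """Return 'ok', 'send_probe', or 'dead'."""
        if self.probe_sent_at is not None:
            if now - self.probe_sent_at >= self.timeout:
                return "dead"  # probe unanswered: reset and re-establish
            return "ok"
        if (
            self.awaiting_since is not None
            and now - self.awaiting_since >= self.timeout
        ):
            self.probe_sent_at = now
            return "send_probe"  # "hey, are you still there?"
        return "ok"
```

With a 2 s timeout, a packet sent at t=0 that gets no reply leads to a probe at t=2 and a dead verdict at t=4. If the peer answers the probe, the monitor simply goes quiet again.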

adamierymenko commented 8 years ago

BTW it sounds like it would also be useful to be able to blacklist paths. Maybe our proposed ZeroTier equivalent of /etc/hosts could allow paths to be whitelisted as "please try this path" or blacklisted as "do not use."

Something like:

 deadbeef01 10.0.0.1/9993 -192.168.0.1 -192.168.100.1
 feedbeef02 10.0.0.2/9993 -192.168.0.2 -192.168.100.2

This would tell my node to always try path 10.0.0.1/9993 to deadbeef01 and to never use 192.168.0.1 or 192.168.100.1 (any port) to talk to that endpoint.
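
For illustration, a line in that proposed format could be parsed like this (Python sketch; `PeerHints`, `parse_hint_line`, and `path_allowed` are hypothetical names, and defaulting to port 9993 when no port is given is an assumption):

```python
from dataclasses import dataclass, field
from typing import List, Tuple

DEFAULT_PORT = 9993  # assumed default when a hint omits "/port"


@dataclass
class PeerHints:
    device_id: str
    try_paths: List[Tuple[str, int]] = field(default_factory=list)
    blacklist: List[str] = field(default_factory=list)


def parse_hint_line(line: str) -> PeerHints:
    """Parse '<device id> <ip>/<port> ... -<ip> ...' into a PeerHints."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty hint line")
    hints = PeerHints(tokens[0].lower())
    for tok in tokens[1:]:
        if tok.startswith("-"):
            hints.blacklist.append(tok[1:])  # never use, any port
        else:
            ip, _, port = tok.partition("/")
            hints.try_paths.append((ip, int(port) if port else DEFAULT_PORT))
    return hints


def path_allowed(hints: PeerHints, ip: str) -> bool:
    """Blacklisted addresses are rejected regardless of port."""
    return ip not in hints.blacklist
```

Parsing the first example line yields one path to try, `("10.0.0.1", 9993)`, and two blacklisted addresses.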

heri16 commented 8 years ago

About 5 to 10 mins of wall clock.

alandekok commented 8 years ago

If you want to detect dead links, I would suggest BFD (https://tools.ietf.org/html/rfc5880). It's meant for link status detection. I've implemented it before. It was ~2K LoC with comments and whitespace (https://github.com/FreeRADIUS/freeradius-server/blob/v3.1.x/src/modules/proto_bfd/proto_bfd.c)
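
As a rough guide to what BFD would give you, here is the detection-time calculation for asynchronous mode as I read RFC 5880 section 6.8.4, sketched in Python. The function and parameter names are illustrative, and the interval values in the note below are arbitrary examples.

```python
def bfd_detection_time_us(
    remote_detect_mult: int,
    local_required_min_rx_us: int,
    remote_desired_min_tx_us: int,
) -> int:
    """Time without packets after which the session is declared down.

    Asynchronous mode: the remote Detect Mult times the agreed remote
    transmit interval, which is the larger of what we are willing to
    receive and what the remote wants to send. Values in microseconds.
    """
    agreed_interval_us = max(local_required_min_rx_us, remote_desired_min_tx_us)
    return remote_detect_mult * agreed_interval_us
```

For example, a multiplier of 3 with a 300 ms agreed interval gives a detection time of 900 ms.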

You could also use bandwidth estimation tools (http://www.icir.org/models/tools.html) to see how large the various connections are, and what the latency is.

These aren't trivial changes, but together they would mean that ZT1 would automatically discover the "best" connection, and also determine if / when that connection is unresponsive.

adamierymenko commented 8 years ago

Also see discussion on #270

adamierymenko commented 7 years ago

Been done for a while.