Open bboreham opened 10 years ago
When provisioning in certain public clouds, e.g. GCE and Azure, free native DNS is available. It is very easy to use and does not require static IPs.
So I can provision nodes with init scripts that source peer names from a file on disk, where the names are local DNS names. During provisioning, I don't necessarily control the order, so DNS records might not have been created for some peers yet.
Instead of re-resolving names each time we want to connect, I would create a Resolver class that keeps a cache of previously resolved names. The ConnectionMaker could hold an instance of this Resolver. If a name is found in the cache and the resolution is recent enough (as determined by a TTL), the cached value would be returned. Otherwise, the Resolver would perform the resolution, save the value in the cache and then return it. An open question is what to do when the cache is full.
The cache could also hold negative values (for names that do not resolve to anything), and a different timeout would apply in these cases...
An improved solution could use an increasing timeout, so each time the Resolver resolves a name and obtains the same value that was previously stored, it increases the TTL for that value.
"Instead of re-resolving names each time we want to connect, I would create a Resolver class"
That just adds complexity; performance really isn't an issue here.
There is a rather thorny issue here, which I discovered only very late into #548 (and which isn't addressed there)...
The original suggestion was that we want to "re-resolve the name each time we want to connect".
However, the logic for determining whether we want to connect to something is based on the resolved address, since we need to check whether we already have a connection to that address. Furthermore, we also want to re-resolve names we previously resolved successfully, in order to account for changes to DNS.
The correct solution would base re-resolution decisions on the TTL supplied by DNS. But this requires implementing a full-blown resolver. That is no small feat - Go's built-in resolver, which is quite a minimal affair, is many hundreds of lines of code (and, annoyingly, does not expose the TTL in its API).
So #548 does something very crude. We resolve addresses every time the ConnectionMaker figures out what to connect to, and we make sure this happens at least once every 10 minutes.
We could improve on that by introducing our own caching, but that just seems wrong.
Hence I shall close #548. If we ever get back to this, it may make for a reasonable starting point.
If weave is launched with one or more peers given as names rather than IP addresses, then the simple behaviour added by #111 is to resolve those names once at startup and then use that IP address from then on.
Better to re-resolve the name each time we want to connect, so we get an up-to-date IP address.
Needs some modifications to the structure in ConnectionMaker to remember the resolved IP address while that connection attempt is in flight.
This behaviour would also open the possibility that we have an address which does not resolve - this is perhaps a transitory condition so we should re-try that address later.