feat:dns load-balance

dyingtime commented 2 years ago

Return a random IP to support DNS-based load balancing.

szmarczak commented 2 years ago

Load balancing should be done by the server, not the client. Doing so can (and probably will) result in cache mismatches.

Even Node.js returns the first one: https://github.com/nodejs/node/blob/4d5ff25a813fd18939c9f76b17e36291e3ea15c3/lib/dns.js#L113

link89 commented 2 years ago

Hi @szmarczak, I think Node.js returning the first one is OK because it doesn't cache the resolved result locally, and the DNS server can return the address list in a random or round-robin order every time. But once they are cached by this lib, the DNS resolution results will be fixed until the cache expires, so I think introducing a randomized return would be better in this case.

szmarczak commented 2 years ago

> Node.js returning the first one is OK because it doesn't cache the resolved result locally

Not necessarily, no. Caching may be done at a kernel level; Windows is an example of this. Not sure if Windows respects TTL though.

> the DNS server can return the address list in a random or round-robin order every time

That's the point! Round-robin is done by the server, not the client. Wikipedia says this as well.

> the DNS resolution results will be fixed until the cache expires, so I think introducing a randomized return would be better in this case.

No. The order of the returned entries matters. Browsers try the first one, and if it fails, the next one. Failover is outside of our scope.

link89 commented 2 years ago

Hi @szmarczak, thanks for the reply. We aren't seeking such a solution for no reason. We suffer from the infamous Node.js EAI_AGAIN issue when the load on our microservice component is high (More Reading). It turns out that DNS caching solves the problem, but as a side effect it makes load balancing less effective, since, as you may know, k8s load balancing is based on DNS resolution.

Let's say the TTL of the DNS response is 10 seconds. Without the cache, load balancing works correctly, but it suffers from the EAI_AGAIN problem from time to time. With the cache, all traffic during those 10 seconds is forwarded to a single endpoint.
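The effect described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: `TtlDnsCache` and `makeRotatingResolver` are hypothetical names invented for illustration, the addresses and hostname are made up, and the cache is a deliberately naive stand-in rather than this library's implementation.

```javascript
// Hypothetical, minimal TTL cache. NOT the library's actual implementation.
class TtlDnsCache {
  constructor(resolve, ttlMs, now = () => Date.now()) {
    this.resolve = resolve; // (hostname) => array of IP strings, in server order
    this.ttlMs = ttlMs;     // how long a resolved answer is reused
    this.now = now;         // injectable clock so the sketch is deterministic
    this.entries = new Map();
  }

  // Returns the first cached address, re-resolving only once the TTL has expired.
  lookup(hostname) {
    const cached = this.entries.get(hostname);
    if (cached && cached.expires > this.now()) {
      return cached.addresses[0];
    }
    const addresses = this.resolve(hostname);
    this.entries.set(hostname, { addresses, expires: this.now() + this.ttlMs });
    return addresses[0];
  }
}

// Simulated DNS server that rotates its answer on every query (round-robin).
function makeRotatingResolver(ips) {
  let offset = 0;
  return () => {
    const rotated = ips.slice(offset).concat(ips.slice(0, offset));
    offset = (offset + 1) % ips.length;
    return rotated;
  };
}

let clock = 0; // fake time in milliseconds
const cache = new TtlDnsCache(
  makeRotatingResolver(['10.0.0.1', '10.0.0.2']),
  10_000,
  () => clock
);

// Four lookups inside the 10-second window: only the first one hits the resolver.
const withinTtl = [];
for (clock = 0; clock < 10_000; clock += 2_500) {
  withinTtl.push(cache.lookup('service.local'));
}

// Once the TTL has passed, the resolver is queried again and the rotation shows up.
clock = 10_000;
const afterTtl = cache.lookup('service.local');

console.log(withinTtl); // [ '10.0.0.1', '10.0.0.1', '10.0.0.1', '10.0.0.1' ]
console.log(afterTtl);  // 10.0.0.2
```

The server-side rotation only becomes visible to the client when the cache re-resolves, which is why every request within one TTL window lands on the same endpoint.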

I believe this lib is designed for Node.js, so it has nothing to do with browsers. Actually, adding randomness to this library makes it more robust to errors. Consider the following scenario:

A request is sent to somehost.com. DNS returns a list with 2 IP addresses, 172.1.1.100 and 172.1.1.101, and the result is cached by this lib for 10 seconds. Unfortunately, 172.1.1.100 is now down. Without randomness, no matter how many times the client retries the request, it will always fail, because it keeps getting the broken IP address until the cache expires. With randomness, a retry at least has a chance of reaching the working one.
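A rough sketch of that scenario, comparing "always return the first cached entry" with "return a random cached entry" across retries. The helper names (`pickFirst`, `pickRandom`, `attemptsUntilSuccess`, `isUp`) are hypothetical and the address values are illustrative; this is not how the library behaves today, only an illustration of the proposal.

```javascript
// Addresses as they would sit in the cache for the TTL window.
// Assumption for this sketch: the first one is down, the second one is healthy.
const cached = ['172.1.1.100', '172.1.1.101'];
const isUp = (ip) => ip !== '172.1.1.100';

// Strategy 1: always return the first cached entry.
const pickFirst = (addresses) => addresses[0];

// Strategy 2: return a uniformly random cached entry.
// `rng` is injectable so the choice can be made deterministic in tests.
const pickRandom = (addresses, rng = Math.random) =>
  addresses[Math.floor(rng() * addresses.length)];

// Retry up to maxAttempts times. Returns the 1-based number of the attempt
// that reached a healthy address, or null if every attempt failed.
function attemptsUntilSuccess(pick, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (isUp(pick(cached))) return attempt;
  }
  return null;
}

// Every retry resolves to the dead address, so this is always null.
const firstOnly = attemptsUntilSuccess(pickFirst, 5);

// Each retry has a 50% chance of reaching the healthy address, so this is
// a small attempt number in most runs, though it can still be null.
const randomized = attemptsUntilSuccess(pickRandom, 5);

console.log({ firstOnly, randomized });
```

Note that random selection only improves the odds per retry; it does not guarantee that a given request avoids the dead address.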

szmarczak commented 2 years ago

> I believe this lib is designed for Node.js, so it has nothing to do with browsers.

DNS works in the same way regardless of the environment used.

This is not an issue in CacheableLookup, but in the client you use (HTTP / TCP / anything). You should open an issue there instead.

> it keeps getting the broken IP address until the cache expires.

Like I said before, the client (for example a browser) needs to have a list of IPs and try the other ones if the first one fails. This package enables this, but is not a failover implementation.
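For illustration, client-side failover over a list of addresses might look roughly like the sketch below. `connectWithFailover` and `fakeConnect` are hypothetical names, and `connect` stands in for whatever the HTTP/TCP client actually does. In Node.js, a list like this can be obtained from `dns.lookup` with the `all: true` option, which yields `{ address, family }` entries; plain strings are used here to keep the sketch short.

```javascript
// Try each address in order and return the first successful connection.
// `connect` is a caller-supplied function: (address) => Promise<connection>.
async function connectWithFailover(addresses, connect) {
  const errors = [];
  for (const address of addresses) {
    try {
      return { address, connection: await connect(address) };
    } catch (error) {
      errors.push(error); // remember the failure and move on to the next address
    }
  }
  // Every address failed (or the list was empty): surface all collected errors.
  throw new AggregateError(errors, 'All addresses failed');
}

// Stand-in for a real connection attempt: the first address is assumed to be down.
const fakeConnect = async (ip) => {
  if (ip === '172.1.1.100') throw new Error(`connect ECONNREFUSED ${ip}`);
  return `socket to ${ip}`;
};

connectWithFailover(['172.1.1.100', '172.1.1.101'], fakeConnect)
  .then(({ address }) => console.log(address)); // prints "172.1.1.101"
```

Because the order of the list is preserved, this keeps whatever preference the DNS server expressed while still recovering from a dead first entry.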

Failover is outside of our scope. I see no point in continuing this conversation, so I'm locking this issue as resolved.

szmarczak / cacheable-lookup

feat:dns load-balance #57