discoproject / disco

a Map/Reduce framework for distributed computing
http://discoproject.org
BSD 3-Clause "New" or "Revised" License

DNS mappings to IP addresses not being cached? #634

Closed ghost closed 8 years ago

ghost commented 8 years ago

I started a small Disco cluster on Amazon EC2 machines this weekend and wondered why regional traffic showed up: 7 GB in two afternoons.

All instances were in the same availability zone (us-east-1d), so communication between them should not count as regional traffic (unless they use public IPs, and I searched for public IPs as well while tracking down the problem).

Debugging further, I found that the DNS lookups talk to another availability zone, which is what makes up the regional traffic. In a short sample I counted 18 DNS requests in 10 seconds from the master alone.

So I assume Disco does not cache DNS-to-IP mappings.
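To make clear what I mean by DNS-to-IP caching, here is a minimal sketch in Python. It is not Disco code and not how Disco resolves names; `CachingResolver`, its fixed TTL, and the `lookup`/`clock` parameters are all hypothetical names made up for illustration. A real cache would also have to honor the TTL of the DNS records, which this sketch does not do.

```python
import socket
import time


class CachingResolver:
    """Tiny in-process hostname -> IP cache with a fixed TTL (illustration only)."""

    def __init__(self, ttl_seconds=60.0, lookup=socket.gethostbyname,
                 clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._lookup = lookup      # function that performs the real DNS lookup
        self._clock = clock        # injectable so expiry can be tested
        self._cache = {}           # hostname -> (ip, expiry_time)
        self.misses = 0            # number of lookups that hit the resolver

    def resolve(self, hostname):
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]        # still fresh: no DNS request is sent
        self.misses += 1
        ip = self._lookup(hostname)
        self._cache[hostname] = (ip, now + self.ttl_seconds)
        return ip


if __name__ == "__main__":
    resolver = CachingResolver(ttl_seconds=30)
    for _ in range(5):
        resolver.resolve("localhost")
    # Only the first call should have reached the system resolver.
    print("lookups sent to the resolver:", resolver.misses)
```

With something like this, repeated lookups of the same node name within the TTL would not leave the machine at all.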

Is this intentional, simply missing, or a bug? (Depending on the answer, you can label this issue as won't fix, improvement, or bug.)

Of course there are workarounds (at least according to a friend, who suggested I install dnsmasq).

Update (2016-02-08): I investigated this during the past week. I installed dnsmasq and logged the traffic with tcpdump; DNS should only make up about 50 KB per machine (according to tcpdump over a Disco run of about 2 hours). Still, I have 600 MB of regional traffic, so I need to investigate further.

So I guess the uncached DNS lookups were not responsible for all 7 GB either (although the number of DNS requests dropped significantly after installing dnsmasq).

ghost commented 8 years ago

Finally found the culprit for the huge amount of traffic: Ubuntu has an update server inside AWS, but probably not inside my AZ, or else it gets resolved to a public IP. Either way, apt-get update causes so-called regional traffic.

Unfortunately, we now do not know how much traffic the DNS requests themselves generate. I can only tell you that there are about 5 DNS requests per second on the master. According to a friend of mine, internal DNS caching is a weird idea anyway; he says using dnsmasq was the correct way to handle this issue.
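A rough upper-bound estimate is still possible. The sketch below multiplies the observed rate of about 5 requests per second by an assumed size per query/response exchange. The 200 bytes per exchange is purely my assumption, not something I measured, and the function names are made up for this example.

```python
def estimate_dns_traffic_bytes(requests_per_second, duration_seconds,
                               bytes_per_exchange=200):
    """Estimate total DNS traffic in bytes.

    bytes_per_exchange is an ASSUMED size for one query plus its response;
    replace it with a value measured via tcpdump for a real estimate.
    """
    return requests_per_second * duration_seconds * bytes_per_exchange


def format_megabytes(num_bytes):
    """Format a byte count as decimal megabytes with one decimal place."""
    return f"{num_bytes / 1_000_000:.1f} MB"


if __name__ == "__main__":
    # 5 requests per second is the rate observed on the master.
    for hours in (1, 2, 24):
        total = estimate_dns_traffic_bytes(5, hours * 3600)
        print(f"{hours:>2} h at 5 req/s: {format_megabytes(total)}")
```

Under these assumptions a full day comes to roughly 86 MB, so uncached DNS on the master alone would stay far below the 7 GB seen in two afternoons.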

So, case closed.