seiferff / distcc

Automatically exported from code.google.com/p/distcc
GNU General Public License v2.0

Distcc performs a DNS resolution for _EVERY_ compile #107

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Answering the following questions is a big help:

1. What version of distcc are you using (e.g. "2.7.1")?  You can run "distcc 
--version" to see.  If you got distcc from a distribution package rather than 
building from source, please say which one.

2. What platform are you running on (e.g. "Red Hat 8.0", "HP-UX 11.11")?  What 
compiler are you using ("gcc 3.3")?  Run "uname -a" and "cc --version" to see.

3. What were you trying to do (e.g. "install distcc", "build Mozilla")?

4. What went wrong?  Did you get an error message, did it hang, did it build a 
program that didn't work, did it not distribute compilation to machines that 
ought to get it?

5. If you have an example of a compiler invocation that failed, quote it, in 
full e.g.:
   distcc gcc -DHAVE_CONFIG_H -D_GNU_SOURCE -I./src \
     "-DSYSCONFDIR=\"/etc/\"" -I./lzo -g -O2 -W -Wall -W \
     -Wimplicit -Wshadow -Wpointer-arith -Wcast-align \
     -Wwrite-strings -Waggregate-return -Wstrict-prototypes \
     -Wmissing-prototypes -Wnested-externs -o src/clirpc.o \
     -c src/clirpc.c

6. What error logging do you get?  Turn on client and server error logging.  On 
the client, set these environment variables, and try to reproduce the problem: 
`export DISTCC_VERBOSE=1 DISTCC_LOG=/tmp/distcc.log`.  Start the server with 
the --verbose option. If the problem is intermittent, leave logging enabled and 
then pull out the lines from the log file when the problem recurs.

7. If you got an error message on stderr, quote that error exactly. Find the 
lines in the log files pertaining to the compile, and include all of them in 
your report, by looking at the process ID in square brackets. If you can't work 
that out, quote the last few hundred lines leading up to the failure.

Original issue reported on code.google.com by JCalvinO...@gmail.com on 8 Apr 2012 at 5:37

GoogleCodeExporter commented 8 years ago
Apologies, I accidentally hit enter before I had typed the message... and 
apparently you can't edit these...

If you specify a DNS name in the hosts list, distcc will attempt to resolve it 
for _every_ remote compile... TBH, I think it would be fine if it just resolved 
it once, but even if you want to be really pedantic about it, shouldn't it at 
least wait for the duration of the TTL? Or a configurable time window? Even in 
the best case, that's 20ms of latency for every compile...

I get that distcc is executed anew every time it compiles something, but it 
doesn't seem as though it would be so difficult to add some sort of DNS 
caching. Would you be interested in this? I'd be happy to write up a patch for 
you.
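
For illustration, here is a minimal sketch of what such caching could look like, in Python rather than distcc's C. Because each distcc invocation is a new process, the cache has to live outside the process; this sketch assumes a small JSON file. The file path, the 300-second lifetime, and the `cached_resolve` name are all hypothetical, and the fixed lifetime stands in for the record's real TTL, which `gethostbyname` does not expose.

```python
import json
import os
import socket
import tempfile
import time

# Hypothetical cache location and lifetime; neither is a distcc setting.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "distcc_dns_cache.json")
CACHE_TTL_SECONDS = 300


def cached_resolve(host, cache_path=CACHE_PATH, ttl=CACHE_TTL_SECONDS,
                   resolver=socket.gethostbyname, now=time.time):
    """Return an IPv4 address for host, reusing a recent on-disk answer."""
    try:
        with open(cache_path) as f:
            cache = json.load(f)
    except (OSError, ValueError):
        cache = {}  # missing or corrupt cache file: start fresh
    if not isinstance(cache, dict):
        cache = {}

    entry = cache.get(host)
    if entry and now() - entry["stored_at"] < ttl:
        return entry["address"]  # cache hit: no DNS query is made

    address = resolver(host)  # cache miss or expired entry
    cache[host] = {"address": address, "stored_at": now()}

    # Write to a temporary file and rename it, so that a concurrent
    # reader never sees a half-written cache.
    tmp_path = "%s.%d.tmp" % (cache_path, os.getpid())
    with open(tmp_path, "w") as f:
        json.dump(cache, f)
    os.replace(tmp_path, cache_path)
    return address
```

With many parallel compiles, several processes may miss at the same moment and each perform one lookup; the rename keeps the file consistent, but the sketch does not try to prevent that duplicated work.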

`distcc --version`:
distcc 3.1 x86_64-pc-linux-gnu
  (protocols 1, 2 and 3) (default port 3632)
  built Mar 24 2012 18:14:36

`gcc --version`:
gcc (Gentoo 4.5.3-r2 p1.1, pie-0.4.7) 4.5.3

`uname -a`:
Linux Beethoven 3.4.0-rc2-amd #8 SMP Sun Apr 8 06:30:22 CDT 2012 x86_64 AMD 
FX(tm)-8120 Eight-Core Processor AuthenticAMD GNU/Linux

Original comment by JCalvinO...@gmail.com on 8 Apr 2012 at 5:46

GoogleCodeExporter commented 8 years ago
In my opinion, DNS caching is not something each application should implement 
considering something like nscd exists.

A latency of 20ms sounds like you're running neither nscd (part of glibc, gives 
me times <0.4ms for most names on repeated queries) nor a caching forwarder in 
your LAN (e.g. dnsmasq, gives 2-3ms for all names on repeated queries). There's 
also `unscd`, a "simple & stable nscd replacement". I don't use it myself 
because I have no problem with nscd.

Measurements were done with:
python -c "from socket import gethostbyname; from time import time; start=time(); gethostbyname('eris'); stop=time(); print(stop-start)"

using different host names

To try nscd: "eselect rc start nscd"

Original comment by Domo.Sok...@gmail.com on 12 Aug 2012 at 12:02

GoogleCodeExporter commented 8 years ago
I agree with you in general about applications not caching DNS requests, but I 
think this is something of an exception, given the frequency with which the 
lookups occur. Yes, I could improve my network's DNS latency, but that's not 
the point: I think you would find that in the wild, most people don't have 
low-latency DNS situations.

Even 0.4 ms per lookup is still a lot of time when added up. In a very large 
project (like the kernel), that could easily add up to over a second.
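
As a rough check of that arithmetic, the sketch below multiplies a per-lookup latency by a number of compiles. The figure of 3000 remote compiles is an assumed, illustrative value and not a measurement of any kernel build; the latencies are the ones quoted in this thread (0.4 ms with nscd, 2-3 ms with dnsmasq, 20 ms without caching).

```python
def total_lookup_seconds(compiles, latency_ms):
    """Cumulative DNS latency if every compile performs one lookup."""
    return compiles * latency_ms / 1000.0


# 3000 compiles is an assumed figure for illustration only.
ASSUMED_COMPILES = 3000

for latency_ms in (0.4, 2.5, 20.0):
    seconds = total_lookup_seconds(ASSUMED_COMPILES, latency_ms)
    print("%5.1f ms per lookup -> %6.2f s in total" % (latency_ms, seconds))
```

Under that assumption, 0.4 ms per lookup sums to 1.2 seconds and 20 ms to 60 seconds. This is latency summed over all compiles; with parallel jobs the lookups overlap, so the effect on wall-clock build time would be smaller.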

Original comment by JCalvinO...@gmail.com on 12 Aug 2012 at 5:51

GoogleCodeExporter commented 8 years ago
If you use pump mode and set DISTCC_POTENTIAL_HOSTS, which is what I would 
recommend, then DNS resolution will only happen once per build, not once per 
compile; the pump script will run "lsdistcc" with the "-n" flag to set 
DISTCC_HOSTS to the numeric IP addresses of the reachable distcc servers in 
DISTCC_POTENTIAL_HOSTS.
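
The idea of resolving once per build can be sketched as follows. This is not how lsdistcc is implemented: it only turns names into numeric addresses and does not check whether a distcc server is reachable. The `numeric_hosts` function is hypothetical, and it assumes a space-separated list of bare host names without per-host options.

```python
import socket


def numeric_hosts(potential_hosts, resolver=socket.gethostbyname):
    """Resolve each name once and return the addresses, space-separated."""
    addresses = []
    for name in potential_hosts.split():
        try:
            addresses.append(resolver(name))
        except OSError:
            continue  # name does not resolve: leave that host out
    return " ".join(addresses)
```

A build wrapper could call this once on the value of DISTCC_POTENTIAL_HOSTS and export the result as DISTCC_HOSTS, so that the individual compiles only ever see numeric addresses.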

Original comment by fergus.h...@gmail.com on 4 Sep 2012 at 4:20

GoogleCodeExporter commented 8 years ago
Where is DISTCC_POTENTIAL_HOSTS described?

Original comment by dave@boostpro.com on 14 Mar 2014 at 1:40

GoogleCodeExporter commented 8 years ago
DISTCC_POTENTIAL_HOSTS is documented in "man pump".

Original comment by fer...@google.com on 14 Mar 2014 at 1:45