akhilravidas / distcc

Automatically exported from code.google.com/p/distcc
GNU General Public License v2.0

distcc --show-hosts fails when using Avahi with IPv6 support and +zeroconf for distcc. #42

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Answering the following questions is a big help:

1. distcc 3.0 i686-pc-linux-gnu
  (protocols 1, 2 and 3) (default port 3632)
  built Mar 11 2009 14:16:41
Copyright (C) 2002, 2003, 2004 by Martin Pool.
Includes miniLZO (C) 1996-2002 by Markus Franz Xaver Johannes Oberhumer.
Portions Copyright (C) 2007-2008 Google.

distcc comes with ABSOLUTELY NO WARRANTY.  distcc is free software, and
you may use, modify and redistribute it under the terms of the GNU 
General Public License version 2 or later.

Built with Zeroconf support.

2. Linux localhost.localdomain 2.6.25-gentoo-r9 #2 SMP Fri Nov 14 09:35:46
GMT 2008 i686 Intel(R) Xeon(TM) CPU 2.40GHz GenuineIntel GNU/Linux

gcc (GCC) 4.1.2 (Gentoo 4.1.2 p1.1)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

3. Compile any software package. Also run distcc --show-hosts.

4. Error message: unable to parse the host file.
cat .distcc/zeroconf/hosts
10.0.0.179:3632/16
10.0.0.135:3632/8
fe80::2c0:9fff:fe3d:1737:3632/16
2001:470:1f09:22e:214:22ff:fe2d:1d7f:3632/8

5. distcc --show-hosts
distcc[29525] (dcc_parse_tcp_host) ERROR: invalid tcp port specification in
":2c0:9fff:fe3d:1737:3632/16
2001:470:1f09:22e:214:22ff:fe2d:1d7f:3632/8
"
distcc[29525] (dcc_zeroconf_add_hosts) CRITICAL! failed to parse host file.

distcc[29525] (dcc_show_hosts) CRITICAL! Failed to get host list

6. See 5: the IPv6 address in the error message is missing its beginning. It
should start with fe80.

7. See 5.

Original issue reported on code.google.com by goo...@hayward.uk.com on 12 Mar 2009 at 11:07

GoogleCodeExporter commented 9 years ago
What is in your ~/.distcc/hosts file?

If you have IPv6 addresses in your hosts file, then I think you need to put the 
IPv6 addresses inside square brackets.
If you have only +zeroconf in your hosts file, then I guess it is a bug in 
distcc, but the solution probably involves additional square brackets.

Original comment by fergus.h...@gmail.com on 12 Mar 2009 at 11:13

GoogleCodeExporter commented 9 years ago
The only thing in the hosts file is +zeroconf.

The first IPv6 address is wrong: distcc --show-hosts prints
:2c0:9fff:fe3d:1737:3632/16, but it should be fe80::2c0:9fff:fe3d:1737:3632/16,
as in .distcc/zeroconf/hosts. The second address,
2001:470:1f09:22e:214:22ff:fe2d:1d7f:3632/8, is the same in both
.distcc/zeroconf/hosts and the distcc --show-hosts output.

I am not convinced that square brackets are the fix when the first address is
clearly wrong, though they may be required as well.

Original comment by goo...@hayward.uk.com on 14 Mar 2009 at 9:47

GoogleCodeExporter commented 9 years ago
You are right that the hosts file requires square brackets around the IPv6 
address. Just had a look in the source.

static int dcc_parse_tcp_host(struct dcc_hostdef *hostdef,
                              const char * const token_start)
{
    int ret;
    const char *token = token_start;

    if (token[0] == '[') {
        /* We have an IPv6 Address */
        if ((ret = dcc_dup_part(&token, &hostdef->hostname, "/] \t\n\r\f,")))
            return ret;
        if (token[0] != ']') {
            rs_log_error("IPv6 Hostname requires closing ']'");
            return EXIT_BAD_HOSTSPEC;
        }
        token++;

It must be that the Zeroconf code that generates the hosts file in the first 
place writes an incorrect IPv6 address. I'm going to poke around and see if I
can find something obvious.

Original comment by bradphe...@gmail.com on 8 Oct 2009 at 12:31
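
For reference, here is a minimal sketch of what parsing a bracketed entry such as [fe80::2c0:9fff:fe3d:1737]:3632/16 could look like. It is not distcc's implementation: struct host_entry, parse_number and parse_bracketed_host are hypothetical names made up for this example, and the "[address]:port/slots" layout is assumed from the hosts file shown in the original report.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; distcc itself uses struct dcc_hostdef. */
struct host_entry {
    char host[64];
    int port;
    int slots;
};

/* Parse a decimal number in the range 1..max at *p and advance *p past it.
 * Returns -1 if no digits were found or the value is out of range.
 * Note: strtol also accepts leading whitespace and a '+' sign; a stricter
 * parser would reject those. */
static long parse_number(const char **p, long max)
{
    char *end;
    long v = strtol(*p, &end, 10);

    if (end == *p || v < 1 || v > max)
        return -1;
    *p = end;
    return v;
}

/* Parse "[ipv6-address]:port/slots", optionally followed by a newline.
 * Returns 0 on success and -1 on malformed input. */
static int parse_bracketed_host(const char *spec, struct host_entry *out)
{
    const char *rb, *p;
    size_t n;
    long v;

    assert(spec != NULL && out != NULL);
    if (spec[0] != '[')
        return -1;
    rb = strchr(spec, ']');
    if (rb == NULL)
        return -1;                      /* missing closing ']' */
    n = (size_t)(rb - spec) - 1;        /* length of the text between the brackets */
    if (n == 0 || n >= sizeof out->host)
        return -1;
    memcpy(out->host, spec + 1, n);
    out->host[n] = '\0';

    if (rb[1] != ':')
        return -1;
    p = rb + 2;
    if ((v = parse_number(&p, 65535)) < 0)
        return -1;
    out->port = (int)v;

    if (*p != '/')
        return -1;
    p++;
    if ((v = parse_number(&p, INT_MAX)) < 0)
        return -1;
    out->slots = (int)v;

    return (*p == '\0' || *p == '\n') ? 0 : -1;
}
```

With the brackets in place, the colons inside the address can no longer be confused with the colon that introduces the port. The unbracketed form from the report is rejected by this sketch outright.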

GoogleCodeExporter commented 9 years ago
Just keeping notes here. On Ubuntu 9.04 the avahi zeroconf daemon is not IPv6 
enabled by default. See

/etc/avahi/avahi-daemon.conf

[server]
#host-name=foo
#domain-name=local
#browse-domains=0pointer.de, zeroconf.org
use-ipv4=yes
use-ipv6=no
#check-response-ttl=no
#use-iff-running=no
#enable-dbus=yes
#disallow-other-stacks=no
#allow-point-to-point=no

My guess at the moment is that, for some reason, distccd announces that it is
on interface eth0 with IPv4 but then puts an IPv6 address into the address
field of the zeroconf record.

Looking at line 165 of zeroconf.c in the distcc package, I see:

    if (h->address.proto == AVAHI_PROTO_INET6)
        snprintf(t, sizeof(t), "[%s]:%u/%i\n",
                 avahi_address_snprint(a, sizeof(a), &h->address),
                 h->port, d->n_slots * h->n_cpus);
    else
        snprintf(t, sizeof(t), "%s:%u/%i\n",
                 avahi_address_snprint(a, sizeof(a), &h->address),
                 h->port, d->n_slots * h->n_cpus);

If AVAHI_PROTO_INET6 is set, square brackets are used; otherwise the address is
printed without them. Given that IPv6 is disabled in avahi, I don't think we
will see AVAHI_PROTO_INET6, and therefore there will be no square brackets.

The next thing to look at is the distccd code that publishes the record. Maybe
the answer is there.

Original comment by bradphe...@gmail.com on 8 Oct 2009 at 1:28
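
One defensive alternative, sketched here as an assumption and not as the fix that went into distcc: choose the bracketed form by looking at the address text itself instead of relying on the protocol flag. An IPv4 dotted quad never contains ':', while every textual IPv6 address does. format_host_line is a hypothetical helper; the output layout mirrors the snprintf format strings quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write one hosts-file line for addr into buf.
 * Addresses containing ':' are treated as IPv6 and wrapped in brackets.
 * Returns 0 on success, -1 if the line did not fit into buf. */
static int format_host_line(char *buf, size_t buf_len,
                            const char *addr, unsigned port, int slots)
{
    int n;

    assert(buf != NULL && addr != NULL);
    if (strchr(addr, ':') != NULL)
        n = snprintf(buf, buf_len, "[%s]:%u/%i\n", addr, port, slots);
    else
        n = snprintf(buf, buf_len, "%s:%u/%i\n", addr, port, slots);

    /* snprintf returns the length it wanted to write; larger means truncation. */
    return (n >= 0 && (size_t)n < buf_len) ? 0 : -1;
}
```

This keeps the hosts file parseable even when the record's protocol field and the address family disagree, which is the situation the avahi-browse output later in this thread appears to show.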

GoogleCodeExporter commented 9 years ago
Proof positive something dodgy is going on with the daemon and address 
publishing....

brad@bradgonesurfing:~/build/distcc/trunk$sudo /etc/init.d/distcc restart && avahi-browse _distcc._tcp -rtkv

 * Restarting Distributed Compiler Daemon: distccd                          [ OK ] 
Server version: avahi 0.6.23; Host name: bradgonesurfing.local
E Ifce Prot Name                                          Type                 Domain
+ eth0 IPv4 distcc@dachstein                              _distcc._tcp         local
= eth0 IPv4 distcc@dachstein                              _distcc._tcp         local
   hostname = [dachstein.local]
   address = [fe80::219:d1ff:feb0:2f65]
   port = [3632]
   txt = ["cc_machine=i486-linux-gnu" "cc_version=4.3.3" "gnuhost=i486-pc-linux-gnu" "distcc=3.1" "cpus=2" "txtvers=1"]
: Cache exhausted
+ eth0 IPv4 distcc@bradgonesurfing                        _distcc._tcp         local
= eth0 IPv4 distcc@bradgonesurfing                        _distcc._tcp         local
   hostname = [bradgonesurfing.local]
   address = [192.168.1.4]
   port = [3632]
   txt = ["cc_machine=i486-linux-gnu" "cc_version=4.3.3" "gnuhost=i486-pc-linux-gnu" "distcc=3.1" "cpus=2" "txtvers=1"]
: All for now

Note how avahi marks the service as IPv4 on eth0 but puts an IPv6 address into
the address field. Why is that happening?

Line 84 in zeroconf-reg.c has a call to

        if (avahi_entry_group_add_service(
                    ctx->group,
                    AVAHI_IF_UNSPEC,
                    dcc_proto,
                    0,
                    ctx->name,
                    DCC_DNS_SERVICE_TYPE,
                    NULL,
                    NULL,
                    ctx->port,
                    "txtvers=1",
                    cpus,
                    "distcc="PACKAGE_VERSION,
                    "gnuhost="GNU_HOST,
                    v ? version : NULL,
                    m ? machine : NULL,
                    NULL) < 0) {

However, the address is not specified; filling it in seems to be a job for
avahi. Is this a bug in avahi, or is the above function being called with
incorrect parameters?

BTW My distcc version is

brad@bradgonesurfing:~/workspace/server$distccd --version
distccd 3.1 i486-pc-linux-gnu
  (protocols 1, 2 and 3) (default port 3632)
  built Oct  4 2009 16:29:06
Copyright (C) 2002, 2003, 2004 by Martin Pool.
Includes miniLZO (C) 1996-2002 by Markus Franz Xaver Johannes Oberhumer.
Portions Copyright (C) 2007-2008 Google.

distcc comes with ABSOLUTELY NO WARRANTY.  distcc is free software, and
you may use, modify and redistribute it under the terms of the GNU 
General Public License version 2 or later.

Built with Zeroconf support.

Please report bugs to distcc@lists.samba.org

I think I am lost at this point. Hopefully the information I've been able to 
dig up is useful to somebody in tracking this problem down.

Original comment by bradphe...@gmail.com on 8 Oct 2009 at 2:11

GoogleCodeExporter commented 9 years ago
There was a previous issue related to IPv6 which was fixed. I link it here just 
for cross-reference.

http://code.google.com/p/distcc/issues/detail?id=34

And the ifconfig for the machine broadcasting the IPV6 address is

Last login: Thu Oct  8 16:25:58 2009 from bradgonesurfing.local
brad@dachstein:~$ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:19:d1:b0:2f:65  
          inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::219:d1ff:feb0:2f65/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1150788 errors:0 dropped:15 overruns:0 frame:0
          TX packets:969064 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:1398858971 (1.3 GB)  TX bytes:688116937 (688.1 MB)
          Memory:d2200000-d2220000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:240 (240.0 B)  TX bytes:240 (240.0 B)

Original comment by bradphe...@gmail.com on 8 Oct 2009 at 2:44

GoogleCodeExporter commented 9 years ago
This thread might throw some light, but to be honest it is a bit over my head. 
There seems to be a problem with IPv6 link-local addresses and Avahi. The
address above is a link-local one, i.e. it starts with "fe80".

http://lists.freedesktop.org/archives/avahi/2007-February/000959.html

Original comment by bradphe...@gmail.com on 8 Oct 2009 at 2:56

GoogleCodeExporter commented 9 years ago
My /etc/avahi/avahi-daemon.conf contains:

--SNIP--
[server]
#host-name=foo
#domain-name=local
browse-domains=0pointer.de, zeroconf.org
use-ipv4=yes
use-ipv6=yes
#allow-interfaces=eth0
#deny-interfaces=eth1
#check-response-ttl=no
#use-iff-running=no
#enable-dbus=yes
#disallow-other-stacks=no
#allow-point-to-point=no
--SNIP--

Original comment by goo...@hayward.uk.com on 8 Oct 2009 at 3:11

GoogleCodeExporter commented 9 years ago
Currently I'm going to try to disable IPv6 in Ubuntu 9.04 and see if the 
problem goes away. This is how to do it:

http://ubuntuforums.org/showpost.php?p=7858319&postcount=38

Original comment by bradphe...@gmail.com on 8 Oct 2009 at 3:24

GoogleCodeExporter commented 9 years ago
I have been trying to get all my networks IPv6 enabled, not disabled. I need
IPv6 for test environments that run against software builds, and I was hoping
to have avahi running with IPv6, but when I enable it, it breaks my build
farm :S

Original comment by goo...@hayward.uk.com on 8 Oct 2009 at 3:47

GoogleCodeExporter commented 9 years ago
Are there any changes in relation to this issue? Disabling IPv6 on each of the 
hosts (as suggested in comment #9) helps a bit and lowers the number of
warnings during compilation, but doesn't eliminate them completely.

Moreover, it is not a good solution, as one might have quite a lot of hosts,
and disabling IPv6 on each of them can be quite annoying.

Is there anything I can do to help fix this issue?

Original comment by madkinder on 24 Jan 2010 at 9:25

GoogleCodeExporter commented 9 years ago
There are two issues as I see it. The first is that Avahi is placing IPv6 
addresses in IPv4 fields. The other is why distcc gets all of the following
IPv6 addresses correct but gets the first one wrong: for some reason it is
stripping the first four digits before the colon. Why is it not the same for
all addresses, so that they are all broken?

Original comment by goo...@hayward.uk.com on 24 Jan 2010 at 9:58
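
A possible explanation for the truncated first address, offered as an assumption rather than something confirmed in this thread: if the non-bracketed branch of the parser takes everything before the first ':' as the hostname and expects a decimal port right after it, then "fe80" is consumed as the hostname and the leftover text is exactly what the error message prints. The error also quotes the rest of the file, which suggests parsing stopped at the first bad line and the second address was never examined. The sketch below uses hypothetical helpers (split_at_first_colon, looks_like_port), not distcc's own functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy everything before the first ':' in spec into host and return a
 * pointer to the text after that ':'.
 * Returns NULL if spec has no ':' or host is too small. */
static const char *split_at_first_colon(const char *spec,
                                        char *host, size_t host_len)
{
    const char *colon;
    size_t n;

    assert(spec != NULL && host != NULL);
    colon = strchr(spec, ':');
    if (colon == NULL)
        return NULL;
    n = (size_t)(colon - spec);
    if (n >= host_len)
        return NULL;
    memcpy(host, spec, n);
    host[n] = '\0';
    return colon + 1;
}

/* The text after the ':' can only be a port if it starts with a digit. */
static int looks_like_port(const char *rest)
{
    return rest != NULL && isdigit((unsigned char)rest[0]);
}
```

For "fe80::2c0:9fff:fe3d:1737:3632/16" this yields the hostname "fe80" and the remainder ":2c0:9fff:fe3d:1737:3632/16", which does not start with a digit and so cannot be a port. For "10.0.0.179:3632/16" the same split gives a clean hostname and port, which would explain why the IPv4 entries never triggered the error.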

GoogleCodeExporter commented 9 years ago
The solutions presented in Issue 34 resolve the IPv6 errors for me. The patches 
were committed to trunk in revisions 650 and 673. I built and tested revision
717 on Ubuntu 9.04 (with IPv6 enabled) in our IPv4 networked environment, and
the errors went away. Previously, we were rendering our own workstations
unusable with -j80 compiles when the error was encountered.

As of today, the latest distcc release (version 3.1) does not include these 
changes. Oh, and I had to run 'sh configure --disable-Werror' in order to build
revision 717 successfully in my environment.

Original comment by cbe...@gmail.com on 15 Apr 2010 at 9:36

GoogleCodeExporter commented 9 years ago
As Comment 13 points out, this bug was fixed back in Issue 34, but for some 
reason got left out of the 3.1 release. Perhaps someone did a poor job merging 
branches at some point and accidentally wiped out the fix?

In any case, since this issue is not getting any response here, I have 
submitted the patch that fixes this to Ubuntu. It is now available in oneiric:

https://bugs.launchpad.net/ubuntu/+source/distcc/+bug/809534/comments/3

Original comment by taylor.j...@gmail.com on 17 Jul 2011 at 4:48

GoogleCodeExporter commented 9 years ago
The bug was fixed *after* the release of distcc 3.1.
The lack of a time machine prevented the fix from being included in the distcc 
3.1 release :)
The bug is fixed at head and will be included in the next release of distcc.
The real problem here is that it's been a long time since the last release.
Must be time to do a new release soon...

Original comment by fergus.h...@gmail.com on 28 Jul 2011 at 10:09