cncf / demo

Demo of CNCF technologies
https://cncf.io
Apache License 2.0

Get distcc compiles working #122

Open namliz opened 7 years ago

namliz commented 7 years ago

Specifically:

namliz commented 7 years ago

Arch wiki is a good resource as usual: https://wiki.archlinux.org/index.php/Distcc

The man page: https://linux.die.net/man/1/distcc

Some common sense. Opting for Debian stable.

The distcc slaves should start first so that the master can receive their addresses through the DISTCC_HOSTS/DISTCC_POTENTIAL_HOSTS environment variable in this case, without overly complicated discovery.
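As a rough illustration of the idea above, here is a minimal sketch of how a launcher could turn a list of already-running slave addresses into a DISTCC_HOSTS value. The function name, the job limit of 4, and the second IP are assumptions made for the example; only the "host/limit" entry syntax comes from distcc itself.

```python
import os


def build_distcc_hosts(worker_ips, jobs_per_host=4, include_localhost=False):
    """Join worker addresses into a DISTCC_HOSTS string.

    Each entry uses distcc's "host/limit" form; entries are space separated.
    """
    entries = [f"{ip}/{jobs_per_host}" for ip in worker_ips]
    if include_localhost:
        # Let the master compile too, after the remote hosts.
        entries.append("localhost")
    return " ".join(entries)


# Hypothetical slave pod IPs, known because the slaves were started first.
workers = ["10.244.0.6", "10.244.0.7"]
hosts = build_distcc_hosts(workers)

# Environment that would be handed to the master's build process.
master_env = dict(os.environ, DISTCC_HOSTS=hosts)
print(master_env["DISTCC_HOSTS"])
```

The resulting dictionary could be passed as `env=` to `subprocess.run` when starting the build on the master.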

Will add what else I'm missing in next comment.

namliz commented 7 years ago

distccd wants an ALLOWEDNETS setting: the networks whose clients it will accept connections from (on Debian it is passed to the daemon as --allow).

I'm going to cheat and just proceed with 10.244.0.0/16, which is right for my cluster, but this shouldn't be hard-coded. Where exactly to pull the cluster subnets from is an interesting aside that I'll add to the backlog instead of lingering on it now.
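Until the subnet question is settled, a small sanity check can at least catch a mismatch between the configured networks and the actual peer addresses. This is only a sketch: the helper name and the example addresses are made up, and reading ALLOWEDNETS from the environment with 10.244.0.0/16 as a fallback is an assumption about how the image might be wired up.

```python
import ipaddress
import os


def hosts_outside_networks(host_ips, allowed_nets):
    """Return the addresses that fall outside every network in allowed_nets.

    allowed_nets is a space-separated string of CIDR blocks or single IPs.
    """
    networks = [ipaddress.ip_network(net) for net in allowed_nets.split()]
    return [
        ip
        for ip in host_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Assumption: the value comes from the environment instead of being baked in.
allowed = os.environ.get("ALLOWEDNETS", "10.244.0.0/16")

# Hypothetical peer addresses to check against the allowed networks.
peers = ["10.244.0.6", "192.168.1.5"]
print(hosts_outside_networks(peers, allowed))
```

Any address the check reports would be one that falls outside the allowed networks, so it is worth failing fast on a non-empty result.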

namliz commented 7 years ago

https://lists.samba.org/archive/distcc/2007q4/003593.html

There was a zeroconf patch, and it looks like it has been mainlined since it was announced, because even with ZEROCONF="false" I get:

 distcc --show-hosts
distcc[639] (dcc_parse_hosts) Warning: /root/.distcc/zeroconf/hosts contained no hosts; can't distribute work
distcc[639] (dcc_zeroconf_add_hosts) CRITICAL! failed to parse host file.

While generally useful, I'd rather turn off zeroconf as it isn't necessary for our purposes. Looking for a way.

namliz commented 7 years ago

https://ubuntuforums.org/archive/index.php/t-1747376.html

It seems that there's some bug with distcc/avahi that causes this problem.

Take out "+zeroconf" out of the global hosts file (/etc/distcc/hosts) and things should work as expected.

That's correct. Now it looks in /etc/distcc/hosts, as desired. Onwards. EDIT: DISTCC_HOSTS also overrides the hosts file correctly.
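For an image build, the same fix can be scripted. The sketch below removes the "+zeroconf" token from hosts-file text while leaving comments and real host entries alone. The function name and the sample file contents are assumptions for illustration, and it works on a string so nothing under /etc is touched.

```python
def strip_zeroconf(hosts_text):
    """Remove "+zeroconf" entries from distcc hosts-file text.

    Comment and blank lines are kept as they are; a line that contained
    nothing but "+zeroconf" is dropped entirely.
    """
    kept = []
    for line in hosts_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        tokens = [tok for tok in line.split() if tok != "+zeroconf"]
        if tokens:
            kept.append(" ".join(tokens))
    return "\n".join(kept) + "\n"


# Hypothetical contents of a global hosts file before the fix.
sample = "# distcc hosts\n+zeroconf\n10.244.0.6/4 +zeroconf\n"
print(strip_zeroconf(sample), end="")
```

Writing the result back to /etc/distcc/hosts is left out on purpose; in a container image that step would likely live in the build script.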

namliz commented 7 years ago

What's left:

Replication Controller to start $(Number of Nodes - 1) slave replicas and a master distcc. Need to pass the work to the master somehow; the easy way out for the short term is just to bake a job script into the image.

Tear down

For now might not have to care about the tear down. Single shot.

namliz commented 7 years ago

A note on pump mode: apparently it is impossible to use when compiling a kernel.

It starts off trying to distribute to the slaves, then they get a "wrong result" and the master proceeds to compile locally and ignore the slaves. The slaves for some reason keep chugging along, I guess on work that has already been thrown away.

So the graphs looked right until I noticed the master had finished with a slave still chugging along at something, which should be impossible. The only giveaway is a blink-and-you-miss-it note in the distcc log:

distcc[8235] ERROR: compile arch/x86/kernel/asm-offsets.c on 10.244.0.6,cpp,lzo failed
distcc[8235] (dcc_build_somewhere) Warning: remote compilation of 'arch/x86/kernel/asm-offsets.c' failed, retrying locally
distcc[8235] Warning: failed to distribute arch/x86/kernel/asm-offsets.c to 10.244.0.6,cpp,lzo, running locally instead
distcc[8235] (dcc_please_send_email_after_investigation) Warning: remote compilation of 'arch/x86/kernel/asm-offsets.c' failed, retried locally and got a different result.
distcc[8235] (dcc_please_send_email_after_investigation) Warning: file 'include/generated/autoconf.h', a dependency of arch/x86/kernel/asm-offsets.c, changed during the build
distcc[8235] (dcc_note_discrepancy) Warning: now using plain distcc, possibly due to inconsistent file system changes during build

So in theory pump mode would give a nice 30%-50% speed boost, but I haven't found anybody who has successfully used it with the kernel. If you're reading this and have any insight, do get in touch.
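Since the fallback is so easy to miss, a small log scan can flag it after a run. This is a sketch only: the function name is made up, and the two patterns are taken directly from the warning lines quoted in the log excerpt above, so other distcc versions may word them differently.

```python
import re

# Patterns copied from the warnings in the log excerpt above.
REMOTE_FAILED = re.compile(r"remote compilation of '([^']+)' failed, retrying locally")
PUMP_DISABLED = re.compile(r"\(dcc_note_discrepancy\) Warning: now using plain distcc")


def summarize_fallbacks(log_lines):
    """Return (files retried locally, whether pump mode was abandoned)."""
    failed_files = []
    pump_disabled = False
    for line in log_lines:
        match = REMOTE_FAILED.search(line)
        if match:
            failed_files.append(match.group(1))
        if PUMP_DISABLED.search(line):
            pump_disabled = True
    return failed_files, pump_disabled


log = [
    "distcc[8235] (dcc_build_somewhere) Warning: remote compilation of "
    "'arch/x86/kernel/asm-offsets.c' failed, retrying locally",
    "distcc[8235] (dcc_note_discrepancy) Warning: now using plain distcc, "
    "possibly due to inconsistent file system changes during build",
]
files, fell_back = summarize_fallbacks(log)
print(files, fell_back)
```

A job script could run this over the master's distcc log and mark the run as invalid whenever the second value is true, instead of relying on the graphs.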