mozilla / sccache

Sccache is a ccache-like tool. It is used as a compiler wrapper and avoids compilation when possible. Sccache has the capability to utilize caching in remote storage environments, including various cloud storage options, or alternatively, in local storage.
Apache License 2.0

`-B` is not used during compiler resolution #1102

Open indygreg opened 2 years ago

indygreg commented 2 years ago

I was attempting to use sccache when bootstrapping GCC:

CC="sccache /usr/bin/gcc" \
CXX="sccache /usr/bin/g++" \
./configure
...
make STAGE_CC_WRAPPER=sccache

(Note: GCC configure doesn't check for or retain STAGE_CC_WRAPPER: only Makefile does.)

If you attempt this, the build eventually fails when it tries to use the just-built compiler for the subsequent stage:

sccache /home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc/xgcc -B/home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc/ -dumpspecs > tmp-specs
sccache: error: failed to execute compile
sccache: caused by: Compiler not supported: "xgcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory\ncompilation terminated.\n"

Indeed, we can reproduce outside the context of make:

sccache /home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc/xgcc -B/home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc/
sccache: error: failed to execute compile
sccache: caused by: Compiler not supported: "xgcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory\ncompilation terminated.\n"

But, if we supplement PATH to include the value specified via -B, it works!

PATH=/home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc:$PATH sccache /home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc/xgcc -B/home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc/ --help
Usage: xgcc [options] file...
Options:
  -pass-exit-codes         Exit with highest error code from a phase.
  --help                   Display this information.
  --target-help            Display target specific command line options.
  --help={common|optimizers|params|target|warnings|[^]{joined|separate|undocumented}}[,...].
...

I think the problem here is detect_c_compiler() invokes <compiler> -E to determine the compiler type. But this specific compiler needs -B to tell it where binaries like cc1 reside:

host-x86_64-pc-linux-gnu/gcc/xgcc -E foo.c
xgcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.

host-x86_64-pc-linux-gnu/gcc/xgcc -B host-x86_64-pc-linux-gnu/gcc -E foo.c
cc1: fatal error: foo.c: No such file or directory
compilation terminated.

(Admittedly the behavior of xgcc is a bit wonky here but that's what GCC's bootstrap does.)

Should sccache make an attempt to proxy -B into detect_c_compiler()?

If proxying -B is too hard, should the <compiler> -E invocation to resolve its kind perhaps set an environment variable like PATH to give the compiler a better chance of finding its peer binaries?

Should -B arguments influence the compiler uniqueness / info caching, since they can effectively point the compiler driver at an alternate set of binaries like cc1? (Admittedly it seems a bit wonky to use different -B values to change the sccache identity/flavor/kind of the compiler by using a shared driver for different backends. But it is theoretically possible.)
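To make "proxying -B" a bit more concrete, here is a minimal sketch in shell of pulling the -B directories out of a compiler command line. This is only an illustration of the idea, not sccache's actual (Rust) argument parsing, and b_dirs is a hypothetical helper name. It handles both spellings that appear in this report: joined (-B/dir) and separate (-B dir).

```shell
#!/bin/sh
# Hypothetical helper: print the directory named by each -B option,
# one per line, for the command line passed as arguments.
b_dirs() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -B)
        # Separate form: the directory is the next argument, if there is one.
        if [ "$#" -ge 2 ]; then
          printf '%s\n' "$2"
          shift
        fi
        ;;
      -B?*)
        # Joined form: strip the leading "-B".
        printf '%s\n' "${1#-B}"
        ;;
    esac
    shift
  done
}

# Example: list the -B directories from the failing invocation.
b_dirs xgcc -B/home/gps/src/gcc/host-x86_64-pc-linux-gnu/gcc/ -dumpspecs
```

The directories it prints could then either be forwarded as -B arguments to the <compiler> -E probe or prepended to PATH for that probe, which are the two options asked about above.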

indygreg commented 2 years ago

I tried to work around this by installing a shell script wrapper for sccache that ensures PATH contains the intermediate GCC compiler directory. e.g.

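#!/bin/sh
set -e

# Put the directory containing the compiler ($1) first on PATH
# so the driver has a chance of finding cc1.
path=$(dirname "$1")
export PATH="$path:$PATH"
exec /usr/bin/sccache "$@"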

However, this still fails with the same failure:

sccache-wrapper.sh /build/gcc-objdir10/./gcc/xgcc -B/build/gcc-objdir10/./gcc/ -dumpspecs > tmp-specs
sccache: error: failed to execute compile
sccache: caused by: Compiler not supported: "xgcc: fatal error: cannot execute \'cc1\': execvp: No such file or directory\ncompilation terminated.\n"

If I run sccache-wrapper.sh /build/gcc-objdir10/./gcc/xgcc -E testfile.c (where testfile.c is the source for compiler sniffing) it works though. I'm not sure what's going on. I tried clearing the compiler cache to make sure the error wasn't being cached, but that made no difference.

I'm running 0.2.15 if that matters.

yuanfang-chen commented 2 years ago

Hmm, interesting. The second stage build of bootstrapping does not make much sense for caching since the compiler version is part of the hash key, xgcc is technically a distinct compiler that could not get a cache hit. Distribution makes sense though.

indygreg commented 2 years ago

> Hmm, interesting. The second stage build of bootstrapping does not make much sense for caching since the compiler version is part of the hash key, xgcc is technically a distinct compiler that could not get a cache hit.

My goal here is to speed up repeated rebuilds / bootstrapping of the GCC compiler toolchain. I was naively assuming that if the build environment were mostly deterministic and reproducible (which I've ensured via strategic use of containers) that sccache would be smart enough to key cache entries from the stage 1 and stage 2 compilers and caching wouldn't interfere with the bootstrap results. Is this not the case?

The Plot Thickens

I do not encounter this failure when building with make -j1!

If I look at the build logs, we clearly see a make -j1 materialize cc1 before xgcc and before the xgcc -dumpspecs invocation:

sccache /usr/bin/g++    -g -O2 -DIN_GCC     -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc  -o cc1 c/c-lang.o c-family/stub-objc.o attribs.o c/c-errors.o c/c-decl.o c/c-typeck.o c/c-convert.o c/c-aux-info.o c/c-objc-common.o c/c-parser.o c/c-fold.o c/gimple-parser.o c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-indentation.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o c-family/c-ubsan.o c-family/known-headers.o c-family/c-attribs.o c-family/c-warn.o c-family/c-spellcheck.o i386-c.o glibc-c.o \
...
sccache /usr/bin/g++    -g -O2 -DIN_GCC     -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc  -o xgcc gcc.o gcc-main.o ggc-none.o
...
sccache-wrapper.sh /build/gcc-objdir10/./gcc/xgcc -B/build/gcc-objdir10/./gcc/ -dumpspecs > tmp-specs

But, if I build with -j64, I see xgcc materializing before cc1:

sccache /usr/bin/g++    -g -O2 -DIN_GCC     -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc  -o xgcc gcc.o gcc-main.o ggc-none.o \
...
sccache-wrapper.sh /build/gcc-objdir10/./gcc/xgcc -B/build/gcc-objdir10/./gcc/ -dumpspecs > tmp-specs
...
sccache /usr/bin/g++    -g -O2 -DIN_GCC     -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc  -o cc1 c/c-lang.o c-family/stub-objc.o attribs.o c/c-errors.o c/c-decl.o c/c-typeck.o c/c-convert.o c/c-aux-info.o c/c-objc-common.o c/c-parser.o c/c-fold.o c/gimple-parser.o c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-indentation.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o c-family/c-ubsan.o c-family/known-headers.o c-family/c-attribs.o c-family/c-warn.o c-family/c-spellcheck.o i386-c.o glibc-c.o \

So when building in parallel there is a race between xgcc -dumpspecs being invoked and the support binaries like cc1 coming into existence. Due to the way GNU Make implicitly prioritizes targets based on file / evaluation order, a -j1 build just happens to work despite xgcc not depending on cc1. This normally wouldn't matter from the perspective of GCC's build system, except that sccache invokes <compiler> -E the first time a given compiler is invoked, and this invocation requires those support binaries!

I confirmed this race by changing my sccache wrapper shell script to the following:

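#!/bin/sh

set -e

# Directory containing the compiler driver ($1) and, once built, cc1.
dir=$(dirname "$1")
cc1=${dir}/cc1

if [ -e "${cc1}" ]; then
  # cc1 exists: let sccache probe the compiler and cache the compile.
  export PATH="${dir}:${PATH}"
  exec /toolchains/bin/sccache "$@"
else
  # cc1 not built yet: log the call and run the compiler without sccache.
  echo "no $cc1; $*" >> ~/sccache-debug.log
  exec "$@"
fi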

Pointing STAGE_CC_WRAPPER at this wrapper script results in a working bootstrap build with the following entries in the debug log:

no /build/gcc-objdir/./gcc/cc1; /build/gcc-objdir/./gcc/xgcc -B/build/gcc-objdir/./gcc/ -dumpspecs
no /build/gcc-objdir/./gcc/cc1; /build/gcc-objdir/./gcc/xgcc -B/build/gcc-objdir/./gcc/ -print-sysroot-headers-suffix
no /build/gcc-objdir/./gcc/cc1; /build/gcc-objdir/./gcc/xgcc -B/build/gcc-objdir/./gcc/ -dumpspecs
no /build/gcc-objdir/./gcc/cc1; /build/gcc-objdir/./gcc/xgcc -B/build/gcc-objdir/./gcc/ -print-sysroot-headers-suffix
no /build/gcc-objdir/./gcc/cc1; /build/gcc-objdir/./gcc/xgcc -B/build/gcc-objdir/./gcc/ -dumpspecs
no /build/gcc-objdir/./gcc/cc1; /build/gcc-objdir/./gcc/xgcc -B/build/gcc-objdir/./gcc/ -print-sysroot-headers-suffix

Note that the race between xgcc being invoked and cc1 coming into existence is a supplemental issue: the failure to handle -B when sccache invokes xgcc -E still exists. I can confirm this because if I delete the PATH modifications from my wrapper script, I still see the failures due to missing cc1 even though cc1 exists. That's presumably because xgcc doesn't look for support binaries in its own directory and relies on those -B values.

I could presumably fix the race condition by modifying gcc/Makefile.in so xgcc depends on support binaries like cc1. But that's a >4k line makefile and I don't feel like getting my hands that dirty. It would be tempting to send that patch upstream, but GCC developers may reject it: their build system today is sound in isolation, because xgcc -dumpspecs doesn't itself have a dependency on cc1. However, the benefits of using sccache during a 3 stage bootstrap are significant. Here are some stats on the initial 3 stage bootstrap with sccache working:

Compile requests                  26142
Compile requests executed         18656
Cache hits                         1898
Cache hits (C/C++)                 1898
Cache misses                      15962
Cache misses (C/C++)              15962
Cache timeouts                        0
Cache read errors                     0
Forced recaches                       0
Cache write errors                   48
Compilation failures                624
Cache errors                        172
Cache errors (C/C++)                172
Non-cacheable compilations            0
Non-cacheable calls                2482
Non-compilation calls              5004
Unsupported compiler calls            0
Average cache write               0.149 s
Average cache read miss           1.204 s
Average cache read hit            0.056 s
Failed distributed compilations       0

Non-cacheable reasons:
-E                                 1462
unknown source language             617
argument parse                      241
-x                                  162

And on a subsequent build:

Compile requests                  26142
Compile requests executed         18656
Cache hits                        17849
Cache hits (C/C++)                17849
Cache misses                         11
Cache misses (C/C++)                 11
Cache timeouts                        0
Cache read errors                     0
Forced recaches                       0
Cache write errors                    0
Compilation failures                624
Cache errors                        172
Cache errors (C/C++)                172
Non-cacheable compilations            0
Non-cacheable calls                2482
Non-compilation calls              5004
Unsupported compiler calls            0
Average cache write               0.107 s
Average cache read miss           0.196 s
Average cache read hit            0.159 s
Failed distributed compilations       0

Non-cacheable reasons:
-E                                 1462
unknown source language             617
argument parse                      241
-x                                  162

If my grep command is accurate, 19,389 of those invocations are from sccache-wrapper.sh, which is only installed via STAGE_CC_WRAPPER (sccache is used directly elsewhere). That's a lot of CPU time saved by sccache!
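To put the two stat dumps above side by side, the hit rate among cacheable compiles can be computed from the "Cache hits" and "Cache misses" rows. A small sketch follows; hit_rate is just an illustrative helper name, not something sccache provides.

```shell
#!/bin/sh
# Hypothetical helper: hit_rate HITS MISSES
# Prints the percentage of cacheable compiles that were served from cache.
hit_rate() {
  awk -v h="$1" -v m="$2" 'BEGIN { printf "%.1f\n", 100 * h / (h + m) }'
}

hit_rate 1898 15962   # initial 3 stage bootstrap
hit_rate 17849 11     # subsequent build
```

That works out to roughly 10.6% on the initial bootstrap and 99.9% on the rebuild.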

yuanfang-chen commented 2 years ago

Yeah, if you need to bootstrap a specific commit of GCC several times, caching should help. Despite that, -B not being considered during the compiler check is somewhat worrying, since the detected compiler version could be wrong (ccache has more choices in this regard with CCACHE_COMPILERCHECK). I'll see if I can come up with a patch.

sccache looks at the expanded __VERSION__ macro to know the compiler version. It is not surprising that xgcc may consult cc1 for expanding the macro.

In the long run, it is better to implement CCACHE_COMPILERCHECK for sccache, but that's a separate issue.
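For anyone who wants to see what such a probe returns for a given driver and -B value, here is a minimal sketch in shell. It only approximates the idea (preprocess a file containing __VERSION__ and read back the expansion); sccache's actual detection code is more involved, and probe_version is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: probe_version COMPILER [ARGS...]
# Preprocesses a file containing only __VERSION__ with the given
# compiler and arguments, then prints the expanded value.
probe_version() {
  tmpdir=$(mktemp -d) || return 1
  printf '__VERSION__\n' > "$tmpdir/a.c"
  # Drop linemarker lines ("# 1 ...") and blank lines, then keep
  # the last remaining line, which should be the expansion.
  "$@" -E "$tmpdir/a.c" 2>/dev/null \
    | grep -v '^#' \
    | grep -v '^[[:space:]]*$' \
    | tail -n 1
  rm -rf "$tmpdir"
}
```

Calling it once as probe_version /path/to/driver and once as probe_version /path/to/driver -B /some/dir shows whether the -B value changes the version the probe sees.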

yuanfang-chen commented 2 years ago

Confirmed that the detected compiler version can be wrong when -B is involved. On my Ubuntu machine with both gcc-10 and gcc-7 installed:

➜  /tmp cat a.c
__VERSION__
➜  /tmp /usr/bin/x86_64-linux-gnu-gcc-10 -B /usr/lib/gcc/x86_64-linux-gnu/7 -E a.c
# 1 "a.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "a.c"
"7.5.0"
➜  /tmp /usr/bin/x86_64-linux-gnu-gcc-10 -E a.c
# 1 "a.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "a.c"
"10.3.0"