mozilla / sccache

Sccache is a ccache-like tool. It is used as a compiler wrapper and avoids compilation when possible. Sccache has the capability to utilize caching in remote storage environments, including various cloud storage options, or alternatively, in local storage.
Apache License 2.0

Wrong target architecture detected on arm64 macOS #2093

Open blaztinn opened 7 months ago

blaztinn commented 7 months ago

On macOS, clang and make are fat (universal) binaries that contain both arm64 and x86_64 slices. You can select which architecture to run with the arch command, and this is where sccache fails to detect the right target architecture when caching compiled artifacts: it always returns the artifacts for the architecture you compiled with first (probably because it does not hash the (right) target architecture?).

A quick way to reproduce this on arm64 Apple HW:

  1. Create a simple main.c (int main(){})
  2. Create a makefile:
    build:
        sccache -- clang -c main.c -o main-with-sccache
        clang -c main.c -o main-without-sccache
  3. Run make build:
    $ file main-*
    main-with-sccache:    Mach-O 64-bit object arm64
    main-without-sccache: Mach-O 64-bit object arm64
  4. Run arch -x86_64 make build:
    $ file main-*
    main-with-sccache:    Mach-O 64-bit object arm64
    main-without-sccache: Mach-O 64-bit object x86_64

    As you can see, sccache cached the arm64 object file on the first compilation and returned it when the x86_64 compiler was run to produce an x86_64 object file. Conversely, if we had first compiled for x86_64 with an empty cache, we would always get an x86_64 object file.

This example needs the make command because the sccache I downloaded is an arm64-only binary, so calling arch -x86_64 sccache directly fails.

Note: Putting sccache -- clang --version inside the makefile and running it with arch -x86_64 make reports that clang is an x86_64 binary, so the arm64-only sccache binary does forward commands to the compiler slice of the right architecture.

glandium commented 7 months ago

TL;DR: Add -arch x86_64 to your CFLAGS instead of running the entire build as x86_64. There are too many things that can go wrong with the way you're doing it (plus, it's slower to build the way you do).

Full version: sccache can only really work with what it can know. What it can know is:

blaztinn commented 7 months ago

Thank you for the detailed answer, I really appreciate it!

I will use the native compiler and set the target arch or triple, thank you for pointing that out. This solves my issue and will hopefully also make compilation faster (because there is no emulation, as you said).

But I still think this is a bug in sccache, because sccache clang <args> should always produce the same output as running clang <args> directly.

I think the problem is that sccache uses the whole fat clang binary to identify the compiler, while only one architecture slice of that binary is ever actually run. This is why I think sccache should hash both the binary and the target arch that is being used.
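To illustrate the idea, here is a minimal sketch of a cache key that mixes in the effective target alongside the compiler binary. It is not sccache's actual key derivation: the KeyInputs struct, its field names, and the cache_key function are hypothetical, and std's DefaultHasher stands in for whatever digest sccache really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical cache-key inputs (illustration only, not sccache's real types).
struct KeyInputs<'a> {
    /// Digest of the whole (fat) compiler binary on disk.
    compiler_digest: &'a str,
    /// Effective default target, e.g. the output of `clang -dumpmachine`.
    target: &'a str,
    /// Compiler arguments as passed on the command line.
    arguments: &'a [&'a str],
    /// Preprocessed source text.
    preprocessed_source: &'a str,
}

/// Hashes every input, including the target, into one key.
/// DefaultHasher is used only to keep the sketch dependency-free.
fn cache_key(inputs: &KeyInputs) -> u64 {
    let mut hasher = DefaultHasher::new();
    inputs.compiler_digest.hash(&mut hasher);
    inputs.target.hash(&mut hasher);
    inputs.arguments.hash(&mut hasher);
    inputs.preprocessed_source.hash(&mut hasher);
    hasher.finish()
}
```

With a key like this, the same fat binary invoked with identical arguments would map to different cache entries for arm64-apple-darwin23.3.0 and x86_64-apple-darwin23.3.0, because the target string differs.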

Luckily the bug only shows when mixing the usage of sccache from both arm64 and x86_64 environments, so it is hard to trigger in practice.

Side note: I also wonder whether clang has a hardcoded default target triple, or whether it depends on the environment it is run in (ignoring the flags passed to the compiler).

Subprocesses

To show that arch -x86_64 propagates to subprocesses, I've added the following lines to main() in sccache:

    {
        let output =
            Command::new("uname")
                .arg("-m")
                .output()
                .expect("failed to execute process")
                .stdout;

        println!("uname -m: {0}", String::from_utf8(output).unwrap());
    }

    {
        let output =
            Command::new("arch")
                .output()
                .expect("failed to execute process")
                .stdout;

        println!("arch: {0}", String::from_utf8(output).unwrap());
    }

    {
        let output =
            Command::new("clang")
                .arg("-dumpmachine")
                .output()
                .expect("failed to execute process")
                .stdout;

        println!("clang -dumpmachine: {0}", String::from_utf8(output).unwrap());
    }

Running ./sccache -V through make I get:

blaz.tomazic@Blazs-MacBook-Pro sccache-example % file sccache 
sccache: Mach-O 64-bit executable arm64
blaz.tomazic@Blazs-MacBook-Pro sccache-example % make version-sccache             
./sccache -V
uname -m: arm64

arch: arm64
clang -dumpmachine: arm64-apple-darwin23.3.0

sccache 0.7.7
blaz.tomazic@Blazs-MacBook-Pro sccache-example % arch -x86_64 make version-sccache
./sccache -V
uname -m: x86_64

arch: i386
clang -dumpmachine: x86_64-apple-darwin23.3.0

sccache 0.7.7

I don't know how the OS decides which arch of a binary to spawn in a subprocess. Dumping sysctl values and env vars didn't show any differences. Also, sysctl.proc_translated was 0 inside sccache (https://developer.apple.com/documentation/apple-silicon/about-the-rosetta-translation-environment#Determine-Whether-Your-App-Is-Running-as-a-Translated-Binary). I guess that is expected, since we are running an arm64 sccache anyway.
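For reference, a check like that can be sketched as follows. This is an assumption about how one might do it from Rust, not what sccache does: it shells out to sysctl -n sysctl.proc_translated rather than calling sysctlbyname directly, and the helper names parse_proc_translated and is_translated are made up for this example.

```rust
use std::process::Command;

/// Interprets the text printed by `sysctl -n sysctl.proc_translated`:
/// "0" means the process runs natively, "1" means it is translated by Rosetta.
/// Anything else is treated as unknown.
fn parse_proc_translated(output: &str) -> Option<bool> {
    match output.trim() {
        "0" => Some(false),
        "1" => Some(true),
        _ => None,
    }
}

/// Queries the sysctl through the `sysctl` command-line tool.
/// Returns None if the tool is missing, fails, or prints something unexpected
/// (for example on systems where this sysctl does not exist).
fn is_translated() -> Option<bool> {
    let output = Command::new("sysctl")
        .args(["-n", "sysctl.proc_translated"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    parse_proc_translated(&String::from_utf8_lossy(&output.stdout))
}
```

Note that the value describes the calling process, so from inside an arm64-only sccache it says nothing about which slice a fat child binary will run as.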

I did notice crossarch_trap in the dtruss output though; this is probably how the OS selects the subprocess arch:

blaz.tomazic@Blazs-MacBook-Pro sccache-example % sudo dtruss -a ./sccache
    PID/THRD  RELATIVE  ELAPSD    CPU SYSCALL(args)          = return
 6034/0x13019:       142:        0:       0 fork()       = 0 0
 6034/0x13019:       121      29 18446744073709530 munmap(0x109A18000, 0x98000)      = 0 0
 6034/0x13019:       123       2      1 munmap(0x109AB0000, 0x8000)      = 0 0
 6034/0x13019:       125       1      1 munmap(0x109AB8000, 0x4000)      = 0 0
 6034/0x13019:       126       1      1 munmap(0x109ABC000, 0x4000)      = 0 0
 6034/0x13019:       127       3      0 munmap(0x109AC0000, 0x5C000)         = 0 0
 6034/0x13019:       141       1      0 crossarch_trap(0x0, 0x0, 0x0)        = -1 Err#45
 6034/0x13019:       154      18     11 open(".\0", 0x100000, 0x0)       = 3 0
 6034/0x13019:       157       2      1 fcntl(0x3, 0x32, 0x16B1232A8)        = 0 0
 6034/0x13019:       158       2      1 close(0x3)       = 0 0
 6034/0x13019:       163       6      3 fsgetpath(0x16B1232B8, 0x400, 0x16B123298)       = 53 0
 6034/0x13019:       176      13     12 fsgetpath(0x16B1232C8, 0x400, 0x16B1232A8)       = 14 0
 6034/0x13019:       180       1 18446744073709550 csrctl(0x0, 0x16B1236CC, 0x4)         = -1 Err#1
 6034/0x13019:       184       4      2 __mac_syscall(0x188A12948, 0x2, 0x16B123610)         = 0 0
 6034/0x13019:       185       0      0 csrctl(0x0, 0x16B1236EC, 0x4)        = -1 Err#1
 6034/0x13019:       188       3      2 __mac_syscall(0x188A0F79E, 0x5A, 0x16B123680)        = 0 0
 6034/0x13019:       196       6      0 sysctl([unknown, 3, 0, 0, 0, 0] (2), 0x16B122BE8, 0x16B122BE0, 0x188A113EF, 0xD)         = 0 0
 6034/0x13019:       197       1      0 sysctl([CTL_KERN, 137, 0, 0, 0, 0] (2), 0x16B122C98, 0x16B122C90, 0x0, 0x0)      = 0 0
 6034/0x13019:       212      10     10 open("/\0", 0x20100000, 0x0)         = 3 0
 6034/0x13019:       221      20      7 openat(0x3, "System/Cryptexes/OS\0", 0x100000, 0x0)      = 4 0
 6034/0x13019:       222       1      0 dup(0x4, 0x0, 0x0)       = 5 0
 6034/0x13019:       227       5      4 fstatat64(0x4, 0x16B122771, 0x16B1226E0)         = 0 0
 6034/0x13019:       232       9      4 openat(0x4, "System/Library/dyld/\0", 0x100000, 0x0)         = 6 0
 6034/0x13019:       233      24      0 fcntl(0x6, 0x32, 0x16B122770)        = 0 0
 6034/0x13019:       234       0      0 dup(0x6, 0x0, 0x0)       = 7 0
 6034/0x13019:       234       0      0 dup(0x5, 0x0, 0x0)       = 8 0
 6034/0x13019:       235       1      0 close(0x3)       = 0 0
 6034/0x13019:       235       0      0 close(0x5)       = 0 0
 6034/0x13019:       235       0      0 close(0x4)       = 0 0
 6034/0x13019:       236       0      0 close(0x6)       = 0 0
 6034/0x13019:       238       2      1 shared_region_check_np(0x16B122D80, 0x0, 0x0)        = 0 0
 6034/0x13019:       241       2      2 fsgetpath(0x16B1232D0, 0x400, 0x16B123228)       = 82 0
 6034/0x13019:       242       0      0 fcntl(0x8, 0x32, 0x16B1232D0)        = 0 0
 6034/0x13019:       242       0      0 close(0x8)       = 0 0
 6034/0x13019:       241       0      0 close(0x7)       = 0 0
 6034/0x13019:       248       3      2 getfsstat64(0x0, 0x0, 0x2)       = 11 0
 6034/0x13019:       255       7      7 getfsstat64(0x109C8EAA0, 0x5D28, 0x2)        = 11 0
 6034/0x13019:       260       5      4 getattrlist("/\0", 0x16B123200, 0x16B123170)         = 0 0
 6034/0x13019:       267       7      3 stat64("/System/Volumes/Preboot/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e\0", 0x16B123560, 0x0)      = 0 0
uname -m: arm64

arch: arm64
clang -dumpmachine: arm64-apple-darwin23.3.0

But I have no clue how to detect that aside from running a subprocess (like clang -dumpmachine) and checking the output.
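A rough sketch of that subprocess approach, assuming the compiler accepts -dumpmachine and prints a triple such as arm64-apple-darwin23.3.0 as in the output above. The helper names arch_from_triple and default_target are hypothetical and not part of sccache.

```rust
use std::io;
use std::process::Command;

/// Returns the architecture component of a target triple,
/// i.e. the text before the first '-'. Empty input yields None.
fn arch_from_triple(triple: &str) -> Option<&str> {
    let arch = triple.trim().split('-').next()?;
    if arch.is_empty() {
        None
    } else {
        Some(arch)
    }
}

/// Runs `<compiler> -dumpmachine` and returns the trimmed triple it prints.
/// The child inherits the architecture preference of the calling environment,
/// so under `arch -x86_64` a fat clang reports an x86_64 triple.
fn default_target(compiler: &str) -> io::Result<String> {
    let output = Command::new(compiler).arg("-dumpmachine").output()?;
    Ok(String::from_utf8_lossy(&output.stdout).trim().to_string())
}
```

Calling default_target("clang") and passing the result to arch_from_triple would give the arch string that could be mixed into the hash, at the cost of one extra compiler invocation per detected compiler.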

Caching

I've also modified the makefile to show that the first compilation is always reused:

clean-cache:
        ./sccache --stop-server || true
        rm -rf /Users/blaz.tomazic/Library/Caches/Mozilla.sccache
version:
        clang --version
        ./sccache -- clang --version
build:
        clang -c main.c -o main-clang
        ./sccache -- clang -c main.c -o main-sccache
        file main-*

The output from commands is:

blaz.tomazic@Blazs-MacBook-Pro sccache-example % make clean-cache
./sccache --stop-server || true
Stopping sccache server...
sccache: error: couldn't connect to server
sccache: caused by: Connection refused (os error 61)
rm -rf /Users/blaz.tomazic/Library/Caches/Mozilla.sccache
blaz.tomazic@Blazs-MacBook-Pro sccache-example % make build             
clang -c main.c -o main-clang
./sccache -- clang -c main.c -o main-sccache
file main-*
main-clang:   Mach-O 64-bit object arm64
main-sccache: Mach-O 64-bit object arm64
blaz.tomazic@Blazs-MacBook-Pro sccache-example % arch -x86_64 make build
clang -c main.c -o main-clang
./sccache -- clang -c main.c -o main-sccache
file main-*
main-clang:   Mach-O 64-bit object x86_64
main-sccache: Mach-O 64-bit object arm64
blaz.tomazic@Blazs-MacBook-Pro sccache-example % make clean-cache       
./sccache --stop-server || true
Stopping sccache server...
Compile requests                      2
Compile requests executed             2
Cache hits                            1
Cache hits (C/C++)                    1
Cache misses                          1
Cache misses (C/C++)                  1
Cache timeouts                        0
Cache read errors                     0
Forced recaches                       0
Cache write errors                    0
Compilation failures                  0
Cache errors                          0
Non-cacheable compilations            0
Non-cacheable calls                   0
Non-compilation calls                 0
Unsupported compiler calls            0
Average cache write               0.000 s
Average compiler                  0.019 s
Average cache read hit            0.000 s
Failed distributed compilations       0
Cache location                  Local disk: "/Users/blaz.tomazic/Library/Caches/Mozilla.sccache"
Use direct/preprocessor mode?   yes
Version (client)                0.7.7
Cache size                          332 bytes
Max cache size                       10 GiB
rm -rf /Users/blaz.tomazic/Library/Caches/Mozilla.sccache
blaz.tomazic@Blazs-MacBook-Pro sccache-example % arch -x86_64 make build
clang -c main.c -o main-clang
./sccache -- clang -c main.c -o main-sccache
file main-*
main-clang:   Mach-O 64-bit object x86_64
main-sccache: Mach-O 64-bit object x86_64
blaz.tomazic@Blazs-MacBook-Pro sccache-example % make build             
clang -c main.c -o main-clang
./sccache -- clang -c main.c -o main-sccache
file main-*
main-clang:   Mach-O 64-bit object arm64
main-sccache: Mach-O 64-bit object x86_64