BLAKE3-team / BLAKE3

the official Rust and C implementations of the BLAKE3 cryptographic hash function
Apache License 2.0

[Bug Report] BLAKE3 Fails To Compile On Linux For No Apparent Reason #427

Open CalinZBaenen opened 3 days ago

CalinZBaenen commented 3 days ago

So, I've been using the Bevy game engine to work on some fun game ideas I've had, but after coming back from a long break from programming with Bevy, I found that I could not compile it due to the following error:

  error occurred: Command "cc" "-O0" "-ffunction-sections" "-fdata-sections" "-fPIC" "-gdwarf-4" "-fno-omit-frame-pointer" "-m64" "-Wall" "-Wextra" "-std=c11" "-o" "/home/calin/Projects/Vessie/target/debug/build/blake3-10ff633bb8fa3e2c/out/db3b6bfb95261072-blake3_sse2_x86-64_unix.o" "-c" "c/blake3_sse2_x86-64_unix.S" with args cc did not execute successfully (status code exit status: 1).

I have not experienced this issue before, and the last project I successfully built with Bevy was on the version that introduced BLAKE3. Someone recommended running cargo clean and then rebuilding, but that did not work.

I am using Ubuntu 22.04.4 with Wayland, natively, and I am compiling for the same system.
Here is a shared Google Drive folder with the very primitive code I have, so you can try to replicate the bug: https://drive.google.com/drive/folders/1zx6BmUN7HQKXIgMKCH_F84uPCbJoVpmD.

Please let me know if any more information is needed.

oconnor663 commented 3 days ago

I'm not able to reproduce the error you're describing from the files you shared in the Google Drive folder. (I see errors like "cannot find type Scene in this scope" in your read.rs, but those don't seem relevant.) I'm on x86_64 Arch Linux running Rust 1.81.

You included the command that failed above, but you didn't include the error output of that command. Can you repeat the build and see if you can spot the error output? If not, can you include a complete capture of all the build output, so that I can try to spot it myself?

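Other things that we might need to know:

  • Your Rust version.
  • Your C compiler version.
  • Your CPU architecture.
  • Any relevant environment variables you might have set, like CC.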

oconnor663 commented 3 days ago

Another point to clarify: Is there anything specific about your own code that's related to the build failure? Or can you trigger the failure purely by building bevy? Specifically, can you trigger the failure this way:

$ cargo new example
...
$ cd example
$ cargo add bevy
...
$ cargo build
...

When I do those steps on my machine, it takes quite a long time to build :) but I don't see any errors.

CalinZBaenen commented 2 days ago

You included the command that failed above, but you didn't include the error output of that command.

Is that not what's in the code block I provided? There wasn't any information following that chunk of text.

The preceding information, with CARGO_PROFILE_DEV_BUILD_OVERRIDE_DEBUG set to true – as recommended by the compiler – looks like this:

The following warnings were emitted during compilation:

warning: blake3@1.5.4: Compiler family detection failed due to error: ToolExecError: Command "cc" "-E" "/home/calin/Projects/Vessie/target/debug/build/blake3-75236400a1f88d00/out/16036542310283992667detect_compiler_family.c" with args cc did not execute successfully (status code exit status: 1).
warning: blake3@1.5.4: Compiler family detection failed due to error: ToolExecError: Command "cc" "-E" "/home/calin/Projects/Vessie/target/debug/build/blake3-75236400a1f88d00/out/5889472015475158423detect_compiler_family.c" with args cc did not execute successfully (status code exit status: 1).
warning: blake3@1.5.4: Compiler family detection failed due to error: ToolExecError: Command "cc" "-E" "/home/calin/Projects/Vessie/target/debug/build/blake3-75236400a1f88d00/out/12487841639296948905detect_compiler_family.c" with args cc did not execute successfully (status code exit status: 1).
warning: blake3@1.5.4: The C compiler "cc" does not support -mavx512f and -mavx512vl.
warning: blake3@1.5.4: Compiler family detection failed due to error: ToolExecError: Command "cc" "-E" "/home/calin/Projects/Vessie/target/debug/build/blake3-75236400a1f88d00/out/3418268258640279083detect_compiler_family.c" with args cc did not execute successfully (status code exit status: 1).
warning: blake3@1.5.4: cc1: error: /usr/local/include/x86_64-linux-gnu: Permission denied
warning: blake3@1.5.4: cc1: error: /usr/local/include: Permission denied

error: failed to run custom build command for `blake3 v1.5.4`

Caused by:
  process didn't exit successfully: `/home/calin/Projects/Vessie/target/debug/build/blake3-d2f3a51f26eb2411/build-script-build` (exit status: 1)

/* ... */

  --- stderr

  error occurred: Command "cc" "-O0" "-ffunction-sections" "-fdata-sections" "-fPIC" "-gdwarf-4" "-fno-omit-frame-pointer" "-m64" "-Wall" "-Wextra" "-std=c11" "-o" "/home/calin/Projects/Vessie/target/debug/build/blake3-75236400a1f88d00/out/db3b6bfb95261072-blake3_sse2_x86-64_unix.o" "-c" "c/blake3_sse2_x86-64_unix.S" with args cc did not execute successfully (status code exit status: 1).

(You could probably already tell, but no, /* ... */ is not part of the output. I am just skipping a lot of what I think is useless debug info to save on reading time.)

  • Your Rust version.

cargo 1.80.0 (376290515 2024-07-16)
rustc 1.80.0 (051478957 2024-07-21)

  • Your C compiler version.

gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

  • Your CPU architecture.

x86_64

  • Any relevant environment variables you might have set, like CC.

Neither GCC nor CC is defined (which, to reiterate, has been okay before). I am not aware of any environment variables that could have been set or unset that would have caused this, as I don't modify them manually.

Another point to clarify: Is there anything specific about your own code that's related to the build failure?

No. Why would there be?
After all, I'm not directly using your crate; besides, the code that runs into this error literally only has a trait, which would have compiled if Bevy were not a dependency of the program.

Specifically, can you trigger the failure this way:

$ cargo new example
...
$ cd example
$ cargo add bevy
...
$ cargo build
...

Yes.
I get the exact same output for, presumably, the exact same reason.

CalinZBaenen commented 2 days ago

If any further information is needed, please let me know.

This is a very perplexing bug.

oconnor663 commented 2 days ago

Those Permission denied errors in your output above are very suspicious. What the build is trying to do there is to compile an empty or nearly empty C file, just to see what compiler flags are supported. If that's running into permissions errors, that's pretty surprising. If you create a file like /tmp/test.c:

int main() {
    return 0;
}

And then try to compile it like this:

$ cd /tmp
$ cc test.c

Do you see any output? For me that command succeeds with no output (and creates /tmp/a.out).

CalinZBaenen commented 1 day ago

This, for some reason, still causes permission errors... with no apparent explanation.

cc1: error: /usr/local/include/x86_64-linux-gnu: Permission denied
cc1: error: /usr/local/include: Permission denied

(I ran this test with both cc and gcc and got the same results.)

I am able to run it with sudo. I would say this usually suggests a program is misconfigured, or was erroneously installed using sudo; however, as far as I'm aware, CC/GCC comes with Ubuntu and consequently *should* not have such an error, so I'm not really sure what the issue could be.

[Note: the compiled file is owned by root (and has the lock symbol on it), but users with normal privileges can still run it normally.]

While I am not quite sure what's causing this behavior, it seems to me that this is more likely a bug with CC/GCC than with BLAKE3.
(However, personally, I would refrain from closing the issue until I can verify the Rust code can be compiled.)
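
For anyone following along: one way to see why cc1 touches /usr/local/include at all is to ask the compiler for its header search path. The following is only a sketch, assuming cc is GCC or Clang (both print this list when given -v); show_include_dirs is a hypothetical helper name, not something from this thread, and the output may be incomplete if the compiler aborts early.

```shell
# Hypothetical helper: print the directories cc searches for #include headers.
# Assumes cc is GCC or Clang; both emit this list on stderr when given -v.
show_include_dirs() {
    cc -E -v -x c /dev/null 2>&1 |
        sed -n '/search starts here:/,/End of search list/p'
}

show_include_dirs
```

Stock GCC builds on Ubuntu typically list /usr/local/include among their default directories, so seeing that path in the errors does not by itself imply a custom compiler.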

oconnor663 commented 1 day ago

Yeah this is definitely a "can't compile C for some reason" issue and not really a BLAKE3 issue, but now you've got me curious what the heck's going on :)

It looks like you might've installed some sort of custom compiler, presumably not through your system package manager, but maybe through a make install command or similar? The reason I say that is that /usr/local/ isn't where the system compiler goes. You probably have a system compiler at /usr/bin/cc or similar, but then /usr/local takes priority in your $PATH. And then my next guess is that whatever custom install process that was, it left those directories in a not-publicly-readable state, which you might be able to see if say ls -l /usr/local/include doesn't show universal read permissions. There's a decent chance that a chmod -R a+r on /usr/local will fix everything, but I want to be careful suggesting big sweeping commands when your system's in a sketchy state...
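
The "custom compiler shadowing the system one" guess above can be checked without changing anything on the system. This is a minimal sketch for a POSIX-style shell such as sh or bash; find_on_path is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: list every executable called "$1" on PATH, in lookup order.
# The first line printed is the one the shell would actually run.
find_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        candidate="${dir:-.}/$name"
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            echo "$candidate"
        fi
    done
    IFS=$old_ifs
    return 0
}

find_on_path cc
find_on_path gcc
```

If the only hits are under /usr/bin, nothing in /usr/local/bin is taking priority, and the permission problem would have to come from somewhere else.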

CalinZBaenen commented 1 day ago

It looks like you might've installed some sort of custom compiler, presumably not through your system package manager,

No(?).

The problem with this is that I have no recollection of installing such a custom compiler. I only ever remember installing Ubuntu, Cargo, and the rest of the Rust toolchain, so it would be immensely confusing if I did install a custom version of CC/GCC.
Furthermore, Ubuntu does come with GCC, according to a quick glance at the search results provided by Google, meaning I probably would have had to resolve some conflicts. (Right?)

You probably have a system compiler at /usr/bin/cc or similar, but then /usr/local takes priority in your $PATH.

Weird, /usr/local/include is not a path in my PATH environment variable; only other subdirectories of /usr/local are.

My PATH env-var, formatted for your sake, looks like this:

/home/calin/.nvm/versions/node/v21.7.3/bin
/home/calin/.cargo/bin
/home/calin/.local/bin
/usr/local/sbin
/usr/local/bin
/usr/sbin
/usr/bin
/sbin
/bin
/usr/games
/usr/local/games
/snap/bin
/home/calin/.deno/bin/
/snap/bin

(I'm not quite sure why /snap/bin is in PATH twice, nor am I sure about any consequences that might incur.)

Another oddity is that the only item in /usr/local/include, according to Nautilus run with sudo, is a folder called node, which I am assuming is in reference to Node and Node Package Manager (NPM), which is totally unrelated to C, GNU, CC, or GCC.

And then my next guess is that whatever custom install process that was, it left those directories in a not-publicly-readable state, which you might be able to see if say ls -l /usr/local/include doesn't show universal read permissions.

It seems like /usr/local/include is owned by root, and the same, seemingly, applies to its parent directory.
The command output looks like this:

total 4
drwxr-xr-x 6 root root 4096 Apr 16 16:59 node

There's a decent chance that a chmod -R a+r on /usr/local will fix everything, but I want to be careful suggesting big sweeping commands when your system's in a sketchy state...

Should /usr/local, and its subdirectories, not be owned by root (and/or sudo)?
If not, how would you recommend I get my system out of this "sketchy state"? The way my machine is operating now does not seem to be conducive to an efficient programming cycle.
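
One detail that may matter here: ls -l /usr/local/include lists the contents of that directory (hence the node entry), not the permissions of the directory itself or of its parents. A rough sketch of how to inspect those, using a hypothetical helper called show_path_perms:

```shell
# Hypothetical helper: show the mode and owner of a directory and each of its parents.
show_path_perms() {
    path=$1
    while :; do
        # -d prints the directory's own entry instead of listing its contents.
        ls -ld "$path" || echo "$path: cannot stat"
        case $path in
            /|.) break ;;
        esac
        path=$(dirname "$path")
    done
}

show_path_perms /usr/local/include
```

Root ownership is normal for these directories. What a regular user needs is read and execute (search) permission on every directory along the way, so a line starting with something like drwx------ would be consistent with the Permission denied errors, although nothing in this thread confirms that is what happened.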