Closed: futile closed this issue 6 months ago
The _mm_add_ss intrinsic isn't implemented yet. It is used by the glam linear algebra crate, which bevy uses internally. It looks like glam uses a fair number of unimplemented intrinsics. I'm going to work on implementing them.
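For readers unfamiliar with the intrinsic, here is a minimal sketch of what _mm_add_ss computes: it adds only the lowest f32 lane of two 4-lane SSE vectors and copies the upper three lanes from the first operand. The helper name `add_lowest_lane` and the scalar fallback for non-x86_64 targets are assumptions made for illustration; this is not how glam or cg_clif use or implement the intrinsic.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm_add_ss, _mm_loadu_ps, _mm_storeu_ps};

/// Hypothetical helper: returns [a[0] + b[0], a[1], a[2], a[3]].
#[cfg(target_arch = "x86_64")]
pub fn add_lowest_lane(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SSE is part of the x86_64 baseline, so these intrinsics are always
    // available here; the pointers come from valid 4-element arrays.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        // Lane 0 is a[0] + b[0]; lanes 1..=3 are copied from `va`.
        let sum = _mm_add_ss(va, vb);
        _mm_storeu_ps(out.as_mut_ptr(), sum);
    }
    out
}

/// Scalar fallback with the same semantics for other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_lowest_lane(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1], a[2], a[3]]
}
```

Calling `add_lowest_lane([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])` yields `[11.0, 2.0, 3.0, 4.0]`. A codegen backend that lacks the intrinsic cannot lower the `_mm_add_ss` call, which is consistent with the trap reported in this issue.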
> compile time for my project (for a full debug-build) went from ~4min to ~1min when enabling cranelift!
How much of a difference is it if you remove https://github.com/futile/ultra-game/blob/831eeb43b56c1d9dbc9422d130095cd14da8e145/Cargo.toml#L13-L15? Anything opt-level > 0 is kind of equivalent to opt-level = 1 with LLVM in terms of optimizations. (I haven't actually measured the runtime performance difference, but Cranelift doesn't have a lot of optimizations it supports, so it is far from as fast in terms of runtime perf as opt-level = 3 with LLVM.) If you want dependencies to be fully optimized, you would have to build them with LLVM and then build just your own code with cg_clif. For Bevy this currently doesn't work due to an ABI incompatibility though: https://github.com/rust-lang/rustc_codegen_cranelift/issues/1449
Turns out there were only three intrinsics missing.
Oh wow, that was super fast, thanks a lot! :) Can I somehow test this/should I just test the next 1-2 nightly rustc versions?
You can download a precompiled version from https://github.com/rust-lang/rustc_codegen_cranelift/releases/tag/dev. Unpack it anywhere you like and use the cargo-clif executable inside it in place of cargo.
It may take a couple of days before I'm able to update the version distributed with rustup.
> How much of a difference is it if you remove https://github.com/futile/ultra-game/blob/831eeb43b56c1d9dbc9422d130095cd14da8e145/Cargo.toml#L13-L15? Anything opt-level > 0 is kind of equivalent to opt-level = 1 with LLVM in terms of optimizations. (I haven't actually measured the runtime performance difference, but Cranelift doesn't have a lot of optimizations it supports, so it is far from as fast in terms of runtime perf as opt-level = 3 with LLVM.)
Ah good point! Yeah, running without opt-level > 0 (i.e., commenting out what you mentioned, and also opt-level = 1 for dev) changes the times to 56s with cranelift, and 65s with LLVM, so pretty much equal. Well, still ~10% faster, but much less of a difference than before :sweat:
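As a quick check of the numbers quoted in this thread, the relative reduction in build time can be computed as (before - after) / before. The function name `relative_reduction` is hypothetical, and the inputs are the approximate timings reported above (65s vs 56s without optimizations, roughly 4min vs 1min with them), not new measurements.

```rust
/// Hypothetical helper: fraction of build time saved when going from
/// `before_secs` to `after_secs`.
pub fn relative_reduction(before_secs: f64, after_secs: f64) -> f64 {
    (before_secs - after_secs) / before_secs
}

/// Timings taken from the thread; the 4min/1min pair is approximate.
pub fn thread_reductions() -> (f64, f64) {
    // Without opt-level overrides: 65s (LLVM) vs 56s (Cranelift), about 0.138.
    let unoptimized = relative_reduction(65.0, 56.0);
    // With the original opt-level settings: ~240s vs ~60s, exactly 0.75.
    let optimized = relative_reduction(240.0, 60.0);
    (unoptimized, optimized)
}
```

So the unoptimized build is roughly 14% faster with Cranelift, in line with the "~10%" estimate, while most of the original 75% saving appears to have come from the opt-level settings rather than from the backend alone.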
> If you want dependencies to be fully optimized, you would have to build them with LLVM and then build just your own code with cg_clif. For Bevy this currently doesn't work due to an ABI incompatibility though: #1449
Cool, thanks for the tip, will keep it in mind & subscribed! :)
> Can I somehow test this/should I just test the next 1-2 nightly rustc versions?
The fix will be available on the next nightly.
After compiling my bevy-project with the new cranelift backend, using the nightly rustc from 2024-03-01, I get the following crash due to trap when running it:

Without RUSTFLAGS="-Zcodegen-backend=cranelift" it runs fine (e.g., the "DRM kernel driver ..." message isn't critical).

To reproduce: Run the failing command with this repo & commit: https://github.com/futile/ultra-game/tree/831eeb43b56c1d9dbc9422d130095cd14da8e145
This cranelift backend is a great project, compile time for my project (for a full debug-build) went from ~4min to ~1min when enabling cranelift! Really cool, thanks a lot for your work! :)