Open brunocaballero opened 5 months ago
This is a known issue, but I don't know how to fix it.
I found two ways to fix this. But before this, I modified the sample above slightly in order to avoid optimizations ruining what we're trying to test here. Though, even if the sample above is used as is, the fix below still works.
use gemm_f16::f16;

fn main() {
    println!("Hello, fp16!");
    let a = core::hint::black_box(f16::from_f32(3.1f32));
    let b = core::hint::black_box(f16::from_f32(2.2f32));
    let c = core::hint::black_box(|| a * b)();
    if c.is_normal() {
        println!("Is normal!");
    }
    println!("Result {c}")
}
I started by creating the file .cargo/config.toml in order to tell cargo that I'm targeting AArch64:
[build]
# $ rustup target add aarch64-unknown-linux-musl
target = "aarch64-unknown-linux-musl"
And then I added the following section to that file:
[target.aarch64-unknown-linux-musl]
linker = "clang"
rustflags = [
    "-Clink-arg=--target=aarch64-unknown-linux-musl",
    "-Clink-arg=-fuse-ld=lld",
    "-Ctarget-feature=+fp16,+fhm"
]
If the version of clang/lld installed on the system is too old, then download and extract a recent clang/lld toolchain somewhere, and use the following instead:
[target.aarch64-unknown-linux-musl]
linker = "clang-18"
rustflags = [
    "-Clink-arg=--target=aarch64-unknown-linux-musl",
    "-Clink-arg=-fuse-ld=lld-18",
    "-Ctarget-feature=+fp16,+fhm"
]
If gcc is preferred, then download and extract a recent cross-compilation gcc toolchain for AArch64 somewhere, and use the following instead:
linker = "<somewhere>/arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gcc"
rustflags = [ "-Ctarget-feature=+fp16,+fhm" ]
Once that is done, running cargo build and cargo build --release both succeed, and disassembling the binaries shows (among others) the instructions:
0000000000224f08 <half::binary16::arch::aarch64::multiply_f16_fp16>:
...
224f14: 1e270000 fmov s0, w0
224f18: 1e204001 fmov s1, s0
224f1c: 1e270020 fmov s0, w1
224f20: 1e204002 fmov s2, s0
224f24: 1ee20820 fmul h0, h1, h2
...
and also:
0000000000224adc <fp16::main>:
...
224b44: 7d400100 ldr h0, [x8]
224b48: 7d400121 ldr h1, [x9]
224b4c: 1ee10802 fmul h2, h0, h1
224b50: 1e260048 fmov w8, s2
...
Running the binaries through QEMU shows:
$ target/aarch64-unknown-linux-musl/debug/fp16
Hello, fp16!
Is normal!
Result 6.8164063
$ target/aarch64-unknown-linux-musl/release/fp16
Hello, fp16!
Is normal!
Result 6.8164063
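As a sanity check on the printed value, here is a minimal std-only sketch that emulates the f16 rounding with f32 arithmetic. It does not use the half or gemm-f16 crates. `round_to_f16_precision` and `emulated_f16_mul` are hypothetical helper names of mine, and the sketch assumes all values stay in the normal f16 range (no overflow, subnormals, infinities or NaN).

```rust
/// Rounds an f32 to the nearest value that has only 10 explicit mantissa
/// bits, like f16 (round to nearest, ties to even).
/// Assumption: the input is in the normal f16 range; overflow, subnormals,
/// infinities and NaN are not handled in this sketch.
pub fn round_to_f16_precision(x: f32) -> f32 {
    const DROPPED: u32 = 13; // f32 has 23 mantissa bits, f16 has 10
    let bits = x.to_bits();
    let half = 1u32 << (DROPPED - 1);
    // Lowest bit that survives the rounding, used to break ties towards even.
    let lsb = (bits >> DROPPED) & 1;
    let rounded = bits.wrapping_add(half - 1 + lsb) & !((1u32 << DROPPED) - 1);
    f32::from_bits(rounded)
}

/// Emulates an f16 multiply: round both inputs, multiply, round the result.
/// The f32 product of two 11-bit significands is exact, so only the final
/// rounding step loses precision.
pub fn emulated_f16_mul(a: f32, b: f32) -> f32 {
    round_to_f16_precision(round_to_f16_precision(a) * round_to_f16_precision(b))
}
```

With this, 3.1 and 2.2 round to 3.099609375 and 2.19921875, and `emulated_f16_mul(3.1, 2.2)` gives 6.81640625, which Rust prints as 6.8164063, the same value as in the output above.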
For reference, I found the feature names fp16 and fhm (specified above in -Ctarget-feature=+fp16,+fhm) through the following command:
$ rustc --target=aarch64-unknown-linux-musl --print target-features
Features supported by rustc for this target:
...
fhm - Enable FP16 FML instructions (FEAT_FHM).
flagm - Enable v8.4-A Flag Manipulation Instructions (FEAT_FlagM).
fp16 - Full FP16 (FEAT_FP16).
...
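The same feature names can also be checked from Rust code with `cfg!(target_feature = "...")`, which may be a quicker way than disassembling to confirm that the rustflags were actually picked up. A minimal sketch, where `compile_time_fp16_features` is a hypothetical helper name of mine:

```rust
/// Lists which of the two half-precision features were enabled at compile
/// time. The list is empty on non-AArch64 targets, or when the crate was
/// built without -Ctarget-feature=+fp16,+fhm.
pub fn compile_time_fp16_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(all(target_arch = "aarch64", target_feature = "fp16")) {
        enabled.push("fp16");
    }
    if cfg!(all(target_arch = "aarch64", target_feature = "fhm")) {
        enabled.push("fhm");
    }
    enabled
}
```

Printing the result from main (for example with `println!("{:?}", compile_time_fp16_features())`) should show both names when the config above is in effect, and an empty list otherwise.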
On a Raspberry Pi 5 with 64-bit Raspberry Pi OS (Debian 12 bookworm) and Rust 1.80, the solution also works.
.cargo/config.toml
[build]
rustflags = [
"-Ctarget-feature=+fp16,+fhm"
]
Thank you for sharing.
Resolved it for my specific scenario:
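In src-tauri/Cargo.toml I had to add the following (cargo-features has to sit at the very top of the manifest, before [package]):
cargo-features = ["profile-rustflags"]

[profile.dev]
rustflags = ["-C", "target-feature=+fp16,+fhm"]
Since profile-rustflags is a nightly-only Cargo feature, I then switched to nightly and built: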
rustup toolchain install nightly
rustup override set nightly
rustup target add aarch64-linux-android
bun run tauri android dev
Boom - android/aarch64 compiled with gemm-fp16!
Chiming in here to say I got the same issue compiling for aarch64-pc-windows-msvc, and one of the workarounds mentioned in https://github.com/sarah-quinones/gemm/issues/31#issuecomment-2254635277 above worked:
Under .cargo/config.toml:
[build]
rustflags = [
"-Ctarget-feature=+fp16,+fhm"
]
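One caveat with all of these workarounds: they bake +fp16,+fhm into the binary unconditionally, so it will presumably hit an illegal instruction on an AArch64 CPU that lacks those extensions. If that matters for you, a runtime check at startup is one option. A minimal sketch using the standard library's `is_aarch64_feature_detected!` macro; `cpu_supports_fp16` is a hypothetical helper name of mine, and how to react when it returns false is up to the application:

```rust
/// Returns true when the running CPU reports both FEAT_FP16 and FEAT_FHM.
/// On non-AArch64 targets this always returns false.
pub fn cpu_supports_fp16() -> bool {
    #[cfg(target_arch = "aarch64")]
    {
        std::arch::is_aarch64_feature_detected!("fp16")
            && std::arch::is_aarch64_feature_detected!("fhm")
    }
    #[cfg(not(target_arch = "aarch64"))]
    {
        false
    }
}
```

Calling this before any f16 arithmetic and exiting with a clear message when it returns false would at least turn a crash into a readable error.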
Hi,
I created a small Rust example:
Building in release mode for target AArch64/Linux works, but it fails when building in debug mode.
error: instruction requires: fullfp16
But I am not sure in which context fullfp16 is not supported. Maybe the LLVM toolchain?