rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Enable `f16` and `f128` in assembly on platforms that support it #125398

Open tgross35 opened 3 months ago

tgross35 commented 3 months ago

The code below should work, but it currently fails with an error saying that f16 cannot be used for registers:

#![feature(f16, f128)]

use core::arch::asm;

#[inline(never)]
pub fn f32_to_f16(a: f32) -> f16 {
    a as f16
}

#[inline(never)]
pub fn f32_to_f16_asm(a: f32) -> f16 {
    let ret: f16;
    unsafe {
        asm!(
            "fcvt    {ret:h}, {a:s}",
            a = in(vreg) a,
            ret = lateout(vreg) ret,
            options(nomem, nostack),
        );
    }

    ret
}

On aarch64 the first function generates:

example::f32_to_f16::hc897184dfb47f3d6:
        fcvt    h0, s0
        ret

f16 should be accepted as a vreg operand on aarch64 so that the same code can be reproduced with inline assembly.
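Until f16 is accepted, one possible stopgap (not something proposed in this issue, just a sketch) is to carry the half-precision bits through a u16, which aarch64 vreg already accepts as a 16-bit integer. The function name f32_to_f16_bits and the Option return type are made up for illustration; on other targets the sketch simply returns None.

```rust
// Hypothetical stopgap: return the raw f16 bit pattern as a u16.
// Only the aarch64 version uses inline assembly.
#[cfg(target_arch = "aarch64")]
pub fn f32_to_f16_bits(a: f32) -> Option<u16> {
    let ret: u16;
    unsafe {
        core::arch::asm!(
            // Same instruction as in the issue; `ret` is a u16 instead of an f16.
            "fcvt    {ret:h}, {a:s}",
            a = in(vreg) a,
            ret = lateout(vreg) ret,
            options(nomem, nostack),
        );
    }
    Some(ret)
}

// Fallback for every other target: no conversion is attempted.
#[cfg(not(target_arch = "aarch64"))]
pub fn f32_to_f16_bits(_a: f32) -> Option<u16> {
    None
}

fn main() {
    match f32_to_f16_bits(1.0) {
        Some(bits) => println!("f16 bits: {bits:#06x}"),
        None => println!("not an aarch64 target, nothing converted"),
    }
}
```

This loses the type-level guarantee that the value is an f16, which is exactly what the issue asks the compiler to provide.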


The following other platforms also apparently have some level of instruction support, but are less well documented:

Additionally, for f128:

Tracking issue: https://github.com/rust-lang/rust/issues/116909

tgross35 commented 3 months ago

I'm adding E-Easy because a PR that only enables support for aarch64 should be pretty easy: start around https://github.com/rust-lang/rust/blob/b54dd08a84f3c07efbc2aaf63c3df219ae680a03/compiler/rustc_hir_analysis/src/check/intrinsicck.rs#L65-L66 and massage the new types in. Actually figuring out the rules for the rest of the platforms will be harder, but that can come later.

Sample for reference: https://rust.godbolt.org/z/zK4qha1qo

@rustbot label +T-compiler +E-Easy +F-f16_and_f128 +A-inline-assembly -needs-triage

lengrongfu commented 3 months ago

@tgross35 I can try to submit a PR, can you give me some guidance?

tgross35 commented 3 months ago

Hi @lengrongfu, thanks for the interest!

This should be pretty easy I think. Start by making a test in tests/ui/asm/ that contains the assembly function from my original post. Make sure this fails when you run ./x t --stage 1 path/to/your/new/test.rs.

Then just find where the error is emitted (search the codebase for "cannot use value of type") and work backwards from that until the test passes. This will probably mean adding F16 to InlineAsmType and then chasing down errors.

We will need to make sure that this works on platforms with support (e.g. aarch64) but still fails on those without it (e.g. x86). Just focus on getting aarch64 to build first.
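As a rough picture of the behavior described above, here is a toy model of the per-architecture check. It is only a sketch: Arch, AsmType, and float_reg_accepts are hypothetical names invented for this example and are far simpler than the real definitions in rustc.

```rust
// Toy model only: these types and this function are hypothetical and do not
// mirror the actual rustc data structures.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Arch {
    AArch64,
    X86,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AsmType {
    I32,
    F16,
    F32,
    F64,
}

/// Returns whether `ty` may be placed in a floating-point/vector register
/// on `arch` in this simplified model.
fn float_reg_accepts(arch: Arch, ty: AsmType) -> bool {
    match (arch, ty) {
        // Goal of the issue: aarch64 `vreg` should accept `f16`.
        (Arch::AArch64, AsmType::F16) => true,
        // Should keep failing until rules for x86 are worked out.
        (Arch::X86, AsmType::F16) => false,
        // Types that are already accepted on both architectures.
        (_, AsmType::I32 | AsmType::F32 | AsmType::F64) => true,
    }
}

fn main() {
    let types = [AsmType::I32, AsmType::F16, AsmType::F32, AsmType::F64];
    for arch in [Arch::AArch64, Arch::X86] {
        for ty in types {
            println!("{arch:?} {ty:?}: accepted = {}", float_reg_accepts(arch, ty));
        }
    }
}
```

The point of the two explicit F16 arms is that a test suite needs both a passing case (aarch64) and a still-failing case (x86).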

There is a compiler help stream on Zulip at https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp; feel free to ask there if you get stuck! It's also not a bad idea to post a draft PR as soon as you have some basic work done, even if it isn't passing yet.