llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

On 32-bit x86, `half` return ABI is incorrect when SSE is enabled but SSE2 is disabled #112890

Open beetrees opened 1 month ago

beetrees commented 1 month ago

Consider the following IR (compiler explorer: https://godbolt.org/z/nMrcsed1a):

target triple = "i586-unknown-linux-gnu"

define half @only_sse(half) #0 {
    ret half %0
}

attributes #0 = { "target-features"="+sse,-sse2" }

define half @sse_and_sse2(half) #1 {
    ret half %0
}

attributes #1 = { "target-features"="+sse,+sse2" }

The 32-bit x86 ABI for returning `half` is specified as using the `xmm0` register. As both `only_sse` and `sse_and_sse2` have SSE registers available, they should both be able to use the specified ABI. However, LLVM currently only compiles `sse_and_sse2` correctly, with `only_sse` incorrectly returning the `half` in `eax` instead.

phoebewang commented 1 month ago

We don't have instructions to load/store half to xmm registers. Likewise, we also use a GPR when x87 is not usable: https://godbolt.org/z/TE8c99aTK. The only problem is that we should emit a diagnostic for these cases. I have a proposal to verify the ABI, see #111690.

beetrees commented 1 month ago

> We don't have instructions to load/store half to xmm registers.

AFAIK, the movss instruction is available when only sse is enabled and can be used to load and store values to and from xmm0 via the stack.
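
To make the stack round trip concrete, here is a minimal sketch in C using SSE1 intrinsics. It is not LLVM's lowering, only an illustration of the instruction sequence. The helper name `roundtrip_half_bits` is hypothetical, and putting the 16 `half` bits in the low half of a zero-extended 4-byte slot is an assumption made for the example. On targets without SSE the sketch falls back to a plain memory copy so that it still compiles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE1 intrinsics: _mm_load_ss / _mm_store_ss */
#endif

/* Hypothetical helper: moves the raw 16 bits of a half through an XMM
 * register using only SSE1 scalar moves and 4-byte memory slots. */
uint16_t roundtrip_half_bits(uint16_t bits) {
    uint32_t slot = bits;  /* assumption: zero-extend into a 4-byte slot */
    float in_mem, out_mem;

    assert(sizeof in_mem == sizeof slot);
    memcpy(&in_mem, &slot, sizeof slot);      /* spill the bits to memory */

#if defined(__SSE__)
    __m128 reg = _mm_load_ss(&in_mem);        /* movss: memory -> low lane */
    _mm_store_ss(&out_mem, reg);              /* movss: low lane -> memory */
#else
    memcpy(&out_mem, &in_mem, sizeof out_mem); /* fallback without SSE */
#endif

    memcpy(&slot, &out_mem, sizeof slot);     /* reload the bits as integer */
    return (uint16_t)slot;
}
```

Since `movss` copies 32 bits without interpreting them, `roundtrip_half_bits(0x3C00)` (the bit pattern of 1.0 in `half`) should come back unchanged, as should NaN patterns such as `0x7E00`.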

phoebewang commented 1 month ago

> > We don't have instructions to load/store half to xmm registers.
>
> AFAIK, the movss instruction is available when only sse is enabled and can be used to load and store values to and from xmm0 via the stack.

If you mean going through the stack for each load/store, then yes, we could do it that way, but it's not worth the complexity. SSE was designed for the float type only; even double is not supported without SSE2, let alone half. Both GCC and LLVM error out for this https://godbolt.org/z/TE8c99aTK. Other front ends should do the same.

RalfJung commented 1 month ago

> Both GCC and LLVM error out for this https://godbolt.org/z/TE8c99aTK

This doesn't show any error?

phoebewang commented 1 month ago

> > Both GCC and LLVM error out for this https://godbolt.org/z/TE8c99aTK
>
> This doesn't show any error?

Sorry, here is the right link: https://godbolt.org/z/GYETdPaEe