Closed Quuxplusone closed 4 years ago
Attached compiler_rt_armv8.ll.gz
(263601 bytes, application/gzip): compiler_rt_armv8.ll.gz
Attached c_armv8.ll.gz
(124427 bytes, application/gzip): c_armv8.ll.gz
Attached testarmf16.ll.gz
(529410 bytes, application/gzip): testarmf16.ll.gz
Oops, armv8 is a typo. It should be -target armv7a. Still repros the same though.
Thanks for the repro!
Bisection points to the same commit as Bug 47001:
commit a255931c40558edf87994c2a8ed9b274c3fbda30
Author: Lucas Prates <lucas.prates@arm.com>
Date: Tue Jun 9 09:45:47 2020 +0100
[ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend
Summary:
Half-precision floating point arguments and returns are currently
promoted to either float or int32 in clang's CodeGen and there's
no existing support for the lowering of `half` arguments and returns
from IR in AArch32's backend.
Such frontend coercions, implemented as coercion through memory
in clang, can cause a series of issues in argument lowering, as causing
arguments to be stored on the wrong bits on big-endian architectures
and incurring in missing overflow detections in the return of certain
functions.
This patch introduces the handling of half-precision arguments and returns in
the backend using the actual "half" type on the IR. Using the "half"
type the backend is able to properly enforce the AAPCS' directions for
those arguments, making sure they are stored on the proper bits of the
registers and performing the necessary floating point convertions.
Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer
Reviewed By: ostannard
Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75169
But as you point out, the fix for that (3d943bcd223e on trunk / ba3413982cbd7a5b5aeaf2ea34e0a91d5561202d on the 11.x branch) doesn't fix this new test case.
Lucas, can you take a look? Since we're now past the release date it would be great to get this fixed as soon as possible.
Given the commit message about half-precision floats not really being supported by llvm before, I wonder how this worked for Zig in previous releases.
Andrew, if a fix for this doesn't materialize soon, would it be possible for Zig to work around this by promoting arguments and return values to 32-bit float?
I will give that a try - thanks for the suggestion.
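For context on why the suggested promotion is safe at the value level: every IEEE 754 half-precision value is exactly representable as a 32-bit float, so widening an argument to `float` at a call boundary and narrowing it back loses nothing (NaN payloads aside). The sketch below is only an illustration in Python using the standard `struct` module; the helper names are hypothetical and it is not the code Zig or LLVM uses.

```python
import struct


def half_bits_to_float(bits):
    """Decode a 16-bit IEEE 754 half-precision bit pattern into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def float_to_half_bits(value):
    """Encode a Python float as a 16-bit half-precision bit pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def is_half_nan(bits):
    """True if the bit pattern is a half-precision NaN (exponent all ones, mantissa non-zero)."""
    return (bits & 0x7C00) == 0x7C00 and (bits & 0x03FF) != 0


def round_trip_through_f32(bits):
    """Widen a half value to 32-bit float (as a promoted argument would be), then narrow it back."""
    value = half_bits_to_float(bits)
    # Force the value through a 32-bit float encoding, mimicking the promoted ABI slot.
    widened = struct.unpack("<f", struct.pack("<f", value))[0]
    return float_to_half_bits(widened)


def all_non_nan_halves_survive():
    """Check every non-NaN half bit pattern survives the widen/narrow round trip."""
    return all(
        round_trip_through_f32(bits) == bits
        for bits in range(0x10000)
        if not is_half_nan(bits)
    )
```

Calling `all_non_nan_halves_survive()` exercises all 65536 bit patterns. NaNs are excluded because their payload bits are not guaranteed to survive the conversion.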
Sadly it doesn't seem like there's any progress here. I don't think we can hold the release for this.
Hi Hans,
From what I was able to check while debugging, it seems the issue is related to
the lack of Calling Convention information for one of the parameters in the
call to `expectEqual`.
I'm investigating the reason for the missing information and I believe I'm
close to having a fix. I understand we're past the release date,
though.
Quick update: I have a very small patch that fixes the issue.
I'm currently working on a couple of tests for it and should publish it on Phab
soon.
Patch for review: https://reviews.llvm.org/D87844.
Fixed by commit 53d238a961d1:
>$ clang -o test testarmf16.ll -nostartfiles -nodefaultlibs \
> -target armv7a-unknown-linux-unknown compiler_rt_armv8.ll c_armv8.ll \
> -fuse-ld=/usr/bin/ld.lld -static -Xclang -target-feature -Xclang -32bit -Xclang \
> -target-feature -Xclang -8msecext -Xclang -target-feature -Xclang -a76 -Xclang \
> -target-feature -Xclang +aclass -Xclang -target-feature -Xclang \
> -acquire-release -Xclang -target-feature -Xclang -aes -Xclang -target-feature \
> -Xclang -avoid-movs-shop -Xclang -target-feature -Xclang -avoid-partial-cpsr \
> -Xclang -target-feature -Xclang -bf16 -Xclang -target-feature -Xclang -cde \
> -Xclang -target-feature -Xclang -cdecp0 -Xclang -target-feature -Xclang -cdecp1 \
> -Xclang -target-feature -Xclang -cdecp2 -Xclang -target-feature -Xclang -cdecp3 \
> -Xclang -target-feature -Xclang -cdecp4 -Xclang -target-feature -Xclang -cdecp5 \
> -Xclang -target-feature -Xclang -cdecp6 -Xclang -target-feature -Xclang -cdecp7 \
> -Xclang -target-feature -Xclang -cheap-predicable-cpsr -Xclang -target-feature \
> -Xclang -crc -Xclang -target-feature -Xclang -crypto -Xclang -target-feature \
> -Xclang +d32 -Xclang -target-feature -Xclang +db -Xclang -target-feature \
> -Xclang -dfb -Xclang -target-feature -Xclang -disable-postra-scheduler -Xclang \
> -target-feature -Xclang -dont-widen-vmovs -Xclang -target-feature -Xclang \
> -dotprod -Xclang -target-feature -Xclang +dsp -Xclang -target-feature -Xclang \
> -execute-only -Xclang -target-feature -Xclang -expand-fp-mlx -Xclang \
> -target-feature -Xclang -exynos -Xclang -target-feature -Xclang -fp16 -Xclang \
> -target-feature -Xclang -fp16fml -Xclang -target-feature -Xclang +fp64 -Xclang \
> -target-feature -Xclang -fp-armv8 -Xclang -target-feature -Xclang -fp-armv8d16 \
> -Xclang -target-feature -Xclang -fp-armv8d16sp -Xclang -target-feature -Xclang \
> -fp-armv8sp -Xclang -target-feature -Xclang -fpao -Xclang -target-feature \
> -Xclang +fpregs -Xclang -target-feature -Xclang -fpregs16 -Xclang \
> -target-feature -Xclang +fpregs64 -Xclang -target-feature -Xclang -fullfp16 \
> -Xclang -target-feature -Xclang -fuse-aes -Xclang -target-feature -Xclang \
> -fuse-literals -Xclang -target-feature -Xclang +v4t -Xclang -target-feature \
> -Xclang +v5t -Xclang -target-feature -Xclang +v5te -Xclang -target-feature \
> -Xclang +v6 -Xclang -target-feature -Xclang +v6k -Xclang -target-feature \
> -Xclang +v6m -Xclang -target-feature -Xclang +v6t2 -Xclang -target-feature \
> -Xclang +v7 -Xclang -target-feature -Xclang +v7clrex -Xclang -target-feature \
> -Xclang -v8.1a -Xclang -target-feature -Xclang -v8.1m.main -Xclang \
> -target-feature -Xclang -v8.2a -Xclang -target-feature -Xclang -v8.3a -Xclang \
> -target-feature -Xclang -v8.4a -Xclang -target-feature -Xclang -v8.5a -Xclang \
> -target-feature -Xclang -v8.6a -Xclang -target-feature -Xclang -v8 -Xclang \
> -target-feature -Xclang +v8m -Xclang -target-feature -Xclang -v8m.main -Xclang \
> -target-feature -Xclang -hwdiv -Xclang -target-feature -Xclang -hwdiv-arm \
> -Xclang -target-feature -Xclang -i8mm -Xclang -target-feature -Xclang -iwmmxt \
> -Xclang -target-feature -Xclang -iwmmxt2 -Xclang -target-feature -Xclang -lob \
> -Xclang -target-feature -Xclang -long-calls -Xclang -target-feature -Xclang \
> -loop-align -Xclang -target-feature -Xclang -m3 -Xclang -target-feature -Xclang \
> -mclass -Xclang -target-feature -Xclang -mp -Xclang -target-feature -Xclang \
> -muxed-units -Xclang -target-feature -Xclang -mve -Xclang -target-feature \
> -Xclang -mve.fp -Xclang -target-feature -Xclang -mve1beat -Xclang \
> -target-feature -Xclang -mve2beat -Xclang -target-feature -Xclang -mve4beat \
> -Xclang -target-feature -Xclang -nacl-trap -Xclang -target-feature -Xclang \
> +neon -Xclang -target-feature -Xclang -neon-fpmovs -Xclang -target-feature \
> -Xclang -neonfp -Xclang -target-feature -Xclang -no-branch-predictor -Xclang \
> -target-feature -Xclang -no-movt -Xclang -target-feature -Xclang \
> -no-neg-immediates -Xclang -target-feature -Xclang -noarm -Xclang \
> -target-feature -Xclang -nonpipelined-vfp -Xclang -target-feature -Xclang \
> +perfmon -Xclang -target-feature -Xclang -prefer-ishst -Xclang -target-feature \
> -Xclang -prefer-vmovsr -Xclang -target-feature -Xclang -prof-unpr -Xclang \
> -target-feature -Xclang -r4 -Xclang -target-feature -Xclang -ras -Xclang \
> -target-feature -Xclang -rclass -Xclang -target-feature -Xclang -read-tp-hard \
> -Xclang -target-feature -Xclang -reserve-r9 -Xclang -target-feature -Xclang \
> -ret-addr-stack -Xclang -target-feature -Xclang -sb -Xclang -target-feature \
> -Xclang -sha2 -Xclang -target-feature -Xclang -slow-fp-brcc -Xclang \
> -target-feature -Xclang -slow-load-D-subreg -Xclang -target-feature -Xclang \
> -slow-odd-reg -Xclang -target-feature -Xclang -slow-vdup32 -Xclang \
> -target-feature -Xclang -slow-vgetlni32 -Xclang -target-feature -Xclang \
> -slowfpvfmx -Xclang -target-feature -Xclang -slowfpvmlx -Xclang -target-feature \
> -Xclang -soft-float -Xclang -target-feature -Xclang -splat-vfp-neon -Xclang \
> -target-feature -Xclang -strict-align -Xclang -target-feature -Xclang -swift \
> -Xclang -target-feature -Xclang +thumb2 -Xclang -target-feature -Xclang \
> -thumb-mode -Xclang -target-feature -Xclang -trustzone -Xclang -target-feature \
> -Xclang -use-misched -Xclang -target-feature -Xclang -armv2 -Xclang \
> -target-feature -Xclang -armv2a -Xclang -target-feature -Xclang -armv3 -Xclang \
> -target-feature -Xclang -armv3m -Xclang -target-feature -Xclang -armv4 -Xclang \
> -target-feature -Xclang -armv4t -Xclang -target-feature -Xclang -armv5t -Xclang \
> -target-feature -Xclang -armv5te -Xclang -target-feature -Xclang -armv5tej \
> -Xclang -target-feature -Xclang -armv6 -Xclang -target-feature -Xclang -armv6j \
> -Xclang -target-feature -Xclang -armv6k -Xclang -target-feature -Xclang \
> -armv6kz -Xclang -target-feature -Xclang -armv6-m -Xclang -target-feature \
> -Xclang -armv6s-m -Xclang -target-feature -Xclang -armv6t2 -Xclang \
> -target-feature -Xclang +armv7-a -Xclang -target-feature -Xclang -armv7e-m \
> -Xclang -target-feature -Xclang -armv7k -Xclang -target-feature -Xclang \
> -armv7-m -Xclang -target-feature -Xclang -armv7-r -Xclang -target-feature \
> -Xclang -armv7s -Xclang -target-feature -Xclang -armv7ve -Xclang \
> -target-feature -Xclang -armv8-a -Xclang -target-feature -Xclang -armv8-m.base \
> -Xclang -target-feature -Xclang -armv8-m.main -Xclang -target-feature -Xclang \
> -armv8-r -Xclang -target-feature -Xclang -armv8.1-a -Xclang -target-feature \
> -Xclang -armv8.1-m.main -Xclang -target-feature -Xclang -armv8.2-a -Xclang \
> -target-feature -Xclang -armv8.3-a -Xclang -target-feature -Xclang -armv8.4-a \
> -Xclang -target-feature -Xclang -armv8.5-a -Xclang -target-feature -Xclang \
> -armv8.6-a -Xclang -target-feature -Xclang +vfp2 -Xclang -target-feature \
> -Xclang +vfp2sp -Xclang -target-feature -Xclang +vfp3 -Xclang -target-feature \
> -Xclang +vfp3d16 -Xclang -target-feature -Xclang +vfp3d16sp -Xclang \
> -target-feature -Xclang +vfp3sp -Xclang -target-feature -Xclang -vfp4 -Xclang \
> -target-feature -Xclang -vfp4d16 -Xclang -target-feature -Xclang -vfp4d16sp \
> -Xclang -target-feature -Xclang -vfp4sp -Xclang -target-feature -Xclang \
> -virtualization -Xclang -target-feature -Xclang -vldn-align -Xclang \
> -target-feature -Xclang -vmlx-forwarding -Xclang -target-feature -Xclang \
> -vmlx-hazards -Xclang -target-feature -Xclang -wide-stride-vfp -Xclang \
> -target-feature -Xclang -xscale -Xclang -target-feature -Xclang -zcz
>$ qemu-arm ./test
>$ echo $?
>0
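The command above repeats one pattern for each ARM subtarget feature: `-Xclang -target-feature -Xclang <feature>`, where a leading `+` enables a feature and a leading `-` disables it. A minimal sketch of building that argument list programmatically follows; `target_feature_args` is a hypothetical helper written for illustration, not part of clang or Zig, and the feature names in the usage example are taken from the command above.

```python
def target_feature_args(features):
    """Expand feature strings such as '+neon' or '-fp16' into clang driver arguments.

    Each feature becomes four tokens: -Xclang -target-feature -Xclang <feature>.
    """
    args = []
    for feature in features:
        # Every feature must carry an explicit enable (+) or disable (-) prefix.
        if not feature or feature[0] not in "+-":
            raise ValueError("feature must start with '+' or '-': %r" % (feature,))
        args.extend(["-Xclang", "-target-feature", "-Xclang", feature])
    return args


def build_command(inputs, features, target="armv7a-unknown-linux-unknown"):
    """Assemble a clang invocation as an argument list (it is not executed here)."""
    return ["clang", "-o", "test", *inputs, "-target", target, *target_feature_args(features)]
```

For example, `build_command(["testarmf16.ll"], ["+neon", "-fp16", "-fullfp16"])` yields a list that can be passed to `subprocess.run` without shell quoting concerns.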
Keep this open until merged into 11.0
Cherry-picked to 11.x as b513e1963f3a7edc897c6c4e675934d0c58f1802.
Many thanks!