Quuxplusone / LLVMBugzillaTest

[AARCH64] vector zext hangs/crashes with NEON disabled #42689

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR43719
Status NEW
Importance P normal
Reported by Devin Hussey (husseydevin@gmail.com)
Reported on 2019-10-18 22:30:10 -0700
Last modified on 2019-10-18 22:53:11 -0700
Version trunk
Hardware Other All
CC arnaud.degrandmaison@arm.com, husseydevin@gmail.com, llvm-bugs@lists.llvm.org, smithp352@googlemail.com, Ties.Stuij@arm.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

define @zext_v2xi32(<2 x i32> %vec)
{
    %extended = zext <2 x i32> %vec to <2 x i64>
    ret <2 x i64> %extended
}

llc -march=aarch64 -mattr=-neon

I would expect something like this (if I am correct in that <2 x i32> is passed
as i32 *):

zext_v2xi32:
    ldp   w0, w1, [x0]
    ret

But instead, I get either a segfault (on Termux's 8.0.1 build) or an infinite
loop (9.0.0 on archlinuxarm, or trunk w/debug). On every other known
architecture, SIMD types still scalarize properly with SIMD disabled, provided
that LLVM intrinsics aren't used.
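
To spell out what "scalarize" means here, this is a rough C model of the lane-by-lane result I would expect. It is only a sketch of the semantics, not anything llc produces: the `v2u64` struct and the `zext_v2xi32_scalar` name are made up for illustration, and it says nothing about how the vector is actually passed in registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for <2 x i64>: two independent 64-bit lanes. */
typedef struct {
    uint64_t lane[2];
} v2u64;

/* Zero-extend each 32-bit lane on its own, with no SIMD involved.
 * This models what a scalarized zext <2 x i32> to <2 x i64> computes. */
static v2u64 zext_v2xi32_scalar(const uint32_t vec[2]) {
    v2u64 out;
    out.lane[0] = (uint64_t)vec[0]; /* upper 32 bits become zero */
    out.lane[1] = (uint64_t)vec[1];
    return out;
}
```

Each lane is handled with an ordinary integer widening, which is why two plain 32-bit register moves would be enough on AArch64 (writing a w register clears the upper half of the matching x register).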

With debug output enabled, llc prints this over and over:

Combining: t7: i64 = Constant<0>

Combining: t6: v2i64 = zero_extend t5
Creating new node: tNNNNN: v1i64 = extract_subvector t6, Constant:i64<0>
Creating new node: tNNNNN+1: v1i64 = extract_subvector t6, Constant:i64<1>
Quuxplusone commented 5 years ago
Typo: in case it wasn't clear, the code should be this:

define <2 x i64> @zext_v2xi32(<2 x i32> %vec)
{
    %extended = zext <2 x i32> %vec to <2 x i64>
    ret <2 x i64> %extended
}
Quuxplusone commented 5 years ago
And the assembly output should probably be this:

zext_v2xi32:
     mov     w0, w0
     mov     w1, w1
     ret