llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

InstCombine assert with vector shift #4198

Closed llvmbot closed 15 years ago

llvmbot commented 15 years ago
Bugzilla Link 3826
Resolution FIXED
Resolved on Mar 18, 2009 18:45
Version unspecified
OS Windows NT
Reporter LLVM Bugzilla Contributor
CC @lattner

Extended Description

The IR code below asserts in ValueTracking.cpp:566 (the first line of llvm::ComputeNumSignBits) when using an instruction combining optimization pass:

define internal void @0(<4 x i16>*, <4 x i16>*) {
  %3 = alloca <4 x i16>*                                 ; <<4 x i16>**> [#uses=2]
  %4 = alloca <4 x i16>*                                 ; <<4 x i16>**> [#uses=2]
  store <4 x i16>* %0, <4 x i16>** %4
  store <4 x i16>* %1, <4 x i16>** %3
  %5 = load <4 x i16>** %3                               ; <<4 x i16>*> [#uses=1]
  %6 = load <4 x i16>* %5, align 1                       ; <<4 x i16>> [#uses=1]
  %7 = lshr <4 x i16> %6, <i16 5, i16 5, i16 5, i16 5>   ; <<4 x i16>> [#uses=1]
  %8 = load <4 x i16>** %4                               ; <<4 x i16>*> [#uses=1]
  store <4 x i16> %7, <4 x i16>* %8, align 1
  ret void
}

(Note: the rendering stripped paired asterisks from the IR; the pointer types above are reconstructed so the snippet type-checks.)

The assert is about a cast to an incompatible type; as far as I can tell the function doesn't expect a vector as the second argument of lshr.

Furthermore, this generates very inefficient scalar code with four x86 shr instructions, while it's supposed to be a perfect match for the MMX psrlw instruction. I suspect it doesn't work for any MMX/SSE vector shift.

lattner commented 15 years ago

Sure, but please ask on llvmdev, thanks!

llvmbot commented 15 years ago

Great, thanks for fixing this!

I'd love to try adding support for matching MMX and SSE instructions (I'm actually quite experienced with them). I'm just not very familiar with LLVM's code, but since you say it's straightforward, I'm eager to learn more about it. Any advice on how to proceed? Thanks again.

lattner commented 15 years ago

Crash fixed here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090316/075388.html

Please include detailed information on how to reproduce the problem when you file a bug (e.g. saying to run instcombine on it). I don't think that the X86 backend has full support for vector shift matching. This is "documented" in lib/Target/X86/README-UNIMPLEMENTED.txt. If you're interested, it should be straightforward to add.

llvmbot commented 15 years ago

Also, the output looks like this (when using lshr and a bunch of standard optimizations):

sub       esp, 44h
movq      mm0, mmword ptr ds:[3BC0010h]
movq      mmword ptr [esp], mm0
mov       eax, dword ptr [esp+48h]
movq      mm1, mmword ptr [eax]
movq      mmword ptr [esp+8], mm1
movq      mmword ptr [esp+10h], mm0
movq      mmword ptr [esp+18h], mm1
movq      mmword ptr [esp+20h], mm0
movq      mmword ptr [esp+28h], mm1
movq      mmword ptr [esp+30h], mm0
movq      mmword ptr [esp+38h], mm1
mov       cl, byte ptr [esp+6]
mov       dx, word ptr [esp+0Eh]
shr       dx, cl
movd      mm0, edx
mov       cl, byte ptr [esp+12h]
mov       dx, word ptr [esp+1Ah]
shr       dx, cl
movd      mm1, edx
punpcklwd mm1, mm0
mov       cl, byte ptr [esp+24h]
mov       dx, word ptr [esp+2Ch]
shr       dx, cl
movd      mm0, edx
mov       cl, byte ptr [esp+30h]
mov       dx, word ptr [esp+38h]
shr       dx, cl
movd      mm2, edx
punpcklwd mm2, mm0
punpcklwd mm2, mm1
movq      mmword ptr [eax], mm2
add       esp, 44h
ret

Instead of something more like this:

mov   eax, [esp+4]
movq  mm0, [eax]
psrlw mm0, 5
movq  [eax], mm0
ret

llvmbot commented 15 years ago

Attachment: Fibonacci

Sorry, that should be ashr instead of lshr. I've also attached a replacement for fibonacci.cpp that should reproduce the assert.

lattner commented 15 years ago

How do I reproduce this? Both of these work fine:

$ llvm-as < t.ll | opt -instcombine
$ llvm-as < t.ll | opt -std-compile-opts