Quuxplusone / LLVMBugzillaTest

unsigned int getFPReg(const llvm::MachineOperand&): Assertion `Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!"' fa #10766

Open Quuxplusone opened 13 years ago

Quuxplusone commented 13 years ago
Bugzilla Link PR10498
Status CONFIRMED
Importance P normal
Reported by Nadav Rotem (nadav.rotem@me.com)
Reported on 2011-07-26 06:05:35 -0700
Last modified on 2020-06-26 13:37:16 -0700
Version trunk
Hardware PC Windows NT
CC arnd@linaro.org, baldrick@free.fr, craig.topper@gmail.com, efriedma@quicinc.com, francisvm@yahoo.com, george.burgess.iv@gmail.com, ilia.taraban@intel.com, llozano@chromium.org, llvm-bugs@lists.llvm.org, manojgupta@google.com, ndesaulniers@google.com, srhines@google.com, stoklund@2pi.dk, tstellar@redhat.com
Fixed by commit(s)
Attachments x86-floating-point-crash.ll (2563 bytes, text/plain)
Blocks
Blocked by
See also PR30426, PR41668
llc: X86FloatingPoint.cpp:316: unsigned int getFPReg(const llvm::MachineOperand&): Assertion `Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!"' failed.
0  llc             0x000000000153ffea
1  llc             0x0000000001540578
2  libpthread.so.0 0x00002aaaaabd4d60
3  libc.so.6       0x00002aaaab274f45 gsignal + 53
4  libc.so.6       0x00002aaaab276340 abort + 272
5  libc.so.6       0x00002aaaab26e486 __assert_fail + 246
6  llc             0x0000000000e542c8
7  llc             0x0000000000e5606a (anonymous namespace)::FPS::handleSpecialFP(llvm::ilist_iterator<llvm::MachineInstr>&) + 1412
8  llc             0x0000000000e57f27 (anonymous namespace)::FPS::processBasicBlock(llvm::MachineFunction&, llvm::MachineBasicBlock&) + 885
9  llc             0x0000000000e58459 (anonymous namespace)::FPS::runOnMachineFunction(llvm::MachineFunction&) + 341
10 llc             0x00000000011449ad llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 85
11 llc             0x000000000147a2bb llvm::FPPassManager::runOnFunction(llvm::Function&) + 371
12 llc             0x000000000147a4b3 llvm::FPPassManager::runOnModule(llvm::Module&) + 81
13 llc             0x0000000001479f67 llvm::MPPassManager::runOnModule(llvm::Module&) + 381
14 llc             0x000000000147b71c llvm::PassManagerImpl::run(llvm::Module&) + 116
15 llc             0x000000000147b77f llvm::PassManager::run(llvm::Module&) + 33
16 llc             0x0000000000ab642d main + 2403
17 libc.so.6       0x00002aaaab262304 __libc_start_main + 244
18 llc             0x0000000000ab4b79
Stack dump:
0.  Program arguments: ../llc temp.ll -march=x86-64 -mattr=-sse2,-sse41 -o /dev/null
1.  Running pass 'Function Pass Manager' on module 'temp.ll'.
2.  Running pass 'X86 FP Stackifier' on function '@autogen_9448_500'
; ModuleID = 'bugpoint-reduced-simplified.bc'
target triple = "x86_64-unknown-linux-gnu"

define void @autogen_9448_500(i8*, i32*, i64*, i32, i8, i64) {
BB:
  %A4 = alloca <8 x i64>
  %A3 = alloca <8 x i8>
  %A2 = alloca <8 x i8>
  %A1 = alloca <8 x i32>
  %A = alloca <8 x double>
  %L = load i8* %0
  store i8 -89, i8* %0
  %E = extractelement <8 x i32> undef, i32 7
  %E5 = extractelement <32 x float> undef, i32 12
  %Shuff = shufflevector <8 x double> undef, <8 x double> undef, <8 x i32> <i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 undef, i32 undef>
  %Shuff6 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 4, i32 6, i32 8, i32 10, i32 12, i32 undef, i32 0, i32 undef>
  %Shuff7 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 6, i32 8, i32 10, i32 12, i32 14, i32 0, i32 2, i32 4>
  %I = insertelement <8 x double> %Shuff, double 0x3EC3137F3F602019, i32 0
  %B = mul i8 %L, -1
  %FC = fptosi double 0x3EC3137F3F602019 to i16
  %S = select i1 true, i16 %FC, i16 %FC
  %S8 = fcmp olt float %E5, 0x3EC6F41020000000
  br label %CF42

CF42:                                             ; preds = %BB
  %L9 = load i8* %0
  store i8 %L, i8* %0
  %E10 = extractelement <8 x double> %Shuff, i32 6
  %E11 = extractelement <8 x double> %Shuff, i32 0
  %Shuff12 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 3, i32 5, i32 undef, i32 9, i32 11, i32 undef, i32 15, i32 1>
  %Shuff13 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 undef, i32 undef, i32 9, i32 11, i32 13, i32 15, i32 undef, i32 3>
  %Shuff14 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff13, <8 x i32> <i32 7, i32 undef, i32 11, i32 13, i32 15, i32 1, i32 undef, i32 5>
  %I15 = insertelement <8 x double> %Shuff12, double 0x3ED4B999595A3E38, i32 1
  %B16 = frem <8 x double> %Shuff13, %Shuff
  %BC = bitcast float 0x3ECCE7D580000000 to i32
  %S17 = select i1 true, i64 -1, i64 %5
  %S18 = icmp ule i64 508019, %S17
  br label %CF41

CF41:                                             ; preds = %CF41, %CF42
  %L19 = load i8* %0
  store i8 -1, i8* %0
  %E20 = extractelement <8 x double> %Shuff, i32 1
  %E21 = extractelement <8 x double> %Shuff, i32 3
  %Shuff22 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 undef, i32 undef, i32 2, i32 4, i32 6, i32 8, i32 10, i32 undef>
  %Shuff23 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 0, i32 2, i32 4, i32 undef, i32 8, i32 undef, i32 undef, i32 14>
  %Shuff24 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff12, <8 x i32> <i32 2, i32 4, i32 6, i32 undef, i32 10, i32 12, i32 undef, i32 0>
  %I25 = insertelement <8 x double> %Shuff, double %E10, i32 4
  %B26 = fdiv <8 x double> %B16, %Shuff
  %FC27 = sitofp i32 338935 to float
  %S28 = select i1 false, i32* %1, i32* %1
  %S29 = icmp slt i8 -1, %B
  br i1 %S29, label %CF41, label %CF43

CF43:                                             ; preds = %CF41
  %L30 = load i32* %S28
  store i32 413419, i32* %S28
  %E31 = extractelement <8 x double> %Shuff6, i32 2
  %E32 = extractelement <8 x double> %Shuff, i32 4
  %Shuff33 = shufflevector <8 x double> %Shuff12, <8 x double> %Shuff, <8 x i32> <i32 7, i32 9, i32 11, i32 undef, i32 15, i32 1, i32 3, i32 5>
  %Shuff34 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 9, i32 undef, i32 13, i32 undef, i32 1, i32 3, i32 5, i32 7>
  %Shuff35 = shufflevector <8 x double> %Shuff, <8 x double> %Shuff, <8 x i32> <i32 11, i32 13, i32 15, i32 1, i32 3, i32 5, i32 7, i32 undef>
  %I36 = insertelement <8 x double> %Shuff, double %E10, i32 5
  %B37 = fadd <8 x double> %Shuff14, %Shuff
  %FC38 = uitofp i8 -89 to double
  %S39 = select i1 true, <8 x double>* %A, <8 x double>* %A
  %S40 = icmp slt i32 413419, 0
  br label %CF

CF:                                               ; preds = %CF43
  store <8 x double> %Shuff12, <8 x double>* %A
  store i32 %3, i32* %S28
  store i32 %3, i32* %S28
  store <8 x double> %Shuff, <8 x double>* %A
  store <8 x double> %Shuff33, <8 x double>* %S39
  ret void
}
Quuxplusone commented 13 years ago

These are great bugs, but please put the test-case in an attachment so it won't be mangled by Bugzilla.

Quuxplusone commented 13 years ago
Okay.  I will also clean up my random test generator and add it to llvm/util.

(In reply to comment #1)
> These are great bugs, but please put the test-case in an attachment so it
> won't be mangled by Bugzilla.
Quuxplusone commented 13 years ago
I am not sure if this test case is valid. It is trying to call a function that
takes arguments in %xmm0 and %xmm1, but sse2 is disabled. We keep our floats in
x87 registers, and try to set up the call like this:

        %XMM0<def> = COPY %vreg2; RFP64:%vreg2
        %XMM1<def> = COPY %vreg3; RFP64:%vreg3
        CALL64pcrel32 <es:fmod>, %XMM0, %XMM1, %RCX<imp-def,dead>,

Eli, what do you think? Should this work with -mattr=-sse2,-sse41?
Quuxplusone commented 13 years ago
The original IR doesn't contain any call instructions, it only makes use of
generic LLVM IR operations.  So I think the testcase is valid.
Quuxplusone commented 13 years ago
(In reply to comment #4)
> The original IR doesn't contain any call instructions, it only makes use of
> generic LLVM IR operations.  So I think the testcase is valid.

Might be "valid", but we have no sane way to generate frem on x86-64 with sse
forced off.  Granted, there is a missing report_fatal_error call in
X86TargetLowering::LowerCall which would make that much more clear.
Quuxplusone commented 7 years ago

Maybe this can help: https://reviews.llvm.org/D27522.

Quuxplusone commented 6 years ago

Attached x86-floating-point-crash.ll (2563 bytes, text/plain): Reduced test case

Quuxplusone commented 5 years ago

"+sse,-sse2"? How did the function get a feature list like that? If clang is generating that, I'd consider it a bug in clang. Granted, we should still print a reasonable error from the backend.

Quuxplusone commented 5 years ago
(In reply to Eli Friedman from comment #8)
> "+sse,-sse2"?  How did the function get a feature list like that?  If clang
> is generating that, I'd consider it a bug in clang.  Granted, we should
> still print a reasonable error from the backend.

It's the -mno-sse2 flag.  Here is a simple reproducer:

echo "double add(double a, double b) {return a + b;}" | clang -mno-sse2 -c -x c -
Quuxplusone commented 5 years ago

The backend does raise an error in X86TargetLowering::LowerReturn() by calling errorUnsupported(DAG, dl, "SSE2 register return with SSE2 disabled");

But then the backend asserts in a later pass before clang can handle the error.
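
The ordering problem described above can be illustrated with a toy model. This is only a sketch: `State`, `lowerReturn`, `stackifier`, `compile`, and the `StopAfterError` switch are hypothetical names invented for illustration, not LLVM APIs, and stopping after the first recorded error is just one conceivable policy, not something the thread says LLVM does.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the state shared between backend passes.
struct State {
  bool ValidCopies = true;           // false once a bogus COPY has been emitted
  std::vector<std::string> Errors;   // diagnostics recorded but not yet reported
};

// Models lowering: records the diagnostic quoted in the thread, but still
// leaves behind machine code that a later pass cannot handle.
void lowerReturn(State &S, bool HasSSE2) {
  if (!HasSSE2) {
    S.Errors.push_back("SSE2 register return with SSE2 disabled");
    S.ValidCopies = false;
  }
}

// Models the later pass: returns false where the real pass would assert.
bool stackifier(const State &S) { return S.ValidCopies; }

enum class Outcome { Ok, CleanError, Crash };

// StopAfterError is an assumed policy switch used only in this sketch.
Outcome compile(bool HasSSE2, bool StopAfterError) {
  State S;
  lowerReturn(S, HasSSE2);
  if (StopAfterError && !S.Errors.empty())
    return Outcome::CleanError;      // the user sees only the diagnostic
  if (!stackifier(S))
    return Outcome::Crash;           // assertion fires before the error surfaces
  return S.Errors.empty() ? Outcome::Ok : Outcome::CleanError;
}
```

In this model, `compile(false, false)` corresponds to the behaviour reported here: the diagnostic is recorded, but the crash wins.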

Quuxplusone commented 5 years ago
Hello friends!  I hit precisely this assertion today while debugging a crash when
compiling the Linux kernel w/ Clang.  The kernel disables SSE (and all newer
x86 ISA extensions) as it would otherwise have to save/restore FP registers.
It does allow limited use of -msse for a few drivers, but Clang crashes
generating code for them.

$ cat foo.i
double a() { int b = a(); }
$ clang -mno-sse2 foo.i
foo.i:1:27: warning: control reaches end of non-void function [-Wreturn-type]
double a() { int b = a(); }
                          ^
foo.i:1:8: error: SSE2 register return with SSE2 disabled
double a() { int b = a(); }
       ^
clang-9: ../lib/Target/X86/X86FloatingPoint.cpp:317: unsigned int getFPReg(const llvm::MachineOperand &): Assertion `Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!"' failed.
Stack dump:
0.  Program arguments: /android0/llvm-project/llvm/build/bin/clang-9 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -mrelax-all -disable-free -main-file-name foo.i -mrelocation-model static -mthread-model posix -mframe-pointer=all -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -target-feature -sse2 -dwarf-column-info -debugger-tuning=gdb -resource-dir /android0/llvm-project/llvm/build/lib/clang/10.0.0 -fdebug-compilation-dir /android0/linux-next -ferror-limit 19 -fmessage-length 0 -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -faddrsig -o /tmp/foo-ecbd3a.o -x cpp-output foo.i
1.  <eof> parser at end of file
2.  Code generation
3.  Running pass 'Function Pass Manager' on module 'foo.i'.
4.  Running pass 'X86 FP Stackifier' on function '@a'
 #0 0x0000000006341379 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /android0/llvm-project/llvm/build/../lib/Support/Unix/Signals.inc:532:11
 #1 0x0000000006341529 PrintStackTraceSignalHandler(void*) /android0/llvm-project/llvm/build/../lib/Support/Unix/Signals.inc:593:1
 #2 0x000000000633fdf6 llvm::sys::RunSignalHandlers() /android0/llvm-project/llvm/build/../lib/Support/Signals.cpp:67:5
 #3 0x0000000006341c8b SignalHandler(int) /android0/llvm-project/llvm/build/../lib/Support/Unix/Signals.inc:384:1
 #4 0x00007f31035a33a0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x123a0)
 #5 0x00007f3102631cfb raise (/lib/x86_64-linux-gnu/libc.so.6+0x36cfb)
 #6 0x00007f310261c8ad abort (/lib/x86_64-linux-gnu/libc.so.6+0x218ad)
 #7 0x00007f310261c77f (/lib/x86_64-linux-gnu/libc.so.6+0x2177f)
 #8 0x00007f310262a542 (/lib/x86_64-linux-gnu/libc.so.6+0x2f542)
 #9 0x0000000004df3f23 getFPReg(llvm::MachineOperand const&) /android0/llvm-project/llvm/build/../lib/Target/X86/X86FloatingPoint.cpp:318:10
#10 0x0000000004df5937 (anonymous namespace)::FPS::handleSpecialFP(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&) /android0/llvm-project/llvm/build/../lib/Target/X86/X86FloatingPoint.cpp:1460:14
#11 0x0000000004df33c3 (anonymous namespace)::FPS::processBasicBlock(llvm::MachineFunction&, llvm::MachineBasicBlock&) /android0/llvm-project/llvm/build/../lib/Target/X86/X86FloatingPoint.cpp:461:49
#12 0x0000000004df292a (anonymous namespace)::FPS::runOnMachineFunction(llvm::MachineFunction&) /android0/llvm-project/llvm/build/../lib/Target/X86/X86FloatingPoint.cpp:374:16
#13 0x000000000544cbb7 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /android0/llvm-project/llvm/build/../lib/CodeGen/MachineFunctionPass.cpp:73:8
#14 0x0000000005a14559 llvm::FPPassManager::runOnFunction(llvm::Function&) /android0/llvm-project/llvm/build/../lib/IR/LegacyPassManager.cpp:1648:23
#15 0x0000000005a1499f llvm::FPPassManager::runOnModule(llvm::Module&) /android0/llvm-project/llvm/build/../lib/IR/LegacyPassManager.cpp:1685:16
#16 0x0000000005a15118 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /android0/llvm-project/llvm/build/../lib/IR/LegacyPassManager.cpp:1750:23
#17 0x0000000005a14c45 llvm::legacy::PassManagerImpl::run(llvm::Module&) /android0/llvm-project/llvm/build/../lib/IR/LegacyPassManager.cpp:1863:16
#18 0x0000000005a156a1 llvm::legacy::PassManager::run(llvm::Module&) /android0/llvm-project/llvm/build/../lib/IR/LegacyPassManager.cpp:1894:3
#19 0x000000000669c9ca (anonymous namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) /android0/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:914:3
#20 0x0000000006698c3c clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) /android0/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1533:5
#21 0x0000000007294788 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /android0/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:307:7
#22 0x000000000931765c clang::ParseAST(clang::Sema&, bool, bool) /android0/llvm-project/clang/lib/Parse/ParseAST.cpp:178:12
#23 0x00000000070ed052 clang::ASTFrontendAction::ExecuteAction() /android0/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1044:1
#24 0x0000000007291c9c clang::CodeGenAction::ExecuteAction() /android0/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:1089:1
#25 0x00000000070eca01 clang::FrontendAction::Execute() /android0/llvm-project/clang/lib/Frontend/FrontendAction.cpp:939:7
#26 0x000000000701e37a clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /android0/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:957:23
#27 0x000000000727c19f clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /android0/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:290:8
#28 0x00000000043a668e cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /android0/llvm-project/clang/tools/driver/cc1_main.cpp:250:13
#29 0x000000000439a48f ExecuteCC1Tool(llvm::ArrayRef<char const*>, llvm::StringRef) /android0/llvm-project/clang/tools/driver/driver.cpp:309:5
#30 0x0000000004399832 main /android0/llvm-project/clang/tools/driver/driver.cpp:382:5
#31 0x00007f310261e52b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2352b)
#32 0x000000000439902a _start (/android0/llvm-project/llvm/build/bin/clang-9+0x439902a)
clang-9: error: unable to execute command: Aborted
clang-9: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 5806022904bc447525a02cff796c9bbbd02b0444)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /android0/llvm-project/llvm/build/bin
clang-9: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang-9: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.
Quuxplusone commented 5 years ago

Some observations from a debugger:

In llvm/lib/Target/X86/X86FloatingPoint.cpp, FPS::processBasicBlock() calls FPS::handleSpecialFP(), which calls getFPReg() for the MachineInstr:

renamable $fp0 = COPY killed $xmm0

AFAICT, this instruction already exists in the earliest dump when the LLVM bitcode is run through llc -print-machineinstrs.

FPS::processBasicBlock() has a bunch of special cases that set

FPInstClass = X86II::SpecialFP;

maybe one of those is wrong?

Specifically,

if (MI.isCopy() && isFPCopy(MI))
  FPInstClass = X86II::SpecialFP;

is true.

Maybe the assertion everyone is hitting should be relaxed for all FP regs added via ISA extensions, or the COPY $xmm0 should never be generated in the first place?

Quuxplusone commented 5 years ago

So llvm/lib/Target/X86/X86FloatingPoint.cpp#FPS::runOnMachineFunction() is a curious case. It has a check that FP regs are used:

  // We only need to run this pass if there are any FP registers used in this
  // function.  If it is all integer, there is nothing for us to do!
  bool FPIsUsed = false;

  static_assert(X86::FP6 == X86::FP0+6, "Register enums aren't sorted right!");
  const MachineRegisterInfo &MRI = MF.getRegInfo();
  for (unsigned i = 0; i <= 6; ++i)
    if (!MRI.reg_nodbg_empty(X86::FP0 + i)) {
      FPIsUsed = true;
      break;
    }

  // Early exit.
  if (!FPIsUsed) return false;

For the first loop iteration X86::FP0 is "used". From the MIR dump, I observe "Virtual Register Rewriter" rewriting:

64B %1:rfp64 = COPY killed $xmm0

into

64B renamable $fp0 = COPY killed $xmm0

which later leads to the assertion failure.

Not that I know anything about MIR, but should the VRR rewrite the rfp64 register into another $xmm register (based on the instruction's operand), or should an rfp64 virtual register ever be emitted for a COPY $xmm0 by whatever generates the MIR? Also, it seems odd to have an explicit copy of a physical register so early on when the MIR seems to be otherwise composed of nothing but virtual registers.
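
To make the sequence above concrete, here is a small self-contained model of it. It is a sketch under assumptions, not the pass itself: the `Reg` enum values, the `Copy` struct, and the helpers `isX87`, `fpIsUsed`, and `wouldAssert` are hypothetical, and it assumes that a COPY is treated as an FP copy when either operand is an x87 register (consistent with isFPCopy(MI) being true for the COPY from $xmm0 above). Only getFPReg's range check and the static_assert are taken from the snippets quoted in this thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the X86 register enum; only the ordering of
// FP0..FP6 matters here, the numeric values are made up.
enum Reg : unsigned { FP0 = 0, FP1, FP2, FP3, FP4, FP5, FP6, XMM0, XMM1 };

static_assert(FP6 == FP0 + 6, "Register enums aren't sorted right!");

// A COPY instruction reduced to its two operands.
struct Copy {
  Reg Dst;
  Reg Src;
};

bool isX87(Reg R) { return R >= FP0 && R <= FP6; }

// Same range check as the assertion quoted in the bug title.
unsigned getFPReg(Reg R) {
  assert(isX87(R) && "Expected FP register!");
  return R - FP0;
}

// Models the early-exit scan: the pass runs if any FP0..FP6 is referenced.
bool fpIsUsed(const std::vector<Copy> &MF) {
  for (const Copy &MI : MF)
    if (isX87(MI.Dst) || isX87(MI.Src))
      return true;
  return false;
}

// Returns true if the pass would run and then reach getFPReg() with a
// non-x87 operand. Assumes (see lead-in) that one x87 operand is enough
// for a COPY to be handled as an FP copy.
bool wouldAssert(const std::vector<Copy> &MF) {
  if (!fpIsUsed(MF))
    return false;
  for (const Copy &MI : MF)
    if ((isX87(MI.Dst) || isX87(MI.Src)) &&
        !(isX87(MI.Dst) && isX87(MI.Src)))
      return true;
  return false;
}
```

With this model, a function containing only `{FP0, XMM0}` passes the early-exit scan (FP0 is "used") and then fails the range check on the XMM0 operand, which matches the observed crash.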

Quuxplusone commented 5 years ago

A copy from $xmm0 to an x87 register should never be generated. There are no instructions that can directly make the copy. The explicit reference to the xmm0 register is to match the ABI for the return from the call. The intention normally is that the xmm0 register will be copied to a virtual register of FR64 class, which will give freedom back to the register allocator. Since SSE2 is disabled, the register class assigned to f64 is RFP64 instead of FR64, so we generate a bogus call. I have a patch to stop the assertion, but it will generate an error instead. Removing the error is going to be harder, and I'm not sure exactly how to do it.
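
The mismatch described in this comment can be sketched as a tiny decision table. All names below (`RegClass`, `PhysReg`, `classForF64`, `abiReturnRegForF64`, `copyIsLegal`, `lowerCallResult`) are hypothetical and exist only for this illustration; the real logic lives in the X86 lowering code and is more involved. The sketch assumes the x86-64 calling convention returns a double in XMM0 regardless of which features are enabled, and it reuses the diagnostic text quoted earlier in the thread.

```cpp
#include <string>

enum class RegClass { FR64, RFP64 }; // SSE scalar class vs. x87 class
enum class PhysReg { XMM0, FP0 };

// Per the comment above: f64 is assigned FR64 with SSE2, RFP64 without it.
RegClass classForF64(bool HasSSE2) {
  return HasSSE2 ? RegClass::FR64 : RegClass::RFP64;
}

// Assumption: the ABI fixes the return register for double to XMM0,
// independent of the enabled target features.
PhysReg abiReturnRegForF64() { return PhysReg::XMM0; }

// A copy is only expressible when source and destination live in the same
// register file; there is no direct xmm -> x87 move.
bool copyIsLegal(PhysReg Src, RegClass Dst) {
  return (Src == PhysReg::XMM0 && Dst == RegClass::FR64) ||
         (Src == PhysReg::FP0 && Dst == RegClass::RFP64);
}

// Models copying a call result out of its ABI register.
std::string lowerCallResult(bool HasSSE2) {
  if (copyIsLegal(abiReturnRegForF64(), classForF64(HasSSE2)))
    return "ok";
  return "error: SSE2 register return with SSE2 disabled";
}
```

Here `lowerCallResult(false)` yields the error string, because XMM0 would have to be copied into an RFP64 register; emitting that copy anyway is what later trips the stackifier.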

Quuxplusone commented 5 years ago

Craig fixed up the compiler crash from my test case in comment #11 in r372197.

Here are a few more simple test cases that also lead to compiler crashes at -mno-sse2: https://godbolt.org/z/HUwLYg