llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.64k stars 11.37k forks source link

Assertion failed TwoAddressInstructionPass.cpp:1492: `i == DstIdx || !MI->getOperand(i).isReg() || MI->getOperand(i).getReg() != RegA'` #54957

Open JimWen opened 2 years ago

JimWen commented 2 years ago

Error

Build clang with g++ and then build clickhouse with clang 1.14.1, but it always crash here

/usr/local/bin/clang -DSTD_EXCEPTION_HAS_STACK_TRACE=1 -I/home/wenzhou/clickhouse/ClickHouse/contrib/sentry-native/include -I/home/wenzhou/clickhouse/ClickHouse/contrib/aws-checksums/include -I/home/wenzhou/clickhouse/ClickHouse/base/glibc-compatibility/memcpy -isystem /home/wenzhou/clickhouse/ClickHouse/contrib/libcxx/include -isystem /home/wenzhou/clickhouse/ClickHouse/contrib/libcxxabi/include -isystem /home/wenzhou/clickhouse/ClickHouse/contrib/libunwind/include -isystem /home/wenzhou/clickhouse/ClickHouse/contrib/libc-headers/x86_64-linux-gnu -isystem /home/wenzhou/clickhouse/ClickHouse/contrib/libc-headers -fdiagnostics-color=always  -gdwarf-aranges -pipe -mssse3 -msse4.1 -msse4.2 -mpclmul -mpopcnt -fasynchronous-unwind-tables -falign-functions=32  -Wall -Wno-unused-command-line-argument  -fdiagnostics-absolute-paths -fexperimental-new-pass-manager -Werror -w -O2 -g -DNDEBUG -O3  -fno-pie   -D OS_LINUX -g0 -std=gnu99 -MD -MT contrib/aws-s3-cmake/CMakeFiles/aws_s3_checksums.dir/__/aws-checksums/source/intel/crc32c_sse42_asm.c.o -MF contrib/aws-s3-cmake/CMakeFiles/aws_s3_checksums.dir/__/aws-checksums/source/intel/crc32c_sse42_asm.c.o.d -o contrib/aws-s3-cmake/CMakeFiles/aws_s3_checksums.dir/__/aws-checksums/source/intel/crc32c_sse42_asm.c.o -c /home/wenzhou/clickhouse/ClickHouse/contrib/aws-checksums/source/intel/crc32c_sse42_asm.c
clang: /home/wenzhou/llvm-project-llvmorg-11.1.0/llvm/lib/CodeGen/TwoAddressInstructionPass.cpp:1394: void {anonymous}::TwoAddressInstructionPass::processTiedPairs(llvm::MachineInstr*, {anonymous}::TwoAddressInstructionPass::TiedPairList&, unsigned int&): Assertion `i == DstIdx || !MI->getOperand(i).isReg() || MI->getOperand(i).getReg() != RegA' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.

More info , please refer this issue https://github.com/ClickHouse/ClickHouse/issues/36347. Do i get something wrong or it's just a bug of llvm.

llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-codegen

JimWen commented 2 years ago

crc32c_sse42_asm.zip

add some related report file

fhahn commented 2 years ago

Looks like the assert is triggered by some inline assembly. Reduced reproducer with llc: https://llvm.godbolt.org/z/GdqxEaMnc

define i32 @aws_checksums_crc32c_hw() {
while.cond.preheader:
  br label %while.body

while.body:                                       ; preds = %while.body, %while.cond.preheader
  %crc.0158 = phi i32 [ %0, %while.body ], [ 0, %while.cond.preheader ]
  %0 = tail call i32 asm "loop_small_${:uid}: CRC32B ($2), $1", "={cx},{cx},r,0,~{dirflag},~{fpsr},~{flags}"(i32 %crc.0158, i8* null, i32 0)
  br label %while.body
}
JimWen commented 2 years ago

same with these issue https://github.com/llvm/llvm-project/issues/40330 https://github.com/llvm/llvm-project/issues/39643

cause by the asm in (https://github.com/awslabs/aws-checksums) asm("crc32b (%[in]), %[crc]" : "+c"(crc) : [crc] "c"(crc), [in] "r"(input));

c code to reproduce as following compile with clang -S crash.c -o - trigger this bug and compile with gcc -O2 -S crash.c -o - ok

#include <stdio.h>

int main() {
return -1;
}

unsigned f(const unsigned char *input)
{
unsigned crc = ~0;
asm("crc32b (%[in]), %[crc]" : "+c"(crc) : [crc] "c"(crc), [in] "r"(input));
return ~crc;
}

as to project https://github.com/awslabs/aws-checksums comile with cmake .. trriger this bug and compile with cmake .. -DCMAKE_BUILD_TYPE:STRING=Debug is ok

the asm inside does not match SSA form, is this the main cause ?

fhahn commented 2 years ago

the asm inside does not match SSA form, is this the main cause ?

It looks X86 inline assembly related. Given that GCC accepts the asm, there's probably a problem with how the asm is mapped in the backend.

cc'ing some people who may be more familiar @nickdesaulniers @topperc @phoebewang

DimitryAndric commented 2 years ago

Yes, this is a very old one, and has never been solved. :(

JimWen commented 2 years ago

Yes, this is a very old one, and has never been solved. :(

well, i recompile clang with the assert here turned off and compile crash.c with clang vs gcc, the asm compiled compare as followings

clang -O2 -S crash.c -o -

    .cfi_startproc
# %bb.0:                                # %entry
    movl    $-1, %ecx
    movl    $-1, %ecx
    #APP
    crc32b  (%rdi), %ecx
    #NO_APP
    movl    %ecx, %eax
    notl    %eax
    retq
.Lfunc_end1:
    .size   f, .Lfunc_end1-f
    .cfi_endproc

gcc -O2 -S crash.c -o -

.LFB12:
    .cfi_startproc
    movl    $-1, %ecx
#APP
# 10 "crash.c" 1
    crc32b (%rdi), %ecx
# 0 "" 2
#NO_APP
    movl    %ecx, %eax
    notl    %eax
    ret
    .cfi_endproc
.LFE12:
    .size   f, .-f

so the only diff is clang do movl $-1, %ecx twice may be this is the main cause with this assert?

DimitryAndric commented 2 years ago

Yes, this is precisely the difference I reported in https://github.com/llvm/llvm-project/issues/40330#issuecomment-981010129, for which the explanation is yet unknown. That said, there is no harm in doing the load twice, it is just wasteful.

JimWen commented 2 years ago

i see so it seems just acceptable for the situation here(compiling clickhouse), it‘s ok to just turn off this assert may be it's wrong in some other situation

phoebewang commented 2 years ago

FWIW, gcc fails on O0. https://godbolt.org/z/ePPxvs1T3

topperc commented 2 years ago

The inline assembly lists ecx as a source twice. Maybe gcc recognizes this and fixes it. If you change the first constraint from "+c" to "=c" it doesn't fail on clang.

fhahn commented 2 years ago

The inline assembly lists ecx as a source twice. Maybe gcc recognizes this and fixes it. If you change the first constraint from "+c" to "=c" it doesn't fail on clang.

Should Clang reject the inline assembly and present a proper error message?

DimitryAndric commented 2 years ago

@fhahn anything is better than asserting or silently producing possibly bad code. :)

fhahn commented 2 years ago

I'll mark this as a X86 backend problem for now, as I think the backend has to reject the inline assembly.

llvmbot commented 2 years ago

@llvm/issue-subscribers-backend-x86