llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[Backend][LLC][Matrix][Intrinsics] Do not know how to scalarize the result of this operator #44572

Open nicolasvasilache opened 4 years ago

nicolasvasilache commented 4 years ago
Bugzilla Link 45227
Version trunk
OS All
Attachments Repro for matrix intrinsics scalarize bug.
CC @fhahn,@yowl

Extended Description

clang scalarize.ll

warning: overriding the module target triple with x86_64-grtev4-linux-gnu [-Woverride-module]
ScalarizeVectorResult #0: t9: v1f32 = llvm.matrix.multiply TargetConstant:i64<171>, t4, t7, TargetConstant:i32<1>, TargetConstant:i32<1>, TargetConstant:i32<1>, <stdin>:5:10

fatal error: error in backend: Do not know how to scalarize the result of this operator!

clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version google3-trunk (378b1e60809df7cc72436093f160b7ed228dad5a)
Target: x86_64-grtev4-linux-gnu
Thread model: posix
fhahn commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#45229

fhahn commented 4 years ago

> I'm getting the same message with wasm-ld. Might it be the same issue, or shall I open a new ticket?

The error message is the same, but the failure occurs while legalizing a different operation, so it is unrelated to the matrix intrinsics. This bug report only concerns the lowering of @llvm.matrix.* calls.

Please file a new issue for the crash.
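To make the distinction above concrete, here is a small sketch that pulls the result type and opcode out of a `ScalarizeVectorResult` line so two reports can be compared. The helper name `parse_scalarize_diag` and the regular expression are hypothetical; the pattern is inferred only from the two log lines quoted in this thread and is not a documented LLVM output format.

```python
import re
from typing import NamedTuple, Optional


class ScalarizeDiag(NamedTuple):
    node: str         # SelectionDAG node name, e.g. "t9"
    result_type: str  # value type being scalarized, e.g. "v1f32"
    opcode: str       # operation the legalizer could not handle


# Assumed shape: "ScalarizeVectorResult #N: tM: TYPE = OPCODE operands..."
_DIAG_RE = re.compile(
    r"ScalarizeVectorResult #\d+: (?P<node>t\d+): (?P<type>\w+) = (?P<opcode>[\w.]+)"
)


def parse_scalarize_diag(line: str) -> Optional[ScalarizeDiag]:
    """Return the node, type and opcode from a diagnostic line, or None."""
    # Web mirrors sometimes insert zero-width spaces after '#'; drop them.
    cleaned = line.replace("\u200b", "")
    match = _DIAG_RE.search(cleaned)
    if match is None:
        return None
    return ScalarizeDiag(match.group("node"), match.group("type"), match.group("opcode"))


# The two diagnostics from this thread (operands shortened).
matrix_report = "ScalarizeVectorResult #0: t9: v1f32 = llvm.matrix.multiply TargetConstant:i64<171>, t4, t7"
wasm_report = "ScalarizeVectorResult #0: t168: v1i32 = rotl t166, t132, BitOperations.cs:366"

print(parse_scalarize_diag(matrix_report).opcode)  # llvm.matrix.multiply
print(parse_scalarize_diag(wasm_report).opcode)    # rotl
```

Both reports involve a one-element vector result, but the opcodes differ (`llvm.matrix.multiply` versus `rotl`), which matches the point that the wasm-ld crash is a separate problem.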

yowl commented 4 years ago

I tried a debug build from commit 23817cbd0b6549d6145e4d0dbc0162370184a21e (09/08/2020 10:51:19) and the problem did not occur. The Emscripten version was 2.0.1. If 2.0.1 predates this commit, then my instance of this error, at least, may already be fixed.

yowl commented 4 years ago

The llvm in the comment

yowl commented 4 years ago

I'm getting the same message with wasm-ld. Might it be the same issue, or shall I open a new ticket?

E:\GitHub\corert>"E:\GitHub\emsdk\upstream\emscripten\emcc.bat" "E:\GitHub\corert\tests\src\Simple\HelloWasm\obj\Debug\wasm\native\HelloWasm.o" -o "E:\GitHub\corert\tests\src\Simple\HelloWasm\bin\Debug\wasm\native\HelloWasm.html" -s WASM=1 -s ERROR_ON_UNDEFINED_SYMBOLS=0 -s DISABLE_EXCEPTION_CATCHING=0 -s WASM_MEM_MAX=200Mb --emrun "E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libPortableRuntime.a" "E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libbootstrappercpp.a" "E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libSystem.Private.CoreLib.Native.a" -g3 -s INITIAL_MEMORY=167772160 --js-library E:\GitHub\corert\tests\src\Simple\HelloWasm\dotnet_support.js --pre-js E:\GitHub\corert\tests\src\Simple\HelloWasm\Microsoft.JSInterop.js --post-js E:\GitHub\corert\tests\src\Simple\HelloWasm\HelloWasm.js -g

ScalarizeVectorResult #0: t168: v1i32 = rotl t166, t132, BitOperations.cs:366 @[ HashCode.cs:268 @[ HashCode.cs:158 ] ]

LLVM ERROR: Do not know how to scalarize the result of this operator!

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump:

  1. Program arguments: E:/GitHub/emsdk/upstream/bin\wasm-ld.exe -o C:\Users\SCOTTW~1\AppData\Local\Temp\emscripten_temp_2lw0yg80\HelloWasm.wasm E:\GitHub\corert\tests\src\Simple\HelloWasm\obj\Debug\wasm\native\HelloWasm.o E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libPortableRuntime.a E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libbootstrappercpp.a -LE:\GitHub\emsdk\upstream\emscripten\system\local\lib E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libSystem.Private.CoreLib.Native.a -LE:\GitHub\emsdk\upstream\emscripten\system\lib -LE:\GitHub\emsdk\upstream\emscripten\cache\wasm E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libcompiler_rt.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc++.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc++abi.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libdlmalloc.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libpthread_stub.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc_rt_wasm.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libsockets.a -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --allow-undefined --import-memory --import-table --export main --export malloc --export free --export stackSave --export stackRestore --export stackAlloc --export data_end --export wasm_call_ctors --export fflush --export errno_location --export _ZSt18uncaught_exceptionv --export cxa_find_matching_catch --export __cxa_is_pointer_type --export __cxa_can_catch --export setThrew --export memalign --export memset --export emscripten_main_thread_process_queued_calls -z stack-size=5242880 --initial-memory=167772160 --no-entry --max-memory=167772160 --global-base=1024
  2. Running pass 'Function Pass Manager' on module 'ld-temp.o'.
  3. Running pass 'WebAssembly Instruction Selection' on function '@"S_P_CoreLib_System_HashCode__Combine_3"'

    0 0x00007ff61b761036 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x31036)
    1 0x00007ff9976d1861 (C:\WINDOWS\System32\ucrtbase.dll+0x71861)
    2 0x00007ff9976d2831 (C:\WINDOWS\System32\ucrtbase.dll+0x72831)
    3 0x00007ff61b7639c2 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x339c2)
    4 0x00007ff61b7637c7 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x337c7)
    5 0x00007ff61c3b6262 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xc86262)
    6 0x00007ff61c30637f (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xbd637f)
    7 0x00007ff61c30c83a (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xbdc83a)
    8 0x00007ff61c1455af (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xa155af)
    9 0x00007ff61c144d65 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xa14d65)
    10 0x00007ff61c144594 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xa14594)
    11 0x00007ff61c1401ec (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xa101ec)
    12 0x00007ff61bfd6990 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x8a6990)
    13 0x00007ff61c4499be (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xd199be)
    14 0x00007ff61d5a2029 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x1e72029)
    15 0x00007ff61d5a8b13 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x1e78b13)
    16 0x00007ff61d5a276d (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x1e7276d)
    17 0x00007ff61c422353 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xcf2353)
    18 0x00007ff61c41f2be (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xcef2be)
    19 0x00007ff61c416512 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xce6512)
    20 0x00007ff61c415a09 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0xce5a09)
    21 0x00007ff61ba7b95c (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x34b95c)
    22 0x00007ff61ba53439 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x323439)
    23 0x00007ff61ba5009a (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x32009a)
    24 0x00007ff61ba4b3f3 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x31b3f3)
    25 0x00007ff61b731746 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x1746)
    26 0x00007ff61d76c828 (E:\GitHub\emsdk\upstream\bin\wasm-ld.exe+0x203c828)
    27 0x00007ff998636fd4 (C:\WINDOWS\System32\KERNEL32.DLL+0x16fd4)
    28 0x00007ff999c3cec1 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x4cec1)

    emcc: error: 'E:/GitHub/emsdk/upstream/bin\wasm-ld.exe -o C:\Users\SCOTTW~1\AppData\Local\Temp\emscripten_temp_2lw0yg80\HelloWasm.wasm E:\GitHub\corert\tests\src\Simple\HelloWasm\obj\Debug\wasm\native\HelloWasm.o E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libPortableRuntime.a E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libbootstrappercpp.a -LE:\GitHub\emsdk\upstream\emscripten\system\local\lib E:\GitHub\corert\tests\..\bin\WebAssembly.wasm.Debug/sdk/libSystem.Private.CoreLib.Native.a -LE:\GitHub\emsdk\upstream\emscripten\system\lib -LE:\GitHub\emsdk\upstream\emscripten\cache\wasm E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libcompiler_rt.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc++.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc++abi.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libdlmalloc.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libpthread_stub.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libc_rt_wasm.a E:\GitHub\emsdk\upstream\emscripten\cache\wasm\libsockets.a -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-cxx-exceptions -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --allow-undefined --import-memory --import-table --export main --export malloc --export free --export stackSave --export stackRestore --export stackAlloc --export data_end --export wasm_call_ctors --export fflush --export errno_location --export _ZSt18uncaught_exceptionv --export cxa_find_matching_catch --export __cxa_is_pointer_type --export __cxa_can_catch --export setThrew --export memalign --export memset --export emscripten_main_thread_process_queued_calls -z stack-size=5242880 --initial-memory=167772160 --no-entry --max-memory=167772160 --global-base=1024' failed (3221225501)

The function in question is below. It is part of a large bitcode file. I can try to remove all the dependencies and shrink the method if that helps. It's a bit odd: this LLVM IR text comes from saving the text alongside the bitcode, and the bitcode itself links fine. Both are produced using libLLVM. So it seems that either saving the text, or going from the text back to bitcode, is not consistent with the original bitcode created using libLLVM. Maybe that's not a surprise; I don't know.

define i32 @"S_P_CoreLib_System_HashCode__Combine_3"(i8, i32, i32, i32, i32, i32) !dbg !​588632 { Prolog: %value1arg0 = alloca i32 store i32 %1, i32 %value1arg0 %value2arg1 = alloca i32 store i32 %2, i32 %value2arg1 %value3arg2 = alloca i32 store i32 %3, i32 %value3arg2 %value4arg3 = alloca i32 store i32 %4, i32 %value4arg3 %value5arg4 = alloca i32 store i32 %5, i32 %value5arg4 %hc1local0 = alloca i32 %hc2local1 = alloca i32 %hc3local2 = alloca i32 %hc4local3 = alloca i32 %hc5local4 = alloca i32 %v1local5 = alloca i32 %v2local6 = alloca i32 %v3local7 = alloca i32 %v4local8 = alloca i32 %hashlocal9 = alloca i32 %local10 = alloca i32 %Temp0 = getelementptr i8, i8 %0, i32 0 %Temp1_ = getelementptr i8, i8 %0, i32 4 %Temp2 = getelementptr i8, i8* %0, i32 8 %Temp3 = getelementptr i8, i8 %0, i32 12 %Temp4_ = getelementptr i8, i8 %0, i32 16 %Temp5 = getelementptr i8, i8* %0, i32 20 %Temp6 = getelementptr i8, i8 %0, i32 24 %Temp7_ = getelementptr i8, i8 %0, i32 28 %Temp8 = getelementptr i8, i8* %0, i32 32 %Temp9 = getelementptr i8, i8* %0, i32 36 br label %Block0

Block0: ; preds = %Prolog call void @​llvm.donothing(), !dbg !​588633 %6 = getelementptr i8, i8 %0, i32 0, !dbg !​588634 %LoadeeType = load %"[S.P.CoreLib]Internal.Runtime.EEType", %"[S.P.CoreLib]Internal.Runtime.EEType" bitcast (i32 @​EETypeInt32SYMBOL to %"[S.P.CoreLib]Internal.Runtime.EEType"), !dbg !​588634 %CastPtraddress_of = bitcast i32 %value1arg0 to i8, !dbg !​588634 %7 = getelementptr i8, i8 %6, i32 0, !dbg !​588634 %CastPtrTypedStore = bitcast i8 %7 to i8, !dbg !​588634 store i8* %CastPtraddress_of, i8 %CastPtrTypedStore, !dbg !​588634 call void @​S_P_CoreLib_System_Runtime_RuntimeExports_RhBox(i8 %6, i8 %Temp0, %"[S.P.CoreLib]Internal.Runtime.EEType" %LoadeeType), !dbg !​588634 %CastPtrTemp0_ = bitcast i8 %Temp0_ to i8, !dbg !​588634 %LdTemp0 = load i8*, i8** %CastPtrTemp0, !dbg !​588634 %CastInt = ptrtoint i8* %LdTemp0_ to i32, !dbg !​588634 %brtrue = icmp ne i32 %CastInt, 0, !dbg !​588634 br i1 %brtrue, label %BlockC, label %Block9, !dbg !​588634

BlockC: ; preds = %Block0 %8 = getelementptr i8, i8 %0, i32 40, !dbg !​588634 %CastPtrldloca47 = bitcast i32 %value1arg0 to i8, !dbg !​588634 %9 = getelementptr i8, i8 %8, i32 0, !dbg !​588634 %CastPtrTypedStore48 = bitcast i8* %9 to i8*, !dbg !​588634 store i8 %CastPtrldloca47, i8* %CastPtrTypedStore48, !dbg !​588634 %10 = call i32 @​Int32__GetHashCode(i8 %8), !dbg !​588634 %CastPtrTemp1int3249 = bitcast i8* %Temp1 to i32 store i32 %10, i32 %CastPtrTemp1_int3249 br label %Block19

Block9: ; preds = %Block0 %CastPtrTemp1int32 = bitcast i8* %Temp1 to i32, !dbg !​588634 store i32 0, i32 %CastPtrTemp1_int32, !dbg !​588634 br label %Block19, !dbg !​588634

Block19: ; preds = %BlockC, %Block9 %CastPtrTemp1 = bitcast i8* %Temp1 to i32, !dbg !​588634 %LdTemp1_ = load i32, i32 %CastPtrTemp1, !dbg !​588634 store i32 %LdTemp1, i32 %hc1local0, !dbg !​588634 %11 = getelementptr i8, i8 %0, i32 8, !dbg !​588635 %LoadeeType1 = load %"[S.P.CoreLib]Internal.Runtime.EEType"*, %"[S.P.CoreLib]Internal.Runtime.EEType" bitcast (i32 @​EETypeInt32SYMBOL to %"[S.P.CoreLib]Internal.Runtime.EEType"), !dbg !​588635 %CastPtraddress_of2 = bitcast i32 %value2arg1 to i8, !dbg !​588635 %12 = getelementptr i8, i8 %11, i32 0, !dbg !​588635 %CastPtrTypedStore3 = bitcast i8 %12 to i8, !dbg !​588635 store i8* %CastPtraddress_of2, i8 %CastPtrTypedStore3, !dbg !​588635 call void @​S_P_CoreLib_System_Runtime_RuntimeExports_RhBox(i8 %11, i8 %Temp2, %"[S.P.CoreLib]Internal.Runtime.EEType" %LoadeeType1), !dbg !​588635 %CastPtrTemp2_ = bitcast i8 %Temp2_ to i8, !dbg !​588635 %LdTemp2 = load i8*, i8** %CastPtrTemp2, !dbg !​588635 %CastInt4 = ptrtoint i8* %LdTemp2_ to i32, !dbg !​588635 %brtrue5 = icmp ne i32 %CastInt4, 0, !dbg !​588635 br i1 %brtrue5, label %Block25, label %Block22, !dbg !​588635

Block25: ; preds = %Block19 %13 = getelementptr i8, i8 %0, i32 40, !dbg !​588635 %CastPtrldloca44 = bitcast i32 %value2arg1 to i8, !dbg !​588635 %14 = getelementptr i8, i8 %13, i32 0, !dbg !​588635 %CastPtrTypedStore45 = bitcast i8* %14 to i8*, !dbg !​588635 store i8 %CastPtrldloca44, i8* %CastPtrTypedStore45, !dbg !​588635 %15 = call i32 @​Int32__GetHashCode(i8 %13), !dbg !​588635 %CastPtrTemp3int3246 = bitcast i8* %Temp3 to i32 store i32 %15, i32 %CastPtrTemp3_int3246 br label %Block32

Block22: ; preds = %Block19 %CastPtrTemp3int32 = bitcast i8* %Temp3 to i32, !dbg !​588635 store i32 0, i32 %CastPtrTemp3_int32, !dbg !​588635 br label %Block32, !dbg !​588635

Block32: ; preds = %Block25, %Block22 %CastPtrTemp3 = bitcast i8* %Temp3 to i32, !dbg !​588635 %LdTemp3_ = load i32, i32 %CastPtrTemp3, !dbg !​588635 store i32 %LdTemp3, i32 %hc2local1, !dbg !​588635 %16 = getelementptr i8, i8 %0, i32 16, !dbg !​588636 %LoadeeType6 = load %"[S.P.CoreLib]Internal.Runtime.EEType"*, %"[S.P.CoreLib]Internal.Runtime.EEType" bitcast (i32 @​EETypeInt32SYMBOL to %"[S.P.CoreLib]Internal.Runtime.EEType"), !dbg !​588636 %CastPtraddress_of7 = bitcast i32 %value3arg2 to i8, !dbg !​588636 %17 = getelementptr i8, i8 %16, i32 0, !dbg !​588636 %CastPtrTypedStore8 = bitcast i8 %17 to i8, !dbg !​588636 store i8* %CastPtraddress_of7, i8 %CastPtrTypedStore8, !dbg !​588636 call void @​S_P_CoreLib_System_Runtime_RuntimeExports_RhBox(i8 %16, i8 %Temp4, %"[S.P.CoreLib]Internal.Runtime.EEType" %LoadeeType6), !dbg !​588636 %CastPtrTemp4_ = bitcast i8 %Temp4_ to i8, !dbg !​588636 %LdTemp4 = load i8*, i8** %CastPtrTemp4, !dbg !​588636 %CastInt9 = ptrtoint i8* %LdTemp4_ to i32, !dbg !​588636 %brtrue10 = icmp ne i32 %CastInt9, 0, !dbg !​588636 br i1 %brtrue10, label %Block3E, label %Block3B, !dbg !​588636

Block3E: ; preds = %Block32 %18 = getelementptr i8, i8 %0, i32 40, !dbg !​588636 %CastPtrldloca41 = bitcast i32 %value3arg2 to i8, !dbg !​588636 %19 = getelementptr i8, i8 %18, i32 0, !dbg !​588636 %CastPtrTypedStore42 = bitcast i8* %19 to i8*, !dbg !​588636 store i8 %CastPtrldloca41, i8* %CastPtrTypedStore42, !dbg !​588636 %20 = call i32 @​Int32__GetHashCode(i8 %18), !dbg !​588636 %CastPtrTemp5int3243 = bitcast i8* %Temp5 to i32 store i32 %20, i32 %CastPtrTemp5_int3243 br label %Block4B

Block3B: ; preds = %Block32 %CastPtrTemp5int32 = bitcast i8* %Temp5 to i32, !dbg !​588636 store i32 0, i32 %CastPtrTemp5_int32, !dbg !​588636 br label %Block4B, !dbg !​588636

Block4B: ; preds = %Block3E, %Block3B %CastPtrTemp5 = bitcast i8* %Temp5 to i32, !dbg !​588636 %LdTemp5_ = load i32, i32 %CastPtrTemp5, !dbg !​588636 store i32 %LdTemp5, i32 %hc3local2, !dbg !​588636 %21 = getelementptr i8, i8 %0, i32 24, !dbg !​588637 %LoadeeType11 = load %"[S.P.CoreLib]Internal.Runtime.EEType"*, %"[S.P.CoreLib]Internal.Runtime.EEType" bitcast (i32 @​EETypeInt32SYMBOL to %"[S.P.CoreLib]Internal.Runtime.EEType"), !dbg !​588637 %CastPtraddress_of12 = bitcast i32 %value4arg3 to i8, !dbg !​588637 %22 = getelementptr i8, i8 %21, i32 0, !dbg !​588637 %CastPtrTypedStore13 = bitcast i8 %22 to i8, !dbg !​588637 store i8* %CastPtraddress_of12, i8 %CastPtrTypedStore13, !dbg !​588637 call void @​S_P_CoreLib_System_Runtime_RuntimeExports_RhBox(i8 %21, i8 %Temp6, %"[S.P.CoreLib]Internal.Runtime.EEType" %LoadeeType11), !dbg !​588637 %CastPtrTemp6_ = bitcast i8 %Temp6_ to i8, !dbg !​588637 %LdTemp6 = load i8*, i8** %CastPtrTemp6, !dbg !​588637 %CastInt14 = ptrtoint i8* %LdTemp6_ to i32, !dbg !​588637 %brtrue15 = icmp ne i32 %CastInt14, 0, !dbg !​588637 br i1 %brtrue15, label %Block57, label %Block54, !dbg !​588637

Block57: ; preds = %Block4B %23 = getelementptr i8, i8 %0, i32 40, !dbg !​588637 %CastPtrldloca38 = bitcast i32 %value4arg3 to i8, !dbg !​588637 %24 = getelementptr i8, i8 %23, i32 0, !dbg !​588637 %CastPtrTypedStore39 = bitcast i8* %24 to i8*, !dbg !​588637 store i8 %CastPtrldloca38, i8* %CastPtrTypedStore39, !dbg !​588637 %25 = call i32 @​Int32__GetHashCode(i8 %23), !dbg !​588637 %CastPtrTemp7int3240 = bitcast i8* %Temp7 to i32 store i32 %25, i32 %CastPtrTemp7_int3240 br label %Block64

Block54: ; preds = %Block4B %CastPtrTemp7int32 = bitcast i8* %Temp7 to i32, !dbg !​588637 store i32 0, i32 %CastPtrTemp7_int32, !dbg !​588637 br label %Block64, !dbg !​588637

Block64: ; preds = %Block57, %Block54 %CastPtrTemp7 = bitcast i8* %Temp7 to i32, !dbg !​588637 %LdTemp7_ = load i32, i32 %CastPtrTemp7, !dbg !​588637 store i32 %LdTemp7, i32 %hc4local3, !dbg !​588637 %26 = getelementptr i8, i8 %0, i32 32, !dbg !​588638 %LoadeeType16 = load %"[S.P.CoreLib]Internal.Runtime.EEType"*, %"[S.P.CoreLib]Internal.Runtime.EEType" bitcast (i32 @​EETypeInt32SYMBOL to %"[S.P.CoreLib]Internal.Runtime.EEType"), !dbg !​588638 %CastPtraddress_of17 = bitcast i32 %value5arg4 to i8, !dbg !​588638 %27 = getelementptr i8, i8 %26, i32 0, !dbg !​588638 %CastPtrTypedStore18 = bitcast i8 %27 to i8, !dbg !​588638 store i8* %CastPtraddress_of17, i8 %CastPtrTypedStore18, !dbg !​588638 call void @​S_P_CoreLib_System_Runtime_RuntimeExports_RhBox(i8 %26, i8 %Temp8, %"[S.P.CoreLib]Internal.Runtime.EEType" %LoadeeType16), !dbg !​588638 %CastPtrTemp8_ = bitcast i8 %Temp8_ to i8, !dbg !​588638 %LdTemp8 = load i8*, i8** %CastPtrTemp8, !dbg !​588638 %CastInt19 = ptrtoint i8* %LdTemp8_ to i32, !dbg !​588638 %brtrue20 = icmp ne i32 %CastInt19, 0, !dbg !​588638 br i1 %brtrue20, label %Block71, label %Block6E, !dbg !​588638

Block71: ; preds = %Block64 %28 = getelementptr i8, i8 %0, i32 40, !dbg !​588638 %CastPtrldloca35 = bitcast i32 %value5arg4 to i8, !dbg !​588638 %29 = getelementptr i8, i8 %28, i32 0, !dbg !​588638 %CastPtrTypedStore36 = bitcast i8* %29 to i8*, !dbg !​588638 store i8 %CastPtrldloca35, i8* %CastPtrTypedStore36, !dbg !​588638 %30 = call i32 @​Int32__GetHashCode(i8 %28), !dbg !​588638 %CastPtrTemp9int3237 = bitcast i8* %Temp9 to i32 store i32 %30, i32 %CastPtrTemp9_int3237 br label %Block7E

Block6E: ; preds = %Block64 %CastPtrTemp9int32 = bitcast i8* %Temp9 to i32, !dbg !​588638 store i32 0, i32 %CastPtrTemp9_int32, !dbg !​588638 br label %Block7E, !dbg !​588638

Block7E: ; preds = %Block71, %Block6E %CastPtrTemp9 = bitcast i8* %Temp9 to i32, !dbg !​588638 %LdTemp9_ = load i32, i32 %CastPtrTemp9, !dbg !​588638 store i32 %LdTemp9, i32 %hc5local4, !dbg !​588638 %31 = getelementptr i8, i8 %0, i32 40, !dbg !​588639 %CastPtrldloca = bitcast i32 %v1local5 to i8, !dbg !​588639 %32 = getelementptr i8, i8 %31, i32 0, !dbg !​588639 %CastPtrTypedStore21 = bitcast i8 %32 to i8, !dbg !​588639 store i8* %CastPtrldloca, i8 %CastPtrTypedStore21, !dbg !​588639 %CastPtrldloca22 = bitcast i32 %v2local6 to i8, !dbg !​588639 %33 = getelementptr i8, i8 %31, i32 4, !dbg !​588639 %CastPtrTypedStore23 = bitcast i8 %33 to i8, !dbg !​588639 store i8* %CastPtrldloca22, i8 %CastPtrTypedStore23, !dbg !​588639 %CastPtrldloca24 = bitcast i32 %v3local7 to i8, !dbg !​588639 %34 = getelementptr i8, i8 %31, i32 8, !dbg !​588639 %CastPtrTypedStore25 = bitcast i8 %34 to i8, !dbg !​588639 store i8* %CastPtrldloca24, i8 %CastPtrTypedStore25, !dbg !​588639 %CastPtrldloca26 = bitcast i32 %v4local8 to i8, !dbg !​588639 %35 = getelementptr i8, i8 %31, i32 12, !dbg !​588639 %CastPtrTypedStore27 = bitcast i8 %35 to i8, !dbg !​588639 store i8* %CastPtrldloca26, i8 %CastPtrTypedStore27, !dbg !​588639 call void @​S_P_CoreLib_System_HashCodeInitialize(i8 %31), !dbg !​588639 call void @​llvm.donothing(), !dbg !​588639 %36 = getelementptr i8, i8 %0, i32 40, !dbg !​588640 %Loadloc5_ = load i32, i32 %v1local5, !dbg !​588640 %Loadloc0_ = load i32, i32 %hc1local0, !dbg !​588640 %37 = call i32 @​S_P_CoreLib_System_HashCodeRound(i8 %36, i32 %Loadloc5, i32 %Loadloc0), !dbg !​588640 store i32 %37, i32 %v1local5, !dbg !​588640 %38 = getelementptr i8, i8 %0, i32 40, !dbg !​588641 %Loadloc6_ = load i32, i32 %v2local6, !dbg !​588641 %Loadloc1_ = load i32, i32 %hc2local1, !dbg !​588641 %39 = call i32 @​S_P_CoreLib_System_HashCode__Round(i8 %38, i32 %Loadloc6, i32 %Loadloc1), !dbg !​588641 store i32 %39, i32 %v2local6, !dbg !​588641 %40 = getelementptr i8, i8 %0, i32 40, !dbg !​588642 
%Loadloc7_ = load i32, i32 %v3local7, !dbg !​588642 %Loadloc2_ = load i32, i32 %hc3local2, !dbg !​588642 %41 = call i32 @​S_P_CoreLib_System_HashCodeRound(i8 %40, i32 %Loadloc7, i32 %Loadloc2), !dbg !​588642 store i32 %41, i32 %v3local7, !dbg !​588642 %42 = getelementptr i8, i8 %0, i32 40, !dbg !​588643 %Loadloc8_ = load i32, i32 %v4local8, !dbg !​588643 %Loadloc3_ = load i32, i32* %hc4local3, !dbg !​588643 %43 = call i32 @​S_P_CoreLib_System_HashCodeRound(i8 %42, i32 %Loadloc8, i32 %Loadloc3), !dbg !​588643 store i32 %43, i32 %v4local8, !dbg !​588643 %44 = getelementptr i8, i8 %0, i32 40, !dbg !​588644 %Loadloc5_28 = load i32, i32 %v1local5, !dbg !​588644 %Loadloc6_29 = load i32, i32 %v2local6, !dbg !​588644 %Loadloc7_30 = load i32, i32 %v3local7, !dbg !​588644 %Loadloc8_31 = load i32, i32 %v4local8, !dbg !​588644 %45 = call i32 @​S_P_CoreLib_System_HashCode__MixState(i8 %44, i32 %Loadloc5_28, i32 %Loadloc6_29, i32 %Loadloc7_30, i32 %Loadloc8_31), !dbg !​588644 store i32 %45, i32 %hashlocal9, !dbg !​588644 %Loadloc9_ = load i32, i32 %hashlocal9, !dbg !​588645 %add = add i32 %Loadloc9_, 20, !dbg !​588645 store i32 %add, i32 %hashlocal9, !dbg !​588645 %46 = getelementptr i8, i8 %0, i32 40, !dbg !​588646 %Loadloc9_32 = load i32, i32 %hashlocal9, !dbg !​588646 %Loadloc4_ = load i32, i32 %hc5local4, !dbg !​588646 %47 = call i32 @​S_P_CoreLib_System_HashCode__QueueRound(i8 %46, i32 %Loadloc932, i32 %Loadloc4), !dbg !​588646 store i32 %47, i32 %hashlocal9, !dbg !​588646 %48 = getelementptr i8, i8 %0, i32 40, !dbg !​588647 %Loadloc9_33 = load i32, i32 %hashlocal9, !dbg !​588647 %49 = call i32 @​S_P_CoreLib_System_HashCode__MixFinal(i8 %48, i32 %Loadloc9_33), !dbg !​588647 store i32 %49, i32 %hashlocal9, !dbg !​588647 %Loadloc9_34 = load i32, i32 %hashlocal9, !dbg !​588648 store i32 %Loadloc9_34, i32 %local10_, !dbg !​588648 br label %BlockE6, !dbg !​588648

BlockE6: ; preds = %Block7E %Loadloc10 = load i32, i32* %local10, !dbg !​588648 ret i32 %Loadloc10_, !dbg !​588648 }

fhahn commented 4 years ago

After some feedback on the patch, I think it might be desirable to unconditionally run the lowering pass for functions that contain matrix intrinsics, both in the middle-end and in the backend, so that llc works without needing to run opt first.

I've put up a patch to add a new attribute that frontends can add to run the matrix lowering on a function: https://reviews.llvm.org/D76857

And one that adds the matrix lowering pass to the backend pipelines: https://reviews.llvm.org/D76858

fhahn commented 4 years ago

I think the issue should be fixed by adding LowerMatrixIntrinsics to the -O0 pipeline. I've put up a patch: https://reviews.llvm.org/D76327

fhahn commented 4 years ago

Bug llvm/llvm-bugzilla-archive#45229 has been marked as a duplicate of this bug.

fhahn commented 4 years ago

Currently the llvm.matrix intrinsics are exclusively lowered in the middle-end, by the LowerMatrixIntrinsics pass. The backends do not know how to lower them.

To build an IR file containing matrix intrinsics with clang, -mllvm -enable-matrix needs to be passed. Building the attached file using clang -O1 -mllvm -enable-matrix scalarize.ll works as expected.

clang -mllvm -enable-matrix scalarize.ll currently crashes, because LowerMatrixIntrinsics is not added to the -O0 pipeline. I'll fix that.
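As a rough illustration of what the middle-end lowering has to produce, the sketch below models the documented semantics of llvm.matrix.multiply in plain Python: both operands and the result are flattened vectors in column-major order, and the three trailing integer arguments give the outer rows, the inner dimension, and the outer columns. The function name `matrix_multiply_column_major` is made up for this example, and the code is only a reference model of the arithmetic, not how LowerMatrixIntrinsics is implemented.

```python
from typing import List, Sequence


def matrix_multiply_column_major(
    a: Sequence[float],
    b: Sequence[float],
    outer_rows: int,
    inner: int,
    outer_cols: int,
) -> List[float]:
    """Multiply an (outer_rows x inner) matrix by an (inner x outer_cols) one.

    Matrices are flattened column-major: element (r, c) of a matrix with
    R rows is stored at index c * R + r.
    """
    if len(a) != outer_rows * inner:
        raise ValueError("operand a does not match outer_rows * inner")
    if len(b) != inner * outer_cols:
        raise ValueError("operand b does not match inner * outer_cols")

    result = [0.0] * (outer_rows * outer_cols)
    for col in range(outer_cols):
        for row in range(outer_rows):
            acc = 0.0
            for k in range(inner):
                # a[row, k] * b[k, col], both read in column-major layout
                acc += a[k * outer_rows + row] * b[col * inner + k]
            result[col * outer_rows + row] = acc
    return result


# The repro passes 1, 1, 1 as dimensions, so the result is a one-element vector.
print(matrix_multiply_column_major([3.0], [4.0], 1, 1, 1))  # [12.0]
```

With the dimensions from the repro (1, 1, 1) the whole operation reduces to a single scalar multiply, which appears consistent with the v1f32 result type shown in the ScalarizeVectorResult line: the backend sees an intrinsic returning a one-element vector and has no rule for turning it into scalar code.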

nicolasvasilache commented 4 years ago

This is probably for Florian Hahn to take a look at, but I cannot find his Bugzilla handle.

nicolasvasilache commented 4 years ago

assigned to @fhahn