Open yowl opened 4 years ago
Thanks! I can confirm that with this patch the compile time is back to 3 minutes.
Probably you should be using inttoptr instead of gep on a null pointer.
I have a patch I think should fix this; will post soon.
while I've got you, this array is basically just packing numbers. If there's a more efficient way to do that, over a bunch of geps, I'm all ears.
here's the first few lines of that large array
@__embedded_metadata = global [1713228 x i32] [i32 bitcast (i8 getelementptr (i8, i8 null, i32 -559030275) to i32), i32 bitcast (i8* getelementptr (i8, i8
by "normal" I've not actually measured it, but I think its around 30 seconds.
I'm using libLLVM at the point of finishing the compilation (of c# => LLVM IR) with
Module.PrintToFile(Path.ChangeExtension(_objectFilePath, ".txt"));
The PrintToFile dumps the textual representation and it completes in a normal amount of time given that its 1.5GB.
This method goes to
[DllImport(LibraryPath, CallingConvention = CallingConvention.Cdecl, EntryPoint = "LLVMPrintModuleToFile", ExactSpelling = true)]
[return: NativeTypeName("LLVMBool")]
public static extern int PrintModuleToFile([NativeTypeName("LLVMModuleRef")] LLVMOpaqueModule* M, [NativeTypeName("const char *")] sbyte* Filename, [NativeTypeName("char **")] sbyte** ErrorMessage);
i.e. LLVMPrintModuleToFile in libLLVM
There's some even bigger ones, but vim is having a hard time with the length of the lines
The POSIX fold command should be able to handle arbitrarily long lines.
How are you dumping the IR, though? If you're using llvm-dis, how long did it take?
Yes, there are quite a few e.g.
@FrozenSegmentRegionStart = global [278553 x i32] [i32 null, i32 bitcast ([18 x i32]* @EEType_String to i32), i32 null, i32 null, i32 null
There's some even bigger ones, but vim is having a hard time with the length of the lines, but there's at least one with 1.7 million combinations of bitcast(i8* getelementptr....
If it's taking hours to read a 240MB bitcode file, there must be some non-linear algorithm involved. Looking at BitcodeReaderValueList::resolveConstantForwardRefs , I can only come up with one potential source of non-linearity: calling replaceAllUsesWith() on every element of a large array could be O(N^2). (It looks like the original author realized this was a possibility, which is why it doesn't RAUW the placeholder itself. But the reasoning was applied incompletely.)
Does the testcase have an array with a lot of constant expression operands somewhere?
It's possible we could also change the way we write bitcode files to avoid forward references more aggressively; not sure how the ordering works off the top of my head.
Thanks for the report! Yes, I was able to reproduce this problem. It would be great if someone more familiar with this part of the code could take a look, though :)
Extended Description
I create bitcode using libLLVM for the corert compiler project (https://github.com/dotnet/corert). It uses the c# bindings over libLLVM from https://github.com/Microsoft/LLVMSharp.
I have 2 bitcodes generated from mostly the same source code. They are around 240MB in size. One compiles in 3 minutes, the other in 12 hours. I suspect the 12 hour compilation is either not optimal or doing something wrong. I use emscripten to compile and this ultimately calls
E:/GitHub/llvm-project/build/release/bin/clang++.exe -target wasm32-unknown-emscripten -DEMSCRIPTEN_major=1 -DEMSCRIPTEN_minor=39 -DEMSCRIPTEN_tiny__=19 -D_LIBCPP_ABI_VERSION=2 -Dunix -Dunix -Dunix -Werror=implicit-function-declaration -Xclang -nostdsysteminc -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\libcxx -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libcxxabi\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libunwind\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\compat -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\libc -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libc\musl\arch\emscripten -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\local\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\SSE -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\cache\wasm\include -DEMSCRIPTEN -fignore-exceptions -c -g E:\GitHub\UnoCoreRt\UnoCoreRt.Wasm\bin\Debug\netstandard2.0\UnoCoreRt.Wasm.bc -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\SDL -c -o E:\GitHub\UnoCoreRt\UnoCoreRt.Wasm\bin\Debug\netstandard2.0\UnoCoreRt-release.o -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -g
What I've noticed is that compared to the "fast", 3 minute compile, the "slow" compile makes around 1 million calls to https://github.com/llvm/llvm-project/blob/a6eb70c052da767aef6b041d0db20bdf3a9e06b5/llvm/lib/Bitcode/Reader/ValueList.cpp#L89 and hence the ResolveConstants variable ends up with that many entries. Resolving these constants is then what seems to take most of the time. I think the bitcode reader is identifying 1 million forward references so possible causes of the slowness that come to mind are:
A copy of the bitcode is at http://dev.hubse.com/UnoCoreRt.Wasm.bc.msi (its not really an msi, just needed a binary extension that the web server would serve). File is actually a .7z compressed file, so needs renaming from .msi to .7z
I privately messaged @tlively in discord and I believe he has confirmed that it takes a long time for him also.
llvm/llvm-project#46095 looks to be the same area of code, but not the same problem.
I did spend a bit of time with clang++ in the debugger, but I'm not that familiar with it at all, so I couldn't make any conclusion about my 3 theories above.