Open zcfh opened 9 months ago
Hi @zcfh, sorry for the slow reply.
Your steps look correct. Are you comparing the version that is built with a single end-to-end clang invocation against the version in which you first lower to IR and then lower from IR to an executable? Note that the exact configuration of optimizations and options of `clang -O3 foo.c -o a.out` cannot be replicated by a standalone `opt` invocation. That is a known limitation of LLVM, and you can find some discussion of it on the LLVM Discourse.
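To make the two builds being compared concrete, here is a minimal sketch, not a definitive recipe. The file names (`foo.c`, `origin.bc`, `optimized.bc`, and the output binaries) and the helper functions `run`, `build_end_to_end`, and `build_via_ir` are placeholders made up for illustration. The `opt` pipeline syntax is an assumption about your LLVM version: older releases may expect `opt -O3` instead of `-passes='default<O3>'`. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: two ways of building foo.c. All file names are placeholders.
# With DRY_RUN=1 (the default) the commands are printed but not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print a command, and run it unless in dry-run mode.
run() {
    echo "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Build A: a single end-to-end clang invocation.
build_end_to_end() {
    run clang -O3 foo.c -o a_direct.out
}

# Build B: emit unoptimized bitcode, optimize it with opt, then lower it.
build_via_ir() {
    # Unoptimized bitcode without the optnone attribute, as in the question.
    run clang -O0 -Xclang -disable-O0-optnone -Xclang -disable-llvm-passes \
        -emit-llvm -c foo.c -o origin.bc
    # Assumption: new pass manager syntax; older LLVM may need "opt -O3".
    run opt '-passes=default<O3>' origin.bc -o optimized.bc
    # Code generation at -O3, then link with clang.
    run llc -O3 -filetype=obj optimized.bc -o optimized.o
    run clang optimized.o -o a_split.out
}

build_end_to_end
build_via_ir
```

Even with `-O3` passed to both `opt` and `llc`, build B is not guaranteed to match build A, for the reason given above, so a small gap between the two binaries is expected.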
Cheers, Chris
❓ Questions and Help
How to optimize a real project? I tested it on bzip2. Steps I performed:
Build with `-Xclang -disable-O0-optnone -Xclang -disable-llvm-passes -O0 -emit-llvm` to generate an origin.bc.
Compress a small file: the binary built with clang -O3 took 3.70 s, while ppo_bzip2 took 8.9 s.
Here I tried `llc -O3` again, and the performance was still slightly worse than the binary compiled with -O3: it still took 3.75 s.