When we migrated the LLVM backend optimization pipeline from being applied by opt to being applied by llvm-kompile-codegen, we made an error: the code generator was not applying the opt transformation pipeline correctly. As a result, kompile -O2 does not actually apply llc -O2 to the bitcode generated by the LLVM backend, which causes a significant performance regression.
This PR does not entirely fix the issue (an upstream change to the K frontend is also required), but it does fix the part that lives in this repository. We see a roughly 1.5x speedup in runtime when -O2 is passed to kompile. Compilation time increases, however. This is expected: compiling with optimizations is more expensive, and we simply weren't doing it correctly before.
The main changes in this pull request are threefold:
Convert optimizer code to new pass manager.
Make sure to run the middle-end optimizer.
Convert TailCallElimination and Mem2Reg from required to optional passes on -O0. To keep tail call optimization working without TailCallElimination marking tail calls as tail, the code generator now explicitly marks them as musttail in the IR it generates.
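As a rough source-level analogy for the musttail marking described above (this is not the code generator's actual IR), the C++ sketch below uses Clang's [[clang::musttail]] statement attribute, which asks the compiler to guarantee the tail call rather than rely on an optimization pass to discover it. The names sum_to and MUSTTAIL are hypothetical and introduced only for illustration; on compilers that do not provide the attribute, the macro expands to nothing and the call is an ordinary one.

```cpp
#include <cstdint>

// Use [[clang::musttail]] when the compiler provides it. Otherwise fall back
// to a plain return, which the optimizer may or may not turn into a tail call.
// MUSTTAIL is a hypothetical helper macro, not part of any library.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

// Sums the integers 1..n into an accumulator. The recursive call is in tail
// position and has the same signature as the caller, which the attribute
// requires.
std::uint64_t sum_to(std::uint64_t n, std::uint64_t acc) {
  if (n == 0) {
    return acc;
  }
  MUSTTAIL return sum_to(n - 1, acc + n);
}
```

When the attribute is available, the recursive call reuses the caller's stack frame even in an unoptimized build, which mirrors the reason for marking calls explicitly once TailCallElimination no longer runs on -O0. Without the attribute, deep recursion at -O0 may still grow the stack.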