llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.51k stars 11.3k forks source link

[SystemZ][NewPM] huge loop expansion with opt -O1 #51488

Open cuviper opened 2 years ago

cuviper commented 2 years ago
Bugzilla Link 52146
Version trunk
OS Linux
Attachments bitcode from rust#89609
CC @alex

Extended Description

xref: https://github.com/rust-lang/rust/issues/89609

After enabling the new pass manager in rustc, one particular crate showed a large increase in compile time, but only when targeting s390x. I attached the bitcode from rustc, and I can reproduce the problem with opt -O1 on main.

$ llvm-dis <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc | wc -lL 157863 2306

$ time opt <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc -O1 -S | wc -lL 794414 4515800

real 2m4.580s user 2m4.675s sys 0m0.221s

There are two instances of that longest 4.5M line length, and both are huge phi nodes just before invoking a function ending in 17hde8a472161ebd31bE (so you can search for that). The predecessors look like a mass of loopexit.split-lp expansions gone rogue.

The same input behaves reasonably with the old PM, both optimizing faster and resulting in a smaller bitcode.

$ time opt <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc -O1 -S --enable-new-pm=0 | wc -lL 148220 2306

real 0m2.266s user 0m2.322s sys 0m0.024s

If you edit out the "target-cpu"="z10" attributes in the IR, this reproducer can also be forced to -mtriple x86_64, and that runs fine with new PM.

cuviper commented 2 years ago

As mentioned in bug 45253 comment 11, https://reviews.llvm.org/D98481 seems to solve the problem by restricting the inliner.

$ time opt <rustc_ast_lowering-cgu.0.rcgu.thin-lto-after-patch.bc -O1 -S | wc -lL 216583 2306

real 0m4.597s user 0m4.663s sys 0m0.037s

Still slower than old-pm, but it's acceptable.

cuviper commented 2 years ago

@nikic pointed out that SystemZ has an inlining threshold multiplier of 3, which would explain why we didn't see this on other targets. https://github.com/rust-lang/rust/issues/89609#issuecomment-1007338317