Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

verifyFunction spends too much time validating !dbg metadata in +asserts builds #46681

Open Quuxplusone opened 4 years ago

Quuxplusone commented 4 years ago
Bugzilla Link PR47712
Status NEW
Importance P enhancement
Reported by Reid Kleckner (rnk@google.com)
Reported on 2020-10-02 11:58:06 -0700
Last modified on 2021-01-13 02:36:36 -0800
Version trunk
Hardware PC Windows NT
CC aprantl@apple.com, florian_hahn@apple.com, hans@chromium.org, llvm-bugs@lists.llvm.org, nicolasweber@gmx.de, spatel+llvm@rotateright.com, vsk@apple.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also PR48689
I was profiling LLVM for other reasons and accidentally used a build with
assertions enabled. What I discovered is that, if you compile PassBuilder.cpp
with assertions, most of the time is spent in the verifier, mostly due to
asserts in SLPVectorizer and LoopVectorizer. There are 187,486 samples overall,
and 131,698 of them (70.2%) are inside calls to verifyFunction.

If you drill into what verifyFunction is doing, all the time is spent in
recursive calls to visitMDNode. These calls are rooted in visitInstruction
calls to visitMDNode:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/IR/Verifier.cpp#L4431

visitMDNode in turn validates each of its operands. This is undesirable when
validating a function: location metadata points upwards, so validating one
location will ultimately validate the entire metadata graph rooted in the
compile unit. See the tree of calls in this screenshot:
https://reviews.llvm.org/file/data/u5ms43qqwl3aahxopjql/PHID-FILE-ysoomivlg3fixz52f4ny/visitmdnode-profile.png

The code I linked was added in http://github.com/llvm/llvm-
project/commit/8dfe819bcd23, but I'm not sure when this regressed. We used to
ship clang with assertions enabled, so we saw these non-linear assert compile
time issues in the field as they arose. Now that we are no longer watching
carefully, it seems like these issues can slip in.

Should we try to establish the invariant that llvm::verifyFunction runs in
O(#instructions), or should we move all calls to verifyFunction to
EXPENSIVE_CHECKS?
Quuxplusone commented 3 years ago

(In reply to Reid Kleckner from comment #0)

Should we try to establish the invariant that llvm::verifyFunction runs in O(#instructions), or should we move all calls to verifyFunction to EXPENSIVE_CHECKS?

I think verifyFunction is too valuable for builds with assertions to put it behind EXPENSIVE_CHECKS.

Limiting assertions outside EXPENSIVE_CHECKS to be O(#instructions) seems very reasonable and a natural cut-off for non-expensive checks. (I think plenty of the current uses of EXPENSIVE_CHECKS already follow a similar reasoning). Might be good to spell it out somewhere. We probably should limit it to #instruction the pass operates on, e.g. a loop pass should not verify the whole function for each loop in a function.

As for the issue at hand in verifyFunction, I think we could limit the expensive MD verification to EXPENSIVE_CHECKS to start with.

Quuxplusone commented 3 years ago

A potentially related issue motivated another patch: https://reviews.llvm.org/D94576

Reid, do you happen to still have the file & flags to reproduce this? I tried with a recent Clang build & current main, but couldn't reproduce it by building PassBuilder.cpp

Quuxplusone commented 3 years ago

(In reply to Florian Hahn from comment #2)

Reid, do you happen to still have the file & flags to reproduce this? I tried with a recent Clang build & current main, but couldn't reproduce it by building PassBuilder.cpp

Nevermind, this can be reproduced by building PassBuilder.cpp with -O3 and -g. Time spent in verifyFunction goes from >0.5% of total compile-time to ~11% total compile-time. Here's the trace, with total time spent & percentage of total compile-time.

8.91 s 12.7% 3.00 ms (anonymous namespace)::LoopVectorize::runOnFunction(llvm::Function&) 8.90 s 12.7% 0 s llvm::LoopVectorizePass::runImpl(llvm::Function&, llvm::ScalarEvolution&, llvm::LoopInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::BlockFrequencyInfo&, llvm::TargetLibraryInfo, llvm::DemandedBits&, llvm::AAResults&, llvm::AssumptionCache&, std::__1::function<llvm::LoopAccessInfo const& (llvm::Loop&)>&, llvm::OptimizationRemarkEmitter&, llvm::ProfileSummaryInfo) 8.89 s 12.7% 2.00 ms llvm::LoopVectorizePass::processLoop(llvm::Loop) 8.61 s 12.3% 0 s llvm::verifyFunction(llvm::Function const&, llvm::raw_ostream) 8.54 s 12.2% 66.00 ms (anonymous namespace)::Verifier::verify(llvm::Function const&) 7.97 s 11.4% 0 s (anonymous namespace)::Verifier::visitMDNode(llvm::MDNode const&) 7.97 s 11.4% 12.00 ms (anonymous namespace)::Verifier::visitMDNode(llvm::MDNode const&) 7.96 s 11.4% 30.00 ms (anonymous namespace)::Verifier::visitMDNode(llvm::MDNode const&) 7.93 s 11.3% 71.00 ms (anonymous namespace)::Verifier::visitMDNode(llvm::MDNode const&) 1.00 ms 0.0% 0 s ml_set_interrupts_enabled 1.00 ms 0.0% 0 s (anonymous namespace)::Verifier::verifySourceDebugInfo(llvm::DICompileUnit const&, llvm::DIFile const&) 347.00 ms 0.4% 2.00 ms (anonymous namespace)::Verifier::visitCallInst(llvm::CallInst&)

I also get similar results with http://github.com/llvm/llvm-project/commit/8dfe819bcd23 reverted.