llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.14k stars 12.02k forks source link

Incorrect sample profile when braces omitted #29360

Open llvmbot opened 8 years ago

llvmbot commented 8 years ago
Bugzilla Link 28990
Version trunk
OS Linux
Attachments script to demonstrate, incorrect input, correct input, common main function
Reporter LLVM Bugzilla Contributor
CC @vns-mn,@pogo59,@rotateright

Extended Description

Attached are two source variables, bug1.cc and bug2.cc which differ only in the placement of braces around a then-statement. When these braces are omitted, a profile-based sample is incorrect while it is correct when they are present.

Reproduction steps are in bug1.sh create_llvm_prof is available at https://github.com/google/autofdo

The incorrect sample derived from bug1.cc (from llvm-profdata show) is:

Function: _Z8clampSumPdidd: 100803968, 0, 13 sampled lines Samples collected in the function's body { 0: 0 1: 0 2.1: 908152 2.2: 908138 3: 908139 4: 0 5: 908146 6: 908142 7: 908150 9: 908149 10: 908146 11: 908138 13: 4 }

while the correct sample derived from bug2.cc is

Function: _Z8clampSumPdidd: 101434538, 1, 14 sampled lines Samples collected in the function's body { 0: 1 1: 1 2.1: 913831 2.2: 913817 3: 913818 4: 0 5: 913830 6: 913826 7: 0 8: 913834 9: 913831 10: 913827 11: 913821 13: 1 }

where the key differences is the value for relative line 7, it should be 0.

pogo59 commented 8 years ago

What is the downside of making -mllvm -use-unknown-locations=true default?

The debug info size would become larger, and gdb may have jumpy behavior (depending on how gdb handles unknown location).

Fred Riss's patch (D16569) would have a similar effect but should not cause as much increase in debug-info size. I am experimenting with it locally as a way to improve the debugging experience, because the incorrect source locations are a problem for debugging as well as PGO.

If Fred is willing, I'd be happy to commandeer the patch and we would all benefit. (The review has been idle for months so I expect Fred does not have time to carry through with it himself.)

llvmbot commented 8 years ago

Does Google GCC get this right?

In GCC generated code (-O0), "movsd .LC1(%rip), %xmm1" is not at the beginning of the basic block, thus it inherits debug info from the correct instruction.

llvmbot commented 8 years ago

Is it possible to create a regression test out of this?

Do you mean checking if branch instruction is not used during annotation? Yes I could add a unittest for that.

llvmbot commented 8 years ago

What is the downside of making -mllvm -use-unknown-locations=true default?

The debug info size would become larger, and gdb may have jumpy behavior (depending on how gdb handles unknown location).

david-xl commented 8 years ago

Does Google GCC get this right?

david-xl commented 8 years ago

What is the downside of making -mllvm -use-unknown-locations=true default?

llvmbot commented 8 years ago

My previous diagnostic is irrelevant. Sorry about that.

The true problem is that the first instruction in the basic block of the else branch (which is mostly taken) has no debug info, thus it inherits debug info from the previous BB (i.e. line #​10 in bug1.cc and line #​11 in bug2.cc). As the frequency for this instruction is the same as other instructions in else branch, and we use max_of_inst_count to assign source count, #​10/#11 get assigned with incorrect sample count.

The solution is to compile with "-mllvm -use-unknown-locations=true" for profile collection binary.

I also experimented enhancing create_llvm_prof to be smarter when assigning source count. But looks like the benefits does not justify the increased complexity.

david-xl commented 8 years ago

Is it possible to create a regression test out of this?

llvmbot commented 8 years ago

This problem will be fixed if you sync after r278522:


r278522 | dehao | 2016-08-12 09:22:12 -0700 (Fri, 12 Aug 2016) | 9 lines

Fine tuning of sample profile propagation algorithm.

Summary: The refined propagation algorithm is more accurate and robust.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23224

Basically, we will not use samples from branch instruction during annotation. So the incorrect profile at offset#7 will not matter.