Open llvmbot opened 8 years ago
What is the downside of making -mllvm -use-unknown-locations=true default?
The debug info size would become larger, and gdb may have jumpy behavior (depending on how gdb handles unknown location).
Fred Riss's patch (D16569) would have a similar effect but should not cause as much increase in debug-info size. I am experimenting with it locally as a way to improve the debugging experience, because the incorrect source locations are a problem for debugging as well as PGO.
If Fred is willing, I'd be happy to commandeer the patch and we would all benefit. (The review has been idle for months so I expect Fred does not have time to carry through with it himself.)
Does Google GCC get this right?
In GCC generated code (-O0), "movsd .LC1(%rip), %xmm1" is not at the beginning of the basic block, thus it inherits debug info from the correct instruction.
Is it possible to create a regression test out of this?
Do you mean checking if branch instruction is not used during annotation? Yes I could add a unittest for that.
What is the downside of making -mllvm -use-unknown-locations=true default?
The debug info size would become larger, and gdb may have jumpy behavior (depending on how gdb handles unknown location).
Does Google GCC get this right?
What is the downside of making -mllvm -use-unknown-locations=true default?
My previous diagnostic is irrelevant. Sorry about that.
The true problem is that the first instruction in the basic block of the else branch (which is mostly taken) has no debug info, thus it inherits debug info from the previous BB (i.e. line #10 in bug1.cc and line #11 in bug2.cc). As the frequency for this instruction is the same as other instructions in else branch, and we use max_of_inst_count to assign source count, #10/#11 get assigned with incorrect sample count.
The solution is to compile with "-mllvm -use-unknown-locations=true" for profile collection binary.
I also experimented enhancing create_llvm_prof to be smarter when assigning source count. But looks like the benefits does not justify the increased complexity.
Is it possible to create a regression test out of this?
This problem will be fixed if you sync after r278522:
r278522 | dehao | 2016-08-12 09:22:12 -0700 (Fri, 12 Aug 2016) | 9 lines
Fine tuning of sample profile propagation algorithm.
Summary: The refined propagation algorithm is more accurate and robust.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23224
Basically, we will not use samples from branch instruction during annotation. So the incorrect profile at offset#7 will not matter.
Extended Description
Attached are two source variables, bug1.cc and bug2.cc which differ only in the placement of braces around a then-statement. When these braces are omitted, a profile-based sample is incorrect while it is correct when they are present.
Reproduction steps are in bug1.sh create_llvm_prof is available at https://github.com/google/autofdo
The incorrect sample derived from bug1.cc (from llvm-profdata show) is:
Function: _Z8clampSumPdidd: 100803968, 0, 13 sampled lines Samples collected in the function's body { 0: 0 1: 0 2.1: 908152 2.2: 908138 3: 908139 4: 0 5: 908146 6: 908142 7: 908150 9: 908149 10: 908146 11: 908138 13: 4 }
while the correct sample derived from bug2.cc is
Function: _Z8clampSumPdidd: 101434538, 1, 14 sampled lines Samples collected in the function's body { 0: 1 1: 1 2.1: 913831 2.2: 913817 3: 913818 4: 0 5: 913830 6: 913826 7: 0 8: 913834 9: 913831 10: 913827 11: 913821 13: 1 }
where the key differences is the value for relative line 7, it should be 0.