Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Duplicated profile data leads to Counter Overflows and incorrect code coverage information #35778

Open Quuxplusone opened 6 years ago

Quuxplusone commented 6 years ago
Bugzilla Link PR36805
Status NEW
Importance P enhancement
Reported by Max Moroz (mmoroz@chromium.org)
Reported on 2018-03-19 12:13:29 -0700
Last modified on 2018-03-28 11:19:35 -0700
Version unspecified
Hardware PC Linux
CC davidxl@google.com, kcc@google.com, llvm-bugs@lists.llvm.org, vsk@apple.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

The issue seems to be a regression of https://bugs.llvm.org/show_bug.cgi?id=23499

TL;DR somehow we are getting duplicated profile data records pointing to the same counter, that leads to counter values being duplicated.

Quuxplusone commented 6 years ago

Vedant, David, could you please take a look?

As far as I understood after reading https://bugs.llvm.org/show_bug.cgi?id=23499, the issue was fixed almost ~3y ago in: https://reviews.llvm.org/rL238335 + https://reviews.llvm.org/rL238351

However, I'm experiencing the same issue now when trying to build unittests in Chrome using r325667.

Quuxplusone commented 6 years ago

Do you have a small reproducer for the issue?

Quuxplusone commented 6 years ago

Unfortunately no, only relatively large unittests binary. I'll see if we can create a smaller one to reproduce that, but not sure how easy that would be.

Quuxplusone commented 6 years ago
I've just compared two profile data records for that function, and one of them
has 0 value in FunctionPointer, while the other has a valid value. Can that be
related to the problem that I got two records pointing to the same counter?

0E 1B 7C 82 47 81 90 73 - NameRef
1F 06 00 00 00 00 00 00 - FuncHash
B8 4C 23 06 00 00 00 00 - CounterPtr
00 00 00 00 00 00 00 00 - FunctionPointer
00 00 00 00 00 00 00 00 - ValuesPtrExp
01 00 00 00 - NumCounters
00 00 - NumValueSites[IPVK_Last+1]
00 00 - Padding?

0E 1B 7C 82 47 81 90 73 - NameRef
1F 06 00 00 00 00 00 00 - FuncHash
B8 4C 23 06 00 00 00 00 - CounterPtr
80 03 DF 03 00 00 00 00 - FunctionPointer
00 00 00 00 00 00 00 00 - ValuesPtrExp
01 00 00 00 - NumCounters
00 00 - NumValueSites[IPVK_Last+1]
00 00 - Padding
Quuxplusone commented 6 years ago

I've tried creating a small reproducer, no luck so far. I can reproduce a situation when I have two profile data blocks with same NameRef and FuncHash values, but CounterPtr is still different. Also, I'm getting the same FunctionPointer value, even though I intentionally redefine the function and compile with -O0.

Quuxplusone commented 6 years ago

Looks like the issue is caused by weird things being done by libxml2. There was an attempt to fix that problem upstream, but the maintainer didn't accept it: http://lists.busybox.net/pipermail/buildroot/2014-September/105880.html

Either way, I'll try to get it fixed in Chrome. I guess we don't have to do anything here, though it might be cool to prevent having multiple profile data pointing to the same counter. Is that possible?

Quuxplusone commented 6 years ago

though it might be cool to prevent having multiple profile data pointing to the same counter. Is that possible?

What do you think about having a warning or even hard error for such cases when two profile data blocks point to the same counter?

Quuxplusone commented 6 years ago
Sorry for the delayed response.

> I've just compared two profile data records for that function, and one of
them has 0 value in FunctionPointer, while the other has a valid value. Can
that be related to the problem that I got two records pointing to the same
counter?

Pointers to functions are only stored in these records when it's feasible to
create an external reference to the function (the exact logic is in
shouldRecordFunctionAddr() in InstrProfiling.cpp).

The linker merged the counter arrays for this function, but not the data
records. The data records should have the same linkage as the counter arrays --
perhaps they weren't merged because their contents are different?

It would be worth digging into why shouldRecordFunctionAddr() gives different
answers for two versions of this function.

Separately, we might be able to work around this problem by never recording
function addresses when value profiling is disabled. I haven't tried this out,
but I'm imagining something like:

diff --git a/lib/Transforms/Instrumentation/InstrProfiling.cpp
b/lib/Transforms/Instrumentation/InstrProfiling.cpp
index 62d67112bb9..88d3a3cedd7 100644
--- a/lib/Transforms/Instrumentation/InstrProfiling.cpp
+++ b/lib/Transforms/Instrumentation/InstrProfiling.cpp
@@ -639,6 +639,14 @@ static std::string getVarName(InstrProfIncrementInst *Inc,
StringRef Prefix) {
 }

 static inline bool shouldRecordFunctionAddr(Function *F) {
+  // If value profiling isn't used in this module, don't bother recording
+  // function addresses.
+  auto *M = F->getParent();
+  if (auto *F = M->getFunction(
+          Intrinsic::getName(llvm::Intrinsic::instrprof_value_profile)))
+    if (F->use_empty())
+      return false;

Max, are you still set up to try that out?
Quuxplusone commented 6 years ago

Vedant, thanks for the proposed fix. I've just tried that, but still see the same problem :/