llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.8k stars 11.91k forks source link

Duplicated profile data leads to Counter Overflows and incorrect code coverage information #36153

Open Dor1s opened 6 years ago

Dor1s commented 6 years ago
Bugzilla Link 36805
Version unspecified
OS Linux
CC @vns-mn,@kcc,@vedantk

Extended Description

The issue seems to be a regression of llvm/llvm-project#23873

TL;DR somehow we are getting duplicated profile data records pointing to the same counter, that leads to counter values being duplicated.

Dor1s commented 6 years ago

Vedant, thanks for the proposed fix. I've just tried that, but still see the same problem :/

vedantk commented 6 years ago

Sorry for the delayed response.

I've just compared two profile data records for that function, and one of them has 0 value in FunctionPointer, while the other has a valid value. Can that be related to the problem that I got two records pointing to the same counter?

Pointers to functions are only stored in these records when it's feasible to create an external reference to the function (the exact logic is in shouldRecordFunctionAddr() in InstrProfiling.cpp).

The linker merged the counter arrays for this function, but not the data records. The data records should have the same linkage as the counter arrays -- perhaps they weren't merged because their contents are different?

It would be worth digging into why shouldRecordFunctionAddr() gives different answers for two versions of this function.

Separately, we might be able to work around this problem by never recording function addresses when value profiling is disabled. I haven't tried this out, but I'm imagining something like:

diff --git a/lib/Transforms/Instrumentation/InstrProfiling.cpp b/lib/Transforms/Instrumentation/InstrProfiling.cpp index 62d67112bb9..88d3a3cedd7 100644 --- a/lib/Transforms/Instrumentation/InstrProfiling.cpp +++ b/lib/Transforms/Instrumentation/InstrProfiling.cpp @@ -639,6 +639,14 @@ static std::string getVarName(InstrProfIncrementInst *Inc, StringRef Prefix) { }

static inline bool shouldRecordFunctionAddr(Function *F) {

Max, are you still set up to try that out?

Dor1s commented 6 years ago

though it might be cool to prevent having multiple profile data pointing to the same counter. Is that possible?

What do you think about having a warning or even hard error for such cases when two profile data blocks point to the same counter?

Dor1s commented 6 years ago

Looks like the issue is caused by weird things being done by libxml2. There was an attempt to fix that problem upstream, but the maintainer didn't accept it: http://lists.busybox.net/pipermail/buildroot/2014-September/105880.html

Either way, I'll try to get it fixed in Chrome. I guess we don't have to do anything here, though it might be cool to prevent having multiple profile data pointing to the same counter. Is that possible?

Dor1s commented 6 years ago

I've tried creating a small reproducer, no luck so far. I can reproduce a situation when I have two profile data blocks with same NameRef and FuncHash values, but CounterPtr is still different. Also, I'm getting the same FunctionPointer value, even though I intentionally redefine the function and compile with -O0.

Dor1s commented 6 years ago

I've just compared two profile data records for that function, and one of them has 0 value in FunctionPointer, while the other has a valid value. Can that be related to the problem that I got two records pointing to the same counter?

0E 1B 7C 82 47 81 90 73 - NameRef 1F 06 00 00 00 00 00 00 - FuncHash B8 4C 23 06 00 00 00 00 - CounterPtr 00 00 00 00 00 00 00 00 - FunctionPointer 00 00 00 00 00 00 00 00 - ValuesPtrExp 01 00 00 00 - NumCounters 00 00 - NumValueSites[IPVK_Last+1] 00 00 - Padding?

0E 1B 7C 82 47 81 90 73 - NameRef 1F 06 00 00 00 00 00 00 - FuncHash B8 4C 23 06 00 00 00 00 - CounterPtr 80 03 DF 03 00 00 00 00 - FunctionPointer 00 00 00 00 00 00 00 00 - ValuesPtrExp 01 00 00 00 - NumCounters 00 00 - NumValueSites[IPVK_Last+1] 00 00 - Padding

Dor1s commented 6 years ago

Unfortunately no, only relatively large unittests binary. I'll see if we can create a smaller one to reproduce that, but not sure how easy that would be.

vedantk commented 6 years ago

Do you have a small reproducer for the issue?

Dor1s commented 6 years ago

Vedant, David, could you please take a look?

As far as I understood after reading llvm/llvm-project#23873 , the issue was fixed almost ~3y ago in: https://reviews.llvm.org/rL238335 + https://reviews.llvm.org/rL238351

However, I'm experiencing the same issue now when trying to build unittests in Chrome using r325667.