llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.92k stars 11.52k forks source link

odr-violation and, if ignored, SEGV when using ASAN on code emitted from -emit-llvm #32047

Closed llvmbot closed 6 years ago

llvmbot commented 7 years ago
Bugzilla Link 32700
Resolution INVALID
Resolved on Jun 18, 2018 05:56
Version 4.0
OS Linux
Attachments Emitted LL from first testcase hello.clang500.ll, Emitted LL from second testcase hello2.clang500.ll
Reporter LLVM Bugzilla Contributor
CC @pcc,@vitalybuka,@vedantk

Extended Description

Hello

Clang version: bug found when using clang 4, then confirmed with clang 5 (1:5.0~svn294894-1 from Debian)

When compiling in 2 steps (emitting IR code) and using ASAN, I get the following error :

cat > hello.c << EOF

include

int main(void) { printf("hello world\n"); return 0; } EOF clang-5.0 -fsanitize=address -S -emit-llvm hello.c -o hello.clang500.ll clang-5.0 -fsanitize=address hello.clang500.ll -o hello.clang500 ./hello.clang500

==26227==ERROR: AddressSanitizer: odr-violation (0x000000520dc0): [1] size=64 '.str' hello.clang500.ll [2] size=13 '' hello.c:5:12 These globals were registered at these points: [1]:

​0 0x432430 (/tmp/hello.clang500+0x432430)

#&#8203;1 0x50a106  (/tmp/hello.clang500+0x50a106)

[2]:

​0 0x432430 (/tmp/hello.clang500+0x432430)

#&#8203;1 0x50a0f0  (/tmp/hello.clang500+0x50a0f0)

==26227==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0 SUMMARY: AddressSanitizer: odr-violation: global '.str' at hello.clang500.ll ==26227==ABORTING

I will attach the emitted LL.

Now if I choose to ignore the odr-validation, I get a SEGV:

cat > hello2.c << EOF

include

int main(int argc, char *argv[]) { printf("Hello %s\n", argv[0]); return 0; } EOF

Now ignoring previous error:

export ASAN_OPTIONS=detect_odr_violation=0 clang-5.0 -fsanitize=address -S -emit-llvm hello2.c -o hello2.clang500.ll clang-5.0 -fsanitize=address hello2.clang500.ll -o hello2.clang500 ./hello2.clang500 ASAN:DEADLYSIGNAL

==26433==ERROR: AddressSanitizer: SEGV on unknown address 0x0200865f1067 (pc 0x00000050a0ce bp 0x7ffd97e81910 sp 0x7ffd97e818e0 T0) ==26433==The signal is caused by a READ memory access.

​0 0x50a0cd (/tmp/hello2.clang500+0x50a0cd)

#&#8203;1 0x7ff1255cb2b0  (/lib/x86_64-linux-gnu/libc.so.6+0x202b0)
#&#8203;2 0x41bef9  (/tmp/hello2.clang500+0x41bef9)

AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (/tmp/hello2.clang500+0x50a0cd) ==26433==ABORTING

I will attach the emitted LL.

llvmbot commented 6 years ago

Sure, closing.

vitalybuka commented 6 years ago

Should we close this as invalid according the state of https://reviews.llvm.org/D41222 ?

pcc commented 6 years ago

The alternative being what?

We provide a relatively simple program whose command line syntax is exactly what is required to run backend jobs. If we wanted, we could expose this program via the clang driver so that users don't require multiple binaries.

This is a longer discussion we should move off this bug, but note that there are many things that fall out nicely from invoking via clang. E.g. fission options just work (clang is what invokes objcopy to create the .dwo files). Which isn't to say that shouldn't/can't happen in a new tool. But I'm afraid we would end up duplicating a bunch of logic including target option handling. It's also possible we could just simplify the clang handling for the thinlto backends further (it already goes through a short-circuited path).

In fact, the fix for this bug didn't fix my thinlto failure and I just discovered why. We don't even go through the clang handling that sets up the ASAN passes when we invoke it with the thinlto backend options (we invoke thinBackend() via runThinLTOBackend then return early from EmitBackendOutput).

So the upshot is that it isn't clear that invoking the thinlto backends via clang has anything to do with the failure I am looking at that led me here.

Thanks, Teresa. There are a number of things that I disagree with here, but I'll try to find some time to start a thread on this topic on llvm-dev (and cc you) because I think that it's important that we have the correct model for interfacing between the thin link and thin backend jobs.

llvmbot commented 6 years ago

The alternative being what?

We provide a relatively simple program whose command line syntax is exactly what is required to run backend jobs. If we wanted, we could expose this program via the clang driver so that users don't require multiple binaries.

This is a longer discussion we should move off this bug, but note that there are many things that fall out nicely from invoking via clang. E.g. fission options just work (clang is what invokes objcopy to create the .dwo files). Which isn't to say that shouldn't/can't happen in a new tool. But I'm afraid we would end up duplicating a bunch of logic including target option handling. It's also possible we could just simplify the clang handling for the thinlto backends further (it already goes through a short-circuited path).

In fact, the fix for this bug didn't fix my thinlto failure and I just discovered why. We don't even go through the clang handling that sets up the ASAN passes when we invoke it with the thinlto backend options (we invoke thinBackend() via runThinLTOBackend then return early from EmitBackendOutput).

So the upshot is that it isn't clear that invoking the thinlto backends via clang has anything to do with the failure I am looking at that led me here.

pcc commented 6 years ago

The alternative being what?

We provide a relatively simple program whose command line syntax is exactly what is required to run backend jobs. If we wanted, we could expose this program via the clang driver so that users don't require multiple binaries.

llvmbot commented 6 years ago

Yet another problem that results from using clang to run distributed backend jobs. Can we please consider moving away from that model so that we won't need hacks like this in the asan pass?

The alternative being what?

In any case, the bug was not filed for distributed thinlto backends, I just stumbled on it in that context.

pcc commented 6 years ago

Yet another problem that results from using clang to run distributed backend jobs. Can we please consider moving away from that model so that we won't need hacks like this in the asan pass?

llvmbot commented 6 years ago

It turns out the fix works for all the wrong reasons - it ended up suppressing ASAN. The reason was that there are 2 ASAN passes, a function pass and a module pass. The function being checked is inserted by the (earlier) function pass and we were checking in the module pass, so skipping it even in the first compile.

I found a different ASAN function to check in the module pass, but this doesn't help as we assert during the function pass in the second compile. In the end, I had to check for this ASAN function in 3 places to avoid the second round of ASAN (removing any of the checks results in similar assertions in different places):

AddressSanitizerModule::runOnModule AddressSanitizer::runOnFunction AddressSanitizer::doInitialization

In any case, I'll confirm this works with more testing before sending the patch.

vitalybuka commented 6 years ago

lgtm

llvmbot commented 6 years ago

Yep, that seems like a good suggestion. I tried and the following fixes the issue (slightly different than yours: s/getOrInsertFunction/getFunction/). I will send a patch with this fix and a test for review.

diff --git a/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/lib/Transforms/Instrumentation/AddressSanitizer.cpp index c707dfc0b50..c9143105aed 100644 --- a/lib/Transforms/Instrumentation/AddressSanitizer.cpp +++ b/lib/Transforms/Instrumentation/AddressSanitizer.cpp @@ -2192,6 +2192,9 @@ int AddressSanitizerModule::GetAsanVersion(const Module &M) const { }

bool AddressSanitizerModule::runOnModule(Module &M) {

vedantk commented 6 years ago

I think it makes sense for the ASan pass to be idempotent, as a way to resolve the issue that arises with ThinLTO. Maybe we could achieve that with something simple?, like:

bool AddressSanitizerModule::runOnModule(Module &M) {

llvmbot commented 6 years ago

I am seeing similar issues in ThinLTO distributed builds with ASAN, where we serialize out to bitcode, then later invoke clang again on the bitcode (after ThinLTO indexing).

A few observations: 1) This happens when serializing out to bitcode, not just LLVM assembly 2) As noted by Vedant in Comment #​4, with a debug compiler this actually results in an assertion failure (see details below). With a release compiler you get the odr-violation failure 3) I can avoid it by not passing -fsanitize-address to the second compile (from .bc/.ll to native object), which requires splitting the second compile into separate compile and link steps so that the asan references can be satisfied by passing -fsanitize-address to the link.

Hoping that an ASAN expert can comment on what should be happening here. I suppose it is trying to do ASAN instrumentation on an already ASAN-instrumented IR, and the code there does not handle this well. Specifically, the following line in GlobalsMetadata::init is causing the assertion failure:

 auto *GV = mdconst::extract_or_null<GlobalVariable>(MDN->getOperand(0));

clang-6.0: include/llvm/Support/Casting.h:255: typename cast_retty<X, Y >::ret_type llvm::cast(Y ) [X = llvm::GlobalVariable, Y = llvm::Constant]: Assertion `isa(Val) && "cast() argument of incompatible type!"' failed. ...

​9 0x0000000001ce3319 llvm::cast_retty<llvm::GlobalVariable, llvm::Constant>::ret_type llvm::cast<llvm::GlobalVariable, llvm::Constant>(llvm::Constant) include/llvm/Support/Casting.h:255:3

​10 0x0000000004835725 _ZN4llvm7mdconst15extract_or_nullINS_14GlobalVariableERKNS_9MDOperandEEENSt9enable_ifIXsr6detail14IsValidPointerIT_T0_EE5valueEPS7E4typeEOS8 include/llvm/IR/Metadata.h:565:5

​11 0x000000000481e2f9 (anonymous namespace)::GlobalsMetadata::init(llvm::Module&) lib/Transforms/Instrumentation/AddressSanitizer.cpp:431:13

​12 0x000000000481ca71 (anonymous namespace)::AddressSanitizer::doInitialization(llvm::Module&) lib/Transforms/Instrumentation/AddressSanitizer.cpp:2320:9

...

Is there a way to make the ASAN code more robust so it simply ignores already-instrumented IR?

vedantk commented 7 years ago

I'm not sure that this workflow is supported.

Is it possible to run ASan on an *.ll input? I get an assertion failure when I try this (cast<> failure in AddressSanitizer::doInitialization).

llvmbot commented 7 years ago

Note that in the current bugzilla interface, it's not possible to flag a bug as clang v5.0, only v4.0 at most.