Closed llvmbot closed 6 years ago
Sure, closing.
Should we close this as invalid according the state of https://reviews.llvm.org/D41222 ?
The alternative being what?
We provide a relatively simple program whose command line syntax is exactly what is required to run backend jobs. If we wanted, we could expose this program via the clang driver so that users don't require multiple binaries.
This is a longer discussion we should move off this bug, but note that there are many things that fall out nicely from invoking via clang. E.g. fission options just work (clang is what invokes objcopy to create the .dwo files). Which isn't to say that shouldn't/can't happen in a new tool. But I'm afraid we would end up duplicating a bunch of logic including target option handling. It's also possible we could just simplify the clang handling for the thinlto backends further (it already goes through a short-circuited path).
In fact, the fix for this bug didn't fix my thinlto failure and I just discovered why. We don't even go through the clang handling that sets up the ASAN passes when we invoke it with the thinlto backend options (we invoke thinBackend() via runThinLTOBackend then return early from EmitBackendOutput).
So the upshot is that it isn't clear that invoking the thinlto backends via clang has anything to do with the failure I am looking at that led me here.
Thanks, Teresa. There are a number of things that I disagree with here, but I'll try to find some time to start a thread on this topic on llvm-dev (and cc you) because I think that it's important that we have the correct model for interfacing between the thin link and thin backend jobs.
The alternative being what?
We provide a relatively simple program whose command line syntax is exactly what is required to run backend jobs. If we wanted, we could expose this program via the clang driver so that users don't require multiple binaries.
This is a longer discussion we should move off this bug, but note that there are many things that fall out nicely from invoking via clang. E.g. fission options just work (clang is what invokes objcopy to create the .dwo files). Which isn't to say that shouldn't/can't happen in a new tool. But I'm afraid we would end up duplicating a bunch of logic including target option handling. It's also possible we could just simplify the clang handling for the thinlto backends further (it already goes through a short-circuited path).
In fact, the fix for this bug didn't fix my thinlto failure and I just discovered why. We don't even go through the clang handling that sets up the ASAN passes when we invoke it with the thinlto backend options (we invoke thinBackend() via runThinLTOBackend then return early from EmitBackendOutput).
So the upshot is that it isn't clear that invoking the thinlto backends via clang has anything to do with the failure I am looking at that led me here.
The alternative being what?
We provide a relatively simple program whose command line syntax is exactly what is required to run backend jobs. If we wanted, we could expose this program via the clang driver so that users don't require multiple binaries.
Yet another problem that results from using clang to run distributed backend jobs. Can we please consider moving away from that model so that we won't need hacks like this in the asan pass?
The alternative being what?
In any case, the bug was not filed for distributed thinlto backends, I just stumbled on it in that context.
Yet another problem that results from using clang to run distributed backend jobs. Can we please consider moving away from that model so that we won't need hacks like this in the asan pass?
It turns out the fix works for all the wrong reasons - it ended up suppressing ASAN. The reason was that there are 2 ASAN passes, a function pass and a module pass. The function being checked is inserted by the (earlier) function pass and we were checking in the module pass, so skipping it even in the first compile.
I found a different ASAN function to check in the module pass, but this doesn't help as we assert during the function pass in the second compile. In the end, I had to check for this ASAN function in 3 places to avoid the second round of ASAN (removing any of the checks results in similar assertions in different places):
AddressSanitizerModule::runOnModule AddressSanitizer::runOnFunction AddressSanitizer::doInitialization
In any case, I'll confirm this works with more testing before sending the patch.
lgtm
Yep, that seems like a good suggestion. I tried and the following fixes the issue (slightly different than yours: s/getOrInsertFunction/getFunction/). I will send a patch with this fix and a test for review.
diff --git a/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/lib/Transforms/Instrumentation/AddressSanitizer.cpp index c707dfc0b50..c9143105aed 100644 --- a/lib/Transforms/Instrumentation/AddressSanitizer.cpp +++ b/lib/Transforms/Instrumentation/AddressSanitizer.cpp @@ -2192,6 +2192,9 @@ int AddressSanitizerModule::GetAsanVersion(const Module &M) const { }
bool AddressSanitizerModule::runOnModule(Module &M) {
I think it makes sense for the ASan pass to be idempotent, as a way to resolve the issue that arises with ThinLTO. Maybe we could achieve that with something simple?, like:
bool AddressSanitizerModule::runOnModule(Module &M) {
I am seeing similar issues in ThinLTO distributed builds with ASAN, where we serialize out to bitcode, then later invoke clang again on the bitcode (after ThinLTO indexing).
A few observations: 1) This happens when serializing out to bitcode, not just LLVM assembly 2) As noted by Vedant in Comment #4, with a debug compiler this actually results in an assertion failure (see details below). With a release compiler you get the odr-violation failure 3) I can avoid it by not passing -fsanitize-address to the second compile (from .bc/.ll to native object), which requires splitting the second compile into separate compile and link steps so that the asan references can be satisfied by passing -fsanitize-address to the link.
Hoping that an ASAN expert can comment on what should be happening here. I suppose it is trying to do ASAN instrumentation on an already ASAN-instrumented IR, and the code there does not handle this well. Specifically, the following line in GlobalsMetadata::init is causing the assertion failure:
auto *GV = mdconst::extract_or_null<GlobalVariable>(MDN->getOperand(0));
clang-6.0: include/llvm/Support/Casting.h:255: typename cast_retty<X, Y >::ret_type llvm::cast(Y ) [X = llvm::GlobalVariable, Y = llvm::Constant]: Assertion `isa
...
Is there a way to make the ASAN code more robust so it simply ignores already-instrumented IR?
I'm not sure that this workflow is supported.
Is it possible to run ASan on an *.ll input? I get an assertion failure when I try this (cast<> failure in AddressSanitizer::doInitialization).
Note that in the current bugzilla interface, it's not possible to flag a bug as clang v5.0, only v4.0 at most.
Extended Description
Hello
Clang version: bug found when using clang 4, then confirmed with clang 5 (1:5.0~svn294894-1 from Debian)
When compiling in 2 steps (emitting IR code) and using ASAN, I get the following error :
cat > hello.c << EOF
include
int main(void) { printf("hello world\n"); return 0; } EOF clang-5.0 -fsanitize=address -S -emit-llvm hello.c -o hello.clang500.ll clang-5.0 -fsanitize=address hello.clang500.ll -o hello.clang500 ./hello.clang500
==26227==ERROR: AddressSanitizer: odr-violation (0x000000520dc0): [1] size=64 '.str' hello.clang500.ll [2] size=13 '' hello.c:5:12
These globals were registered at these points:
[1]:
0 0x432430 (/tmp/hello.clang500+0x432430)
[2]:
0 0x432430 (/tmp/hello.clang500+0x432430)
==26227==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0 SUMMARY: AddressSanitizer: odr-violation: global '.str' at hello.clang500.ll ==26227==ABORTING
I will attach the emitted LL.
Now if I choose to ignore the odr-validation, I get a SEGV:
cat > hello2.c << EOF
include
int main(int argc, char *argv[]) { printf("Hello %s\n", argv[0]); return 0; } EOF
Now ignoring previous error:
export ASAN_OPTIONS=detect_odr_violation=0 clang-5.0 -fsanitize=address -S -emit-llvm hello2.c -o hello2.clang500.ll clang-5.0 -fsanitize=address hello2.clang500.ll -o hello2.clang500 ./hello2.clang500 ASAN:DEADLYSIGNAL
==26433==ERROR: AddressSanitizer: SEGV on unknown address 0x0200865f1067 (pc 0x00000050a0ce bp 0x7ffd97e81910 sp 0x7ffd97e818e0 T0) ==26433==The signal is caused by a READ memory access.
0 0x50a0cd (/tmp/hello2.clang500+0x50a0cd)
AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (/tmp/hello2.clang500+0x50a0cd) ==26433==ABORTING
I will attach the emitted LL.