Closed ehsan closed 2 years ago
mentioned in issue llvm/llvm-bugzilla-archive#21071
MT and MD in the same process, yikes! :) Great to hear you've solved this mystery. The driver patch is indeed welcome.
I totally agree the behavior on assertions should be better, that's llvm/llvm-bugzilla-archive#21071.
That being said, I think all the remaining action items are tracked elsewhere (#20931, llvm/llvm-bugzilla-archive#21071, D5764) so closing.
This bug has been marked as a duplicate of bug llvm/llvm-bugzilla-archive#21071
Here's what was going wrong. firefox.exe is a small executable that is built with -MT by default and loads a number of other DLLs that are all built with -MD. When it was starting up, the ASAN runtime tried to intercept memset, but neither msvcr110.dll nor msvcr120.dll was loaded yet, so the interception failed, and we'd get into this infinite loop.
For Firefox, switching firefox.exe to be built with -MD fixed the issue. But I think we should try to come up with a better failure path than a startup hang. I observed the exact same problem when building with -MTd/-MDd because we'd again fail to find the correct runtime library at startup, and http://reviews.llvm.org/D5764 is an attempt to address that by rejecting those compilations in the first place. But setups such as Firefox's may be harder to detect at compile time.
Can we make OverrideFunction fail harder? If it's indeed expected to fail sometimes in supported configurations (which I believe is the case) can we make intercepting functions in InitializeAsanInterceptors fail hard by calling abort()?
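To make the "fail hard" suggestion concrete, here is a minimal sketch. It is not the compiler-rt code: `ModuleTable`, `TryResolve`, `ResolveOrDie`, and `FakeMemset` are hypothetical names, and the in-memory table only stands in for "the DLLs currently loaded in the process". The point is just that a failed lookup gets reported and aborts immediately instead of leaving a null pointer behind for a later CHECK to trip over.

```cpp
#include <cstdio>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the loaded modules: module name -> exported functions.
using FakeFn = void (*)();
using Exports = std::map<std::string, FakeFn>;
using ModuleTable = std::map<std::string, Exports>;

// Placeholder for the function we would like to intercept.
void FakeMemset() {}

// Looks for `fn` in each candidate module that is currently loaded.
// Returns true and stores the address in *out on success; leaves *out
// untouched and returns false if no candidate module provides `fn`.
bool TryResolve(const ModuleTable &loaded,
                const std::vector<std::string> &candidates,
                const std::string &fn, FakeFn *out) {
  for (const std::string &module : candidates) {
    auto m = loaded.find(module);
    if (m == loaded.end()) continue;  // this CRT is not loaded (yet)
    auto e = m->second.find(fn);
    if (e == m->second.end()) continue;
    *out = e->second;
    return true;
  }
  return false;
}

// Same lookup, but a failure is reported on stderr and the process aborts
// right away rather than continuing with an uninitialized pointer.
FakeFn ResolveOrDie(const ModuleTable &loaded,
                    const std::vector<std::string> &candidates,
                    const std::string &fn) {
  FakeFn real = nullptr;
  if (!TryResolve(loaded, candidates, fn, &real)) {
    std::fprintf(stderr, "failed to intercept %s: no candidate module loaded\n",
                 fn.c_str());
    std::abort();
  }
  return real;
}
```

With an empty table (the -MT firefox.exe case, where neither CRT DLL is loaded at startup) `ResolveOrDie` would abort with a message; once a candidate module exporting the function is present it returns the address.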
Is this an /MD runtime? If so, can you add a __debugbreak() on both possible failures in OverrideFunction (lib/interception/interception_win.cc:207) ?
Examining things a bit further, it seems that none of the __interception::real_foo pointers have been initialized correctly. Could it be that we have somehow bailed out early from InitializeAsanInterceptors because was_called_once was true?
I couldn't grab a useful stack frame because of bug 21241
You don't need RelWithDebInfo to grab a stack trace. Naturally, I don't use that configuration yet I'm able to get stack traces :)
Why do you think you need it? I think RTL is always built with /Zi, so "Release" should be just enough.
Ah, sorry, I was using a version of the RTL which didn't have debug info, so I wasn't getting a symbolicated stack (no idea where I got that RTL from! :) But the one in my latest LLVM objdir seems to have the symbols, so here's a stack trace with that:
clang_rt.asan_dynamic-i386.dll!__sanitizer::internal_sched_yield() Line 363 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::StaticSpinMutex::LockSlow() Line 57 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::Symbolizer::GetOrInit() Line 21 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::StackTrace::PrintStack(const unsigned long * addr, unsigned long size) Line 39 C++
clang_rt.asan_dynamic-i386.dll!__asan::AsanCheckFailed(const char * file, int line, const char * cond, unsigned __int64 v1, unsigned __int64 v2) Line 70 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::CheckFailed(const char * file, int line, const char * cond, unsigned __int64 v1, unsigned __int64 v2) Line 74 C++
clang_rt.asan_dynamic-i386.dll!__asan::PoisonShadow(unsigned long addr, unsigned long size, unsigned char value) Line 29 C++
clang_rt.asan_dynamic-i386.dll!__asan::OnLowLevelAllocate(unsigned long ptr, unsigned long size) Line 361 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::LowLevelAllocator::Allocate(unsigned long size) Line 124 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::Symbolizer::PlatformInit() Line 122 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::Symbolizer::GetOrInit() Line 23 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::StackTrace::PrintStack(const unsigned long * addr, unsigned long size) Line 39 C++
clang_rt.asan_dynamic-i386.dll!__asan::AsanCheckFailed(const char * file, int line, const char * cond, unsigned __int64 v1, unsigned __int64 v2) Line 70 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::CheckFailed(const char * file, int line, const char * cond, unsigned __int64 v1, unsigned __int64 v2) Line 74 C++
clang_rt.asan_dynamic-i386.dll!__asan::PoisonShadow(unsigned long addr, unsigned long size, unsigned char value) Line 29 C++
clang_rt.asan_dynamic-i386.dll!__asan::OnLowLevelAllocate(unsigned long ptr, unsigned long size) Line 361 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::LowLevelAllocator::Allocate(unsigned long size) Line 124 C++
clang_rt.asan_dynamic-i386.dll!__asan::GetAsanThreadContext(unsigned int tid) Line 52 C++
clang_rt.asan_dynamic-i386.dll!__sanitizer::ThreadRegistry::CreateThread(unsigned long user_id, bool detached, unsigned int parent_tid, void * arg) Line 132 C++
clang_rt.asan_dynamic-i386.dll!__asan::AsanInitInternal() Line 681 C++
clang_rt.asan_dynamic-i386.dll!__asan::Allocate(unsigned long size, unsigned long alignment, __sanitizer::StackTrace * stack, __asan::AllocType alloc_type, bool can_fill) Line 271 C++
clang_rt.asan_dynamic-i386.dll!__asan::asan_calloc(unsigned long nmemb, unsigned long size, __sanitizer::StackTrace * stack) Line 601 C++
clang_rt.asan_dynamic-i386.dll!calloc(unsigned int nmemb, unsigned int size) Line 70 C++
clang_rt.asan_dynamic-i386.dll!_calloc_impl(unsigned int nmemb, unsigned int size, int * errno_tmp) Line 80 C++
clang_rt.asan_dynamic-i386.dll!_calloc_crt(unsigned int count, unsigned int size) Line 62 C
clang_rt.asan_dynamic-i386.dll!_mtinit() Line 115 C
clang_rt.asan_dynamic-i386.dll!_CRT_INIT(void * hDllHandle, unsigned long dwReason, void * lpreserved) Line 102 C
clang_rt.asan_dynamic-i386.dll!__DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved) Line 371 C
clang_rt.asan_dynamic-i386.dll!_DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved) Line 340 C
ntdll.dll!_LdrpCallInitRoutine@16() Unknown
ntdll.dll!_LdrpRunInitializeRoutines@4() Unknown
ntdll.dll!_LdrpInitializeProcess@8() Unknown
ntdll.dll!__LdrpInitialize@8() Unknown
ntdll.dll!_LdrInitializeThunk@8() Unknown
It seems to be identical with the one in comment 0.
==4716==AddressSanitizer CHECK failed: C:\moz\llvm\projects\compiler-rt\lib\asan\asan_poisoning.cc:29 "((__interception::real_memset)) != (0)" (0x0, 0x0)
I can build small test programs fine, but this happens on Firefox.
Are you sure it's a memset problem? The reason I've asked for a log is that on large programs (like Firefox), the RTL might not be able to allocate 1/8 of the address space for the shadow memory region. In that case, however, it should print out a different error message.
Well, I don't understand the ASAN initialization sequence very well, but it seems we kick off initialization of the ASAN runtime from an attempt to allocate memory. During that initialization we call operator new in GetAsanThreadContext, which also needs to allocate memory. The second time around we don't re-enter the initialization sequence because asan_inited is already true (set at asan_rtl.cc:662), yet part of the initialization work has not completed (for example, real_memset is still null), so we die.
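A minimal sketch of the hypothesis in the comment above, for illustration only. `FakeRuntime` and its members are hypothetical and do not mirror the real runtime; the model only shows how setting an "initialized" flag before the work is finished lets a nested allocation run against half-initialized state.

```cpp
// Hypothetical model of a lazily initialized runtime whose init path allocates.
struct FakeRuntime {
  bool inited = false;             // stands in for the asan_inited flag
  bool real_memset_ready = false;  // stands in for real_memset != nullptr
  int null_memset_uses = 0;        // how often an allocation saw a null real_memset

  // Every allocation lazily triggers initialization, then needs real_memset.
  void Allocate() {
    if (!inited) Init();
    if (!real_memset_ready) ++null_memset_uses;  // the CHECK would fire here
  }

  // Initialization sets the flag first and allocates before it is done.
  void Init() {
    inited = true;             // flag flipped before initialization completes
    Allocate();                // nested allocation, e.g. for a thread context
    real_memset_ready = true;  // too late for the nested call above
  }
};
```

Calling `Allocate()` once on a fresh `FakeRuntime` records exactly one use of the not-yet-ready pointer, coming from the nested call; later allocations are fine. Whether this is what actually happens in the runtime is the open question in this comment.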
Also, can you please point me to the code which assigns a non-null value to real_memset?
lib/asan/asan_interceptors.cc:741 Beware the macro dragons!
Thanks! :)
Hmm, so InitializeAsanInterceptors is definitely executed before we set asan_inited to true. And if intercepting memset failed, we should have a "AddressSanitizer: failed to intercept memset" error message as far as I can tell, so I don't know what's going on exactly...
Also, can you please point me to the code which assigns a non-null value to real_memset? So far I've been unable to find it... That might give me something to make progress from. Thanks!
Didn't your VS2012 fix resolve this?
It did fix the error on a small test program, but not on Firefox. Also note that now I'm building with MSVC2013.
Or there's some other issue with similar symptoms? Please grab a stack trace and copy the /SUBSYSTEM:CONSOLE log to make sure.
I couldn't grab a useful stack frame because of bug 21241, but here is the console output again:
==4716==AddressSanitizer CHECK failed: C:\moz\llvm\projects\compiler-rt\lib\asan\asan_poisoning.cc:29 "((__interception::real_memset)) != (0)" (0x0, 0x0)
==4716==AddressSanitizer CHECK failed: C:\moz\llvm\projects\compiler-rt\lib\asan\asan_poisoning.cc:29 "((__interception::real_memset)) != (0)" (0x0, 0x0)
Is it firefox specific? Did you succeed in running some other app?
I can build small test programs fine, but this happens on Firefox. Haven't tried anything else. Should I, for example, try building LLVM itself with ASAN?
I'm a bit puzzled as to which bug is tracking this, but I am still getting this bug on the LLVM and Mozilla trunk...
Marking this as dup of bug 20931.
I've filed bug 20959 to track "Looks like we should revisit how the RTL behaves on early failures :(" separately.
This bug has been marked as a duplicate of bug llvm/llvm-project#21305
Hello. I wanted to find out if there was any update on this.
Thanks!
Looks like we should revisit how the RTL behaves on early failures :(
On a related note: what /SUBSYSTEM: does firefox.exe use? I assume it uses :WINDOWS.
Yep.
Can you try to replace it with /SUBSYSTEM:CONSOLE to get the ASan output on stderr? This will help you diagnose this problem too.
==3252==AddressSanitizer CHECK failed: C:\moz\llvm\projects\compiler-rt\lib\asan\asan_poisoning.cc:29 "((__interception::real_memset)) != (0)" (0x0, 0x0)
==3252==AddressSanitizer CHECK failed: C:\moz\llvm\projects\compiler-rt\lib\asan\asan_poisoning.cc:29 "((__interception::real_memset)) != (0)" (0x0, 0x0)
This looks exactly the same as bug 20931.
assigned to @timurrrr
Extended Description
We try to grab the same lock recursively, so the second time we hang with a stack like this:
This is very similar to bug 20931. This is with MSVC2013, and ninja check-asan passes on the clang-cl build from this revision.
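For readers unfamiliar with why taking the same lock twice hangs, here is a minimal sketch. It assumes a non-recursive spin lock; `SpinMutex` and `NestedLockWouldSpin` are hypothetical illustrations, not the runtime's StaticSpinMutex. A try-lock is used so the example reports the problem instead of actually spinning forever.

```cpp
#include <atomic>

// Hypothetical non-recursive spin lock: it does not remember its owner, so a
// second acquisition by the same thread looks like contention from another one.
class SpinMutex {
 public:
  // Returns true if the lock was free and is now held by the caller.
  bool TryLock() { return !locked_.exchange(true, std::memory_order_acquire); }
  void Unlock() { locked_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> locked_{false};
};

// Acquires `mu`, then tries to acquire it again on the same thread.
// Returns true if the nested acquisition fails, i.e. a blocking Lock()
// at that point would spin forever because nobody else can release it.
bool NestedLockWouldSpin(SpinMutex &mu) {
  if (!mu.TryLock()) return true;    // already held by someone else
  bool nested_ok = mu.TryLock();     // same thread, second acquisition
  if (nested_ok) mu.Unlock();        // undo the nested acquisition if it succeeded
  mu.Unlock();                       // release the outer acquisition
  return !nested_ok;
}
```

In this sketch the nested attempt always fails; a blocking lock in the same position keeps yielding and retrying, which matches the hang described above.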