Closed ramosian-glider closed 6 years ago
http://llvm.org/viewvc/llvm-project?rev=182456&view=rev adds a workaround and a better
test.
Full fix may require a significant surgery, so I'd like to see if a simple thing
is enough.
Reported by konstantin.s.serebryany
on 2013-05-22 09:04:27
I've got a test case that gives a false-positive error around swapcontext:
"ERROR: AddressSanitizer: SEGV on unknown address 0x000000000"
When I blacklist the file making that call, the code then prints a warning referring
to this bug:
"WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: . . .
False positive error reports may follow
For details see http://code.google.com/p/address-sanitizer/issues/detail?id=189"
It's from the test suite of the Charm++ parallel runtime system (http://charmplusplus.org).
If test cases for this would be useful, I'd be happy to help in understanding that
code. If you want it reduced, I can probably do some, but it's a fairly large system
with a lot of cross-dependencies.
Reported by unmobile
on 2014-01-08 20:30:01
Does asan actually report false positives after the warning about swapcontext?
A minimized test is always welcome, but we can not promise that we'll fix it --
swapcontext is a really tricky beast.
Reported by konstantin.s.serebryany
on 2014-01-09 04:50:37
Note that there is generally little point in blacklisting the code that performs a NULL
dereference.
Reported by ramosian.glider
on 2014-01-10 10:39:22
Here is a false positive.
If you destroy a std::exception_ptr that was allocated on another stack without rethrowing
it, the program crashes.
GCC 4.9.2 (on Gentoo). Boost 1.56.0 compiled with C++11 support.
{{{
==26409==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fff0420b000;
bottom 0x63100000f000; size: 0x1cef041fc000 (31812891951104)
False positive error reports may follow
For details see http://code.google.com/p/address-sanitizer/issues/detail?id=189
=================================================================
==26409==ERROR: AddressSanitizer: stack-buffer-underflow on address 0x6310000104a0
at pc 0x7fd9fccdcde3 bp 0x631000010320 sp 0x63100000fac8
WRITE of size 240 at 0x6310000104a0 thread T0
#0 0x7fd9fccdcde2 (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x2fde2)
#1 0x7fd9fbe8b046 in _Unwind_Resume (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libgcc_s.so.1+0x10046)
#2 0x406dc9 in my_coroutine(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&)
(/tmp/a.out+0x406dc9)
#3 0x41e7f4 in boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>,
std::__exception_ptr::exception_ptr, void (&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&),
boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits>
>::run(std::__exception_ptr::exception_ptr*) (/tmp/a.out+0x41e7f4)
#4 0x41bb88 in void boost::coroutines::detail::trampoline_push<boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>,
std::__exception_ptr::exception_ptr, void (&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&),
boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits>
> >(long) (/tmp/a.out+0x41bb88)
#5 0x7fd9fc89e710 in make_fcontext (/usr/lib64/libboost_context-cxx11-gcc4_9_2.so.1.56.0+0x710)
0x6310000104a0 is located 64672 bytes inside of 65536-byte region [0x631000000800,0x631000010800)
allocated by thread T0 here:
#0 0x7fd9fcd04787 in malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x57787)
#1 0x414890 in boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits>::allocate(boost::coroutines::stack_context&,
unsigned long) (/tmp/a.out+0x414890)
#2 0x40d975 in boost::coroutines::push_coroutine<std::__exception_ptr::exception_ptr>::push_coroutine<void
(&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&)>(void
(&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&), boost::coroutines::attributes
const&) (/tmp/a.out+0x40d975)
#3 0x406ecf in main (/tmp/a.out+0x406ecf)
#4 0x7fd9fbaf8dc4 in __libc_start_main (/lib64/libc.so.6+0x24dc4)
}}}
Reported by vdavid@vizrt.com
on 2014-12-10 18:51:48
Reported by ramosian.glider
on 2015-07-30 09:05:31
Adding Project:AddressSanitizer as part of GitHub migration.
Reported by ramosian.glider
on 2015-07-30 09:06:55
I ran into this bug as well and made a test case. It's derived from the test suite in a fuzzer I'm writing, https://github.com/2trill2spill/nextgen . This was tested on Mac OSX 10.11.12, and below is the output from clang --version.
nahs-MBP:desktop nah$ clang --version
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.2.0
Thread model: posix
And the test case, which was compiled with clang -fsanitize=address -o test test.c.
#include <setjmp.h>
#include <stdio.h>
#include <string.h>
#include <signal.h>

static jmp_buf return_jump;

static void signal_handler(int sig)
{
    longjmp(return_jump, 1);
}

static void setup_test_sig_handler(void)
{
    struct sigaction sa;
    sigset_t ss;
    unsigned int i;
    for(i = 1; i < 512; i++)
    {
        (void)sigfillset(&ss);
        sa.sa_flags = SA_RESTART;
        sa.sa_handler = signal_handler;
        sa.sa_mask = ss;
        (void)sigaction((int)i, &sa, NULL);
    }
    return;
}

int main(void)
{
    int rtrn = setjmp(return_jump);
    if(rtrn < 0)
    {
        perror("setjmp");
        return (-1);
    }
    setup_test_sig_handler();
    /* Cause signal. */
    memmove(NULL, "123456789", 9);
    return (0);
}
2trill2spill, why is the previous comment related to this bug? The code does not even have a swapcontext call. Please open a separate bug explaining what exactly went wrong.
The error message I get from running the test case points to this page, so I assumed that it was the same issue.
ASAN:SIGSEGV
=================================================================
==8425==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000105ab39e2 bp 0x7fff5a1ab9a0 sp 0x7fff5a1ab128 T0)
#0 0x105ab39e1 in __sanitizer::internal_memmove(void*, void const*, unsigned long) (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x569e1)
#1 0x105a54a9e in main (/Users/nah/Desktop/./test+0x100000a9e)
#2 0x7fff95a7c5ac in start (/usr/lib/system/libdyld.dylib+0x35ac)
#3 0x0 (<unknown module>)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 __sanitizer::internal_memmove(void*, void const*, unsigned long)
==8425==ABORTING
==8425==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fff5a1ac000; bottom 0x000106b5d000; size: 0x7ffe5364f000 (140730297544704)
False positive error reports may follow
For details see http://code.google.com/p/address-sanitizer/issues/detail?id=189
ASAN:SIGSEGV
==8425==AddressSanitizer: while reporting a bug found another one. Ignoring.
I don't see such a message on Linux, so it might be an OS X-specific issue, unrelated to swapcontext. Please file a separate bug and we will discuss it there.
Can anyone explain how https://github.com/facebook/folly/commit/2ea64dd24946cbc9f3f4ac3f6c6b98a486c56e73 works? I could not find anything like "__asan_enter_fiber"?
Is there something like VALGRIND_STACK_REGISTER / VALGRIND_STACK_DEREGISTER available for address sanitizer?
Since there is nothing on the Internet about those functions, I guess that facebook has a fork of clang where they have implemented them.
I tried to implement the functions myself here, but I have very little knowledge of how things work, so feedback is welcome. Note that the function prototypes are a little different from those used by folly. I tested this code with an implementation of coroutines on top of boost context v2; the warning about handle_no_return has indeed disappeared and it seems to work.
Thank you! I will try your modified version as soon as possible. Do you just include the "asan_interface_internal.h" header in your binary, or do you use the approach from https://github.com/facebook/folly/commit/2ea64dd24946cbc9f3f4ac3f6c6b98a486c56e73 (dlsym)?
@pdziepak I only did some tests for the moment, where I just declared the functions in my project:
extern "C"
{
    void __asan_enter_fiber(void const* stack_top, void const* stack_bottom);
    void __asan_exit_fiber();
}
Of course this fails to link if I don't compile with asan. I think folly's approach (with the weak symbol attributes and the fallback on dlsym) should work too.
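The weak-symbol idea mentioned above can be sketched roughly as follows. This is only an illustration, not folly's code: it uses `__sanitizer_start_switch_fiber` / `__sanitizer_finish_switch_fiber`, the names that appear later in this thread, rather than the `__asan_enter_fiber` prototype, and it assumes the three-argument signatures shown in later comments plus GCC/clang `__attribute__((weak))` on Linux. The helper names `asan_fiber_api_available`, `notify_start_switch` and `notify_finish_switch` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Weak declarations: if the ASan runtime is not linked in, the binary still
// links and the address of each function is null.
extern "C" {
void __sanitizer_start_switch_fiber(void** fake_stack_save, const void* bottom,
                                    std::size_t size) __attribute__((weak));
void __sanitizer_finish_switch_fiber(void* fake_stack_save, const void** bottom_old,
                                     std::size_t* size_old) __attribute__((weak));
}

// Hypothetical helper: true only when the ASan runtime provides both hooks.
bool asan_fiber_api_available() {
    const bool has_start = &__sanitizer_start_switch_fiber != nullptr;
    const bool has_finish = &__sanitizer_finish_switch_fiber != nullptr;
    assert(has_start == has_finish);  // both come from the same runtime
    return has_start && has_finish;
}

// Hypothetical wrappers: no-ops in a build without ASan.
void notify_start_switch(void** fake_stack_save, const void* bottom, std::size_t size) {
    if (asan_fiber_api_available())
        __sanitizer_start_switch_fiber(fake_stack_save, bottom, size);
}

void notify_finish_switch(void* fake_stack_save, const void** bottom_old,
                          std::size_t* size_old) {
    if (asan_fiber_api_available())
        __sanitizer_finish_switch_fiber(fake_stack_save, bottom_old, size_old);
}
```

The wrappers are meant to be called only around a real stack switch; calling them without switching would confuse the runtime.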
FYI my patch finally passed review (a lot of things were fixed in the meantime), http://llvm.org/viewvc/llvm-project?view=revision&revision=273260 . Now let's wait for clang 3.9 :)
That's great news! Thank you very much for your effort! :+1:
I just ran into this issue. I see the solution is to notify asan if switching stacks? I'm implementing coroutines.
@ioquatix on gcc? Whoa!
I think gcc 7 and latest clang have better support for makecontext and friends.
Yes, the idea is to annotate your code to notify asan when you switch context.
You can find some documentation here https://github.com/llvm-mirror/compiler-rt/blob/master/include/sanitizer/common_interface_defs.h#L166 and an example in the tests https://github.com/llvm-mirror/compiler-rt/blob/master/test/asan/TestCases/Linux/swapcontext_annotation.cc .
Since the test is still there, I don't think swapcontext has got any more support for asan.
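For readers who want something smaller than the linked test, here is a minimal sketch of the annotation pattern, modelled on that test but heavily simplified. It assumes Linux with glibc's ucontext functions and the three-argument form of `__sanitizer_finish_switch_fiber`; the names `DEMO_ASAN`, `start_switch`, `finish_switch`, `fiber_entry` and `run_demo` are invented for the example. Without ASan the annotations compile to nothing.

```cpp
#include <ucontext.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Detect ASan under GCC (__SANITIZE_ADDRESS__) and clang (__has_feature).
#if defined(__SANITIZE_ADDRESS__)
#define DEMO_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define DEMO_ASAN 1
#endif
#endif
#ifdef DEMO_ASAN
#include <sanitizer/common_interface_defs.h>
#endif

static void start_switch(void** fake_stack_save, const void* bottom, std::size_t size) {
#ifdef DEMO_ASAN
    __sanitizer_start_switch_fiber(fake_stack_save, bottom, size);
#else
    (void)fake_stack_save; (void)bottom; (void)size;
#endif
}

static void finish_switch(void* fake_stack_save, const void** bottom_old,
                          std::size_t* size_old) {
#ifdef DEMO_ASAN
    __sanitizer_finish_switch_fiber(fake_stack_save, bottom_old, size_old);
#else
    (void)fake_stack_save; (void)bottom_old; (void)size_old;
#endif
}

static ucontext_t main_ctx, fiber_ctx;
static const void* main_stack_bottom = nullptr;  // filled in by ASan, if enabled
static std::size_t main_stack_size = 0;
static int steps = 0;

static void fiber_entry() {
    // First code on the new stack: complete the switch and learn where we came from.
    finish_switch(nullptr, &main_stack_bottom, &main_stack_size);
    ++steps;
    // The fiber never runs again, so pass nullptr instead of a save slot.
    start_switch(nullptr, main_stack_bottom, main_stack_size);
    swapcontext(&fiber_ctx, &main_ctx);
}

int run_demo() {
    std::vector<char> stack(256 * 1024);
    steps = 0;
    int rc = getcontext(&fiber_ctx);
    assert(rc == 0);
    (void)rc;
    fiber_ctx.uc_stack.ss_sp = stack.data();
    fiber_ctx.uc_stack.ss_size = stack.size();
    fiber_ctx.uc_link = &main_ctx;
    makecontext(&fiber_ctx, fiber_entry, 0);

    void* fake_stack_save = nullptr;
    // Describe the stack we are about to switch TO.
    start_switch(&fake_stack_save, stack.data(), stack.size());
    swapcontext(&main_ctx, &fiber_ctx);
    // Back on the original stack: restore its fake stack.
    finish_switch(fake_stack_save, nullptr, nullptr);
    ++steps;
    return steps;  // one step in the fiber, one after returning
}
```

The key point is that every switch is bracketed: `start` is called on the stack being left and describes the destination, and `finish` is the first thing executed on the destination stack.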
Cool- I'm not using makecontext/swapcontext but using ASM to switch stacks directly. I'll try out the annotations.
Okay, so I've tried to implement this and it appears to be compiling, but I'm having some issues.
First, the changes I made:
Fiber::resume, which swaps from main stack to fiber stack (and potentially nested fibers):
Fiber::yield, which swaps from fiber stack back to main stack (or potentially parent stack if nested):
cocall, which is the first function executed on the stack:
So, the order is always balanced, e.g. call resume, start stack, then in cocall, finish stack, then in yield, start stack, back to resume exit stack, finish.
Is this a reasonable implementation?
I wasn't entirely sure what I should be doing with all the stack pointers/sizes, I guess that start stack should be details of the stack you are transferring to, and finish stack should be the details of the stack you came from. However, what is the purpose of fake stack and how should I handle it given that fibers can transfer in a non-nested way?
Finally, even though this seems to work, I now get an error:
--- Concurrent::Fiber ---
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
[it should resume] 1 passed out of 1 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (yield)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
[it should yield] 2 passed out of 2 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
==14488==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x000000000000; bottom 0x7ffff98be000; size: 0xffff800006742000 (-140737380081664)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
[it should throw exceptions] 1 passed out of 1 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (yield)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
[it can be stopped] 4 passed out of 4 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
=================================================================
==14488==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fa0ccbfebc8 at pc 0x55baaadf659f bp 0x7fa0ccbfea20 sp 0x7fa0ccbfea18
WRITE of size 8 at 0x7fa0ccbfebc8 thread T0
#0 0x55baaadf659e in std::exception_ptr::exception_ptr() /usr/bin/../include/c++/v1/exception:143:59
#1 0x55baaadf659e in Concurrent::Fiber::Fiber<Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}>(Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}&&, unsigned long) include/Concurrent/Fiber.hpp:53
#2 0x55baaadf5941 in Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const concurrent/test/Concurrent/Test.Fiber.cpp:112:12
#3 0x55baaadf43d3 in Concurrent::Coentry<Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}>::cocall(void*) include/Concurrent/Fiber.hpp:169:4
#4 0x55baaae94396 in coro_init concurrent/source/Concurrent/coro.c:97:3
#5 0x7fa0cf71ed3f (/usr/lib/libc.so.6+0x35d3f)
Address 0x7fa0ccbfebc8 is located in stack of thread T0 at offset 104 in frame
#0 0x55baaadf55bf in Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const concurrent/test/Concurrent/Test.Fiber.cpp:109
This frame has 2 object(s):
[32, 144) 'inner' <== Memory access at offset 104 is inside this variable
[176, 184) 'ref.tmp'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /usr/bin/../include/c++/v1/exception:143:59 in std::exception_ptr::exception_ptr()
Shadow bytes around the buggy address:
0x0ff499977d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff499977d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff499977d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff499977d50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff499977d60: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
=>0x0ff499977d70: 00 00 00 00 00 00 00 00 00[f3]f3 f3 00 00 f2 f2
0x0ff499977d80: f2 f2 00 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
0x0ff499977d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff499977da0: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
0x0ff499977db0: 00 f2 f2 f2 00 f2 f2 f2 00 f3 f3 f3 00 00 00 00
0x0ff499977dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==14488==ABORTING
It appears as if exception_ptr of an exception thrown on another stack is not working? However, that stack is not deallocated yet, until after the exception is handled, AFAIK. I will review further but just wondering if anyone can give me feedback on my implementation.
Okay, I changed all fake_stack to nullptr and I no longer get any error, but I still get a warning:
==14488==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x000000000000; bottom 0x7ffff98be000; size: 0xffff800006742000 (-140737380081664)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
So, I guess I'm doing something a bit wrong. I'll read documentation a bit more. Any ideas appreciated.
Okay, so after reading the documentation and playing around a bit, I tried implementing it like so:
#if defined(VARIANT_SANITIZE)
void * fake_stack = nullptr;
__sanitizer_start_switch_fiber(&fake_stack, _stack.base(), _stack.allocated_size());
std::cerr << "__sanitizer_start_switch_fiber (resume, fake_stack=" << fake_stack << ")" << std::endl;
#endif
coro_transfer(&_caller->_context, &_context);
#if defined(VARIANT_SANITIZE)
std::cerr << "__sanitizer_finish_switch_fiber (resume, fake_stack=" << fake_stack << ")" << std::endl;
__sanitizer_finish_switch_fiber(fake_stack, nullptr, nullptr);
#endif
The first (odd?) thing is that the pointer is always null? Is it using the address of the pointer to mark the stack somehow?
Secondly, I followed the instructions regarding the last __sanitizer_start_switch_fiber having nullptr as the first argument. It seems to work as expected. Still, I'm not completely confident I understand how it should be used. The test code is a bit messy, it would be nice to have a simple annotated example, not a test designed to stress test asan.
Finally, I still couldn't get the warning to go away.
The first argument (the fake stack save) is only used when the fake stack is enabled.
To enable it, set the environment variable ASAN_OPTIONS=detect_stack_use_after_return=1 .
It should be unrelated to the warning you get. You get this warning when you use the API incorrectly, or when there is a bug in ASan...
I am not sure (since things have changed since my patch), but I think you need to put the stack of the coroutine you come from, here https://github.com/kurocha/concurrent/blob/6315ca4da220bdffec8fd292a04150a9eacea41d/source/Concurrent/Fiber.cpp#L74
EDIT: same for yield
Thanks for the useful information, I will try it out and report back.
Okay I managed to get it to work with no warnings.
Conceptually, I had to break the functions into 4 variations.
Once I did this, I could understand and reason about how they should be called from resume, yield, transfer and so on. It makes sense now.
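One way to picture the four variations is the sketch below. It is a guess, not the actual Concurrent code: the helper names are borrowed from the log output in this thread, while their bodies, the globals, and the `fiber_resume` / `fiber_yield` / `run_fiber_sketch` functions are assumptions made for illustration. It uses glibc ucontext on Linux instead of a hand-written context switch and compiles to plain switching when ASan is off.

```cpp
#include <ucontext.h>
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__SANITIZE_ADDRESS__)
#define SKETCH_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define SKETCH_ASAN 1
#endif
#endif
#ifdef SKETCH_ASAN
#include <sanitizer/common_interface_defs.h>
#endif

// Thin wrappers so the sketch also builds without ASan.
static void asan_start(void** save, const void* bottom, std::size_t size) {
#ifdef SKETCH_ASAN
    __sanitizer_start_switch_fiber(save, bottom, size);
#else
    (void)save; (void)bottom; (void)size;
#endif
}
static void asan_finish(void* save, const void** bottom_old, std::size_t* size_old) {
#ifdef SKETCH_ASAN
    __sanitizer_finish_switch_fiber(save, bottom_old, size_old);
#else
    (void)save; (void)bottom_old; (void)size_old;
#endif
}

static ucontext_t caller_ctx, fiber_ctx;
static const void* fiber_bottom = nullptr;   // stack allocated for the fiber
static std::size_t fiber_size = 0;
static const void* caller_bottom = nullptr;  // reported by ASan on entry to the fiber
static std::size_t caller_size = 0;
static void* caller_fake = nullptr;  // caller's fake stack while it is suspended
static void* fiber_fake = nullptr;   // fiber's fake stack while it is suspended
static std::vector<int> trace;

// 1. Caller side, before transferring into the fiber.
static void start_push_stack() { asan_start(&caller_fake, fiber_bottom, fiber_size); }
// 2. Fiber side, first thing after being entered or resumed.
static void finish_push_stack() { asan_finish(fiber_fake, &caller_bottom, &caller_size); }
// 3. Fiber side, before transferring back; terminating means it never runs again.
static void start_pop_stack(bool terminating) {
    asan_start(terminating ? nullptr : &fiber_fake, caller_bottom, caller_size);
}
// 4. Caller side, after control comes back.
static void finish_pop_stack() { asan_finish(caller_fake, nullptr, nullptr); }

static void fiber_resume() {
    start_push_stack();
    swapcontext(&caller_ctx, &fiber_ctx);
    finish_pop_stack();
}
static void fiber_yield() {
    start_pop_stack(false);
    swapcontext(&fiber_ctx, &caller_ctx);
    finish_push_stack();
}
static void cocall() {
    finish_push_stack();
    trace.push_back(1);
    fiber_yield();
    trace.push_back(2);
    start_pop_stack(true);
    swapcontext(&fiber_ctx, &caller_ctx);  // does not return
}

std::vector<int> run_fiber_sketch() {
    std::vector<char> memory(256 * 1024);
    trace.clear();
    caller_fake = fiber_fake = nullptr;
    fiber_bottom = memory.data();
    fiber_size = memory.size();
    int rc = getcontext(&fiber_ctx);
    assert(rc == 0);
    (void)rc;
    fiber_ctx.uc_stack.ss_sp = memory.data();
    fiber_ctx.uc_stack.ss_size = memory.size();
    fiber_ctx.uc_link = &caller_ctx;
    makecontext(&fiber_ctx, cocall, 0);
    fiber_resume(); trace.push_back(10);  // fiber runs until its yield
    fiber_resume(); trace.push_back(20);  // fiber runs to completion
    return trace;
}
```

In this sketch a yielding fiber hands ASan a slot for its fake stack so it can be restored on the next resume, while a terminating fiber passes nullptr so the fake stack can be released.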
So, I got the core code working without warnings/errors, but found some issue here:
/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests
--- Async::Protocol::Buffer ---
[it can append data] 2 passed out of 2 total
[it can read data from a file] 1 passed out of 1 total
[it can read data into non-contiguous buffer] 12 passed out of 12 total
[it can read data from a file in chunks] 3 passed out of 3 total
--- Async::Notification ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffef3e66000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
[it can notify the fiber to continue] 1 passed out of 1 total
--- Async::Writable ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffef3e66000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
Fiber::start_push_stack(resume, 0x7f78442fb000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f78442fb000, 4202496)
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
[it can wait for writing] 1 passed out of 1 total
--- Async::Readable ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffef3e66000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
Fiber::start_push_stack(resume, 0x7f78442fb000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f78442fb000, 4202496)
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
[it can wait for reading] 2 passed out of 2 total
--- Async::Job ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
=================================================================
==13117==AddressSanitizer CHECK failed: /build/llvm/src/llvm-4.0.1.src/projects/compiler-rt/lib/asan/asan_thread.cc:320 "((ptr[0] == kCurrentStackFrameMagic)) != (0)" (0x0, 0x0)
#0 0x55e6f67cb527 in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0527)
#1 0x55e6f67e7655 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1ec655)
#2 0x55e6f67d076a in __asan::AsanThread::GetStackFrameAccessByAddr(unsigned long, __asan::AsanThread::StackFrameAccess*) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d576a)
#3 0x55e6f6718148 in __asan::AddressDescription::AddressDescription(unsigned long, unsigned long, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11d148)
#4 0x55e6f671a8e0 in __asan::ErrorGeneric::ErrorGeneric(unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11f8e0)
#5 0x55e6f67cac9e in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1cfc9e)
#6 0x55e6f67cbe5b in __asan_report_store8 (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0e5b)
#7 0x55e6f688df37 in std::__1::function<void ()>::function<Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}, void>(Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}) /usr/bin/../include/c++/v1/functional:1763:7
#8 0x55e6f688d8b3 in Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const /home/samuel/Documents/kurocha/async/test/Async/Test.Job.cpp:32:23
#9 0x55e6f688c2ff in Concurrent::Coentry<Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}>::cocall(void*) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/Concurrent/Fiber.hpp:175:4
#10 0x55e6f69605d6 in coro_init /home/samuel/Documents/kurocha/async/teapot/packages/development/concurrent/source/Concurrent/coro.c:97:3
#11 0x7f7847da4d3f (/usr/lib/libc.so.6+0x35d3f)
Task #<TaskClassForAsyncTests_47339649637900:0x00561c3e102e20> failed: "Async-tests" exited with status 256
Task #<TaskClassForAsyncTests_47339649637900:0x00561c3e1c0c90> failed: Children tasks failed!
Task #<TaskClassForAsyncTests_47339649637900:0x00561c3f58c058> failed: Children tasks failed!
It's also.. a little bit odd.. in that if I only run that test, it fails a bit later:
/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests Async::Job
--- Async::Job ---
Fiber::start_push_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffdf69a5000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffdf69a5000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::start_push_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffdf69a5000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffdf69a5000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7fb119ef6000, 4202496)
[it can wait for result] 1 passed out of 1 total
Fiber::start_push_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffdf69a5000, 8388608)
=================================================================
==13215==AddressSanitizer CHECK failed: /build/llvm/src/llvm-4.0.1.src/projects/compiler-rt/lib/asan/asan_thread.cc:320 "((ptr[0] == kCurrentStackFrameMagic)) != (0)" (0x0, 0x0)
#0 0x55c23ae6d527 in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0527)
#1 0x55c23ae89655 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1ec655)
#2 0x55c23ae7276a in __asan::AsanThread::GetStackFrameAccessByAddr(unsigned long, __asan::AsanThread::StackFrameAccess*) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d576a)
#3 0x55c23adba148 in __asan::AddressDescription::AddressDescription(unsigned long, unsigned long, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11d148)
#4 0x55c23adbc8e0 in __asan::ErrorGeneric::ErrorGeneric(unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11f8e0)
#5 0x55c23ae6cc9e in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1cfc9e)
#6 0x55c23ae6dcab in __asan_report_store1 (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0cab)
#7 0x55c23af4bb06 in UnitTest::Expectation<UnitTest::Examiner, Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2}>::Expectation(UnitTest::Examiner&, {lambda()#1} const&, bool) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/UnitTest/Expectation.hpp:22:120
#8 0x55c23af43ea4 in UnitTest::Expectation<UnitTest::Examiner, Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2}> UnitTest::Examiner::expect<Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2}>(Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2} const&) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/UnitTest/UnitTest.hpp:53:11
#9 0x55c23af428cd in Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const /home/samuel/Documents/kurocha/async/test/Async/Test.Job.cpp:63:15
#10 0x55c23af4102f in Concurrent::Coentry<Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}>::cocall(void*) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/Concurrent/Fiber.hpp:175:4
#11 0x55c23b0025d6 in coro_init /home/samuel/Documents/kurocha/async/teapot/packages/development/concurrent/source/Concurrent/coro.c:97:3
#12 0x7fb120e3ed3f (/usr/lib/libc.so.6+0x35d3f)
Task #<TaskClassForAsyncTests_47187663006060:0x0055d5794d4ff0> failed: "Async-tests" exited with status 256
Task #<TaskClassForAsyncTests_47187663006060:0x0055d5792c1ab0> failed: Children tasks failed!
Task #<TaskClassForAsyncTests_47187663006060:0x0055d5792c2dc0> failed: Children tasks failed!
These tests check that a fiber adding a job to a thread pool works as expected. The tests pass without sanity checks.
The only thing I can think of, is that between tests sometimes stacks are allocated at the same address. Perhaps there is something left over from a previous invocation that's causing it to fail?
So, I checked, and individually the tests work fine.
Okay, I updated from clang 4.x to 5.x and the problem is... gone.
@kcc: Seems that we have a workaround now with fiber annotations. Can we close this?
let's close. If anyone sees a remaining problem, please open a new bug with new details.
Hi guys.
Should the custom swapcontext() be somehow annotated to asan? I've got asan working by using the glibc's swapcontext() and __sanitizer_start_switch_fiber __sanitizer_finish_switch_fiber annotations. But when using the custom, asm-written swapcontext()-alike function, I can't get things to work even with the same switch_fiber annotations. So should I somehow also annotate the custom swapcontext function?
It crashes in a function epilogue that looks like this:
0x00005555560f49af <+1191>: je 0x5555560f49d2 <co_switch_context+1226>
0x00005555560f49b1 <+1193>: movq $0x45e0360e,(%rbx)
0x00005555560f49b8 <+1200>: movabs $0xf5f5f5f5f5f5f5f5,%rax
0x00005555560f49c2 <+1210>: mov %rax,0x7fff8000(%r14)
0x00005555560f49c9 <+1217>: mov 0x38(%rbx),%rax
=> 0x00005555560f49cd <+1221>: movb $0x0,(%rax)
rax==0 here. I don't understand what this epilogue code does and why it crashes only with the custom swapcontext().
https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/asan/asan_interceptors.cpp#L243 Obviously asan intercepts swapcontext() and the other *context functions. So it seems like there is no way to use the custom swapcontext() with asan?
Just for reference, this is how I implemented it: https://github.com/kurocha/concurrent/blob/6eee988ba7263f017a8d74560afde2f0396c1370/source/Concurrent/Fiber.cpp#L46-L70
Asan support is working for us like this: https://github.com/motis-project/ctx/blob/master/include/ctx/impl/operation.h#L51-L86
We're using this in combination with deboost.context.
Thanks, deboost.context indeed looks like it uses its own asm for context switching, and yet it works for you with asan with just the basic *_switch_fiber() annotations... Interesting.
As for "concurrent" project mentioned by @ioquatix - I can't find the custom context switching primitives there.
https://github.com/septag/deboost.context/blob/master/asm/jump_x86_64_ms_pe_gas.asm#L164
movq %gs:(0x30), %r10
/* restore fiber local storage */
movq 0xb0(%rsp), %rax
movq %rax, 0x20(%r10)
/* restore current deallocation stack */
movq 0xb8(%rsp), %rax
movq %rax, 0x1478(%r10)
/* restore current stack limit */
movq 0xc0(%rsp), %rax
movq %rax, 0x10(%r10)
/* restore current stack base */
movq 0xc8(%rsp), %rax
movq %rax, 0x08(%r10)
@felixguendling - is this code snippet written specifically for asan? Or for some other purpose?
https://github.com/orgs/kurocha/repositories?q=coroutine&type=all&language=&sort= for all native implementations.
Thanks! It's very simplistic: https://github.com/kurocha/coroutine-amd64/blob/master/source/Coroutine/Context.s It just pushes a few regs on the stack. And yet it works with asan... Then perhaps I need to find a problem in the context-switching code I took from libtask...
Coroutine transfer is a simple operation, it's a function call and return with a stack swap in the middle. Any implementation that makes it more complicated than that is wrong. IMHO :)
You are right. :) And still some guys (like myself) can shoot their feet even here. I had a Cish wrapper around asm getcontext, and it wasn't marked with always_inline attribute. As the result, it was saving its own stack frame to the context struct... Your example, being that simple, helped me to realize the stupidity. I wonder why it never broke w/o asan...
Hi everyone. @ioquatix, I ran into an issue when I tried to enable ASan in a C program that uses the swapcontext function. Would you be able to provide some advice?
The following are the details: Build server: gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
With the help of __sanitizer_start_switch_fiber/__sanitizer_finish_switch_fiber, my program works fine with ASan turned on when jumping from one coroutine to another (like main thread to coroutine func2, or coroutine func2 to coroutine func1). But as soon as I try to restore an old coroutine by jumping back to it, ASan stops working (a stack-buffer-overflow can no longer be detected).
And I noticed that the argument func2_thread_stack below never gets set:
void *func2_thread_stack = NULL;
__sanitizer_start_switch_fiber(&func2_thread_stack,uctx_func1.uc_stack.ss_sp,uctx_func1.uc_stack.ss_size);
……… (swap from func2 to func1, and then func1 swap back func2)
__sanitizer_finish_switch_fiber(func2_thread_stack, &from_stack, &from_stacksize);
Could you give me some advice?
Sincerely,
I want to briefly share my experience with boost-asio and asan, for anyone who is where I was three days ago.
If your problem is "I need to make asan and boost-asio coroutines get along", then you are in for a treat.
I put some pieces together, and it is essentially a no-brainer: boost-asio has no incentive to move to coroutine2 (https://github.com/chriskohlhoff/asio/issues/603), but thanks to https://github.com/cbodley/spawn (a stand-alone header-only library of the latter PR) it can work.
After switching all boost::asio::spawn calls to spawn::spawn, you are through the tedious part.
Obviously you need to build boost with asan support, as mentioned here: https://github.com/boostorg/coroutine/issues/30#issuecomment-325583085
context-impl=ucontext -DBOOST_USE_ASAN -DBOOST_USE_UCONTEXT
And the simplest part: build your project with the -DBOOST_USE_ASAN -DBOOST_USE_UCONTEXT flags.
Voila.
Thanks to @ioquatix, the compatibility issue between ASAN and swapcontext is solved in my program. He was very friendly and very patient, and he taught me a lot about asan. I have compiled some of the lessons he taught me so that more people like me can learn from them:
Compatibility issue between ASAN and swapcontext()
1. Phenomenon
ASAN does not fully support swapcontext, as asan itself indicates in its log:
==1000==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
Under this constraint, if swapcontext() is used in your program, some false positives will be reported after a coroutine switch. The detection capability of ASAN becomes almost ineffective, and it can even seriously affect the normal operation of the program.
2. Mechanism of asan
To solve this problem, we need to understand why these false positives occur, and for that we need to learn how asan works: ASAN needs to allocate and store a shadow stack for each fiber to track usage. You should also poison the stack when it's no longer in use (e.g. if you track a high water mark, or completely free it).
3. Way to make swapcontext() compatible with asan
Note: The flag 'ASAN_OPTIONS=detect_stack_use_after_return=true' is necessary when the swapcontext() function is used in your program.
Therefore, we need to find a way to notify ASAN before and after we switch fibers.
To make things easier, I recommend adding a fake_stack pointer for every fiber when ASAN is enabled.
For this fake_stack:
When we jump to a new (target) coroutine by executing swapcontext(), we need to store the fake_stack of the old (current) fiber, so that when we return to the old fiber we can restore its stack with the fake_stack we stored earlier.
Here are the two functions provided by ASAN to manage the fake_stack:
// Fiber annotation interface.
// Before switching to a different stack, one must call
// __sanitizer_start_switch_fiber with a pointer to the bottom of the
// destination stack and its size. When code starts running on the new stack,
// it must call __sanitizer_finish_switch_fiber to finalize the switch.
// The start_switch function takes a void** to store the current fake stack if
// there is one (it is needed when detect_stack_use_after_return is enabled).
// When restoring a stack, this pointer must be given to the finish_switch
// function. In most cases, this void* can be stored on the stack just before
// switching. When leaving a fiber definitely, null must be passed as first
// argument to the start_switch function so that the fake stack is destroyed.
// If you do not want support for stack use-after-return detection, you can
// always pass null to these two functions.
// Note that the fake stack mechanism is disabled during fiber switch, so if a
// signal callback runs during the switch, it will not benefit from the stack
// use-after-return detection.
void __sanitizer_start_switch_fiber(void **fake_stack_save,
const void *bottom, size_t size);
void __sanitizer_finish_switch_fiber(void *fake_stack_save,
const void **bottom_old,
size_t *size_old);
The implementation of these two functions is here: https://github.com/llvm/llvm-project/blob/a2ef44a5d65932c7bb0f483217826856325b60df/compiler-rt/lib/asan/asan_thread.cpp#L526-L551
From the source code, we can see that __sanitizer_start_switch_fiber will assign the fake_stack IF and ONLY IF you provide a pointer.
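A minimal sketch of how the two annotation calls could be wrapped so that the same code builds with and without ASan. The helper names (FIBER_HAS_ASAN, fiber_asan_enabled, fiber_asan_start, fiber_asan_finish) are hypothetical and not part of any library; only the two __sanitizer_* prototypes come from the interface quoted above. In the non-ASan build the start helper writes NULL into the save slot, mirroring the "no fake stack" case.

```c
#include <stddef.h>

/* Detect ASan: GCC defines __SANITIZE_ADDRESS__, Clang exposes __has_feature. */
#if defined(__SANITIZE_ADDRESS__)
#  define FIBER_HAS_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define FIBER_HAS_ASAN 1
#  endif
#endif
#ifndef FIBER_HAS_ASAN
#  define FIBER_HAS_ASAN 0
#endif

#if FIBER_HAS_ASAN
/* Prototypes as quoted above from the sanitizer interface. */
void __sanitizer_start_switch_fiber(void **fake_stack_save,
                                    const void *bottom, size_t size);
void __sanitizer_finish_switch_fiber(void *fake_stack_save,
                                     const void **bottom_old,
                                     size_t *size_old);
#endif

/* Hypothetical helper: 1 when this translation unit is built with ASan. */
static inline int fiber_asan_enabled(void) { return FIBER_HAS_ASAN; }

/* Hypothetical helper: call right before switching to another stack.
 * Pass save == NULL only when the current fiber will never be resumed. */
static inline void fiber_asan_start(void **save, const void *bottom,
                                    size_t size) {
#if FIBER_HAS_ASAN
    __sanitizer_start_switch_fiber(save, bottom, size);
#else
    (void)bottom;
    (void)size;
    if (save) *save = NULL; /* no fake stack exists without ASan */
#endif
}

/* Hypothetical helper: call first thing after arriving on a stack.
 * `saved` is the pointer stored by the matching fiber_asan_start,
 * or NULL on the very first entry into a fiber. */
static inline void fiber_asan_finish(void *saved, const void **bottom_old,
                                     size_t *size_old) {
#if FIBER_HAS_ASAN
    __sanitizer_finish_switch_fiber(saved, bottom_old, size_old);
#else
    (void)saved;
    (void)bottom_old;
    (void)size_old;
#endif
}
```

Keeping the calls behind one pair of helpers means every switch site must go through them, which makes it harder to forget one half of the start/finish pair.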
This is how I handle the swapcontext() issue:
//vthctx: Context of main fiber/coroutine
//vth: Context of fiber/coroutine A
Step1: Try to exchange from main fiber to fiber A =========================================================================:
//On the main fiber.
//Argument0: Where asan stores the fake_stack of the current fiber.
// - If we want to keep the current fiber alive (we are going to jump back later), a valid pointer (&vthctx->fake_stack here) must be passed as argument 0 to store the fake_stack of the current fiber;
// - If we don't want to keep the current fiber alive (we won't jump back), 'NULL' must be passed as argument 0 to tell asan to delete the fake_stack of the current fiber.
//Argument1: The stack bottom of the target fiber we are going to jump to.
//Argument2: The stack size of the target fiber we are going to jump to.
__sanitizer_start_switch_fiber(&vthctx->fake_stack, vth->uctx.uc_stack.ss_sp, vth->uctx.uc_stack.ss_size);
//exchange to target fiber A.
swapcontext(&vthctx->tmp_outer_uctx, &vth->uctx);
Step2: On the trigger function of fiber A =========================================================================:
//On the fiber A
const void *from_stack;
size_t from_stacksize;
//Argument0: This is the first time we jump into this fiber, so NULL must be passed as argument 0;
// - Passing 'NULL' as argument 0 means that we have no saved fake_stack to restore for this fiber;
// - If we have been in this fiber before and have a saved fake_stack to restore, pass that saved fake_stack as argument 0.
//Argument1: Where asan returns the stack bottom of the old fiber we were in before the jump.
//Argument2: Where asan returns the stack size of the old fiber we were in before the jump.
__sanitizer_finish_switch_fiber(NULL, &from_stack, &from_stacksize);
Step3: jump back from fiber A to main fiber=========================================================================:
//Argument0: Where to store the fake_stack of the current fiber (fiber A) before jumping out.
// - Pass 'NULL' as argument 0 if we won't jump back to fiber A; asan will then delete the fake_stack of fiber A for us.
// - Pass '&vth->fake_stack' as argument 0 if we plan to keep fiber A alive and jump back to it in the future; asan will then keep the fake_stack of fiber A for us.
//Argument1: The stack bottom of the target fiber we are going to jump to.
//Argument2: The stack size of the target fiber we are going to jump to.
__sanitizer_start_switch_fiber(NULL, vthctx->tmp_outer_uctx.uc_stack.ss_sp, vthctx->tmp_outer_uctx.uc_stack.ss_size);
//exchange to main fiber.
swapcontext(&vth->uctx, &vthctx->tmp_outer_uctx);
Step4: Restore the fake_stack on main fiber =========================================================================:
//At the point of the main fiber we're jumping back to
//Argument0: The fake_stack stored before (see Step1), which is the one we restore for this fiber.
//Argument1: Where asan returns the stack bottom of the old fiber we were in before the jump.
//Argument2: Where asan returns the stack size of the old fiber we were in before the jump.
__sanitizer_finish_switch_fiber(vthctx->fake_stack, &from_stack, &from_stacksize);
ASAN only cares about tracking the stack swapping, so as long as you wrap the stack exchange operation (coroutine transfer) correctly, ASAN should work well with swapcontext().
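The four steps above can be put together in one small program. This is only a sketch under some assumptions: fiber_t, switch_to, fiber_a_entry and run_fiber_demo are made-up names, the annotations compile to nothing when ASan is off, and the main stack's bounds are taken from the bottom_old/size_old outputs of the first finish call rather than from uc_stack. The sketch never passes NULL as the first start argument, because fiber A is only ever suspended, never exited.

```c
#include <stddef.h>
#include <stdlib.h>
#include <ucontext.h>

#if defined(__SANITIZE_ADDRESS__)
#  define HAVE_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define HAVE_ASAN 1
#  endif
#endif

#ifdef HAVE_ASAN
void __sanitizer_start_switch_fiber(void **fake_stack_save,
                                    const void *bottom, size_t size);
void __sanitizer_finish_switch_fiber(void *fake_stack_save,
                                     const void **bottom_old, size_t *size_old);
#  define ASAN_START(s, b, n)  __sanitizer_start_switch_fiber((s), (b), (n))
#  define ASAN_FINISH(s, b, n) __sanitizer_finish_switch_fiber((s), (b), (n))
#else /* without ASan the annotations are no-ops */
#  define ASAN_START(s, b, n)  ((void)(s), (void)(b), (void)(n))
#  define ASAN_FINISH(s, b, n) ((void)(s), (void)(b), (void)(n))
#endif

typedef struct {
    ucontext_t uctx;
    void *fake_stack;         /* saved by ASan while the fiber is suspended */
    const void *stack_bottom; /* lowest address of the fiber's stack */
    size_t stack_size;
} fiber_t;

static fiber_t main_fiber, fiber_a;
static int counter;

/* Suspend `from` (keeping it alive) and run `to`. */
static void switch_to(fiber_t *from, fiber_t *to) {
    ASAN_START(&from->fake_stack, to->stack_bottom, to->stack_size);
    swapcontext(&from->uctx, &to->uctx);
    /* Execution continues here once someone switches back to `from`. */
    const void *old_bottom = NULL;
    size_t old_size = 0;
    ASAN_FINISH(from->fake_stack, &old_bottom, &old_size);
    (void)old_bottom;
    (void)old_size;
}

static void fiber_a_entry(void) {
    /* First entry: nothing to restore, so pass NULL. The outputs describe
     * the stack we came from, i.e. the main fiber's stack. */
    ASAN_FINISH(NULL, &main_fiber.stack_bottom, &main_fiber.stack_size);
    counter += 1;
    switch_to(&fiber_a, &main_fiber); /* yield, expecting to be resumed */
    counter += 10;
    for (;;)                          /* never return from the entry point */
        switch_to(&fiber_a, &main_fiber);
}

/* Ping-pongs between main and fiber A; returns the final counter value. */
int run_fiber_demo(void) {
    enum { STACK_SIZE = 256 * 1024 };
    void *stack = malloc(STACK_SIZE);
    if (stack == NULL || getcontext(&fiber_a.uctx) != 0) {
        free(stack);
        return -1;
    }
    counter = 0;
    main_fiber.fake_stack = fiber_a.fake_stack = NULL;
    fiber_a.uctx.uc_stack.ss_sp = stack;
    fiber_a.uctx.uc_stack.ss_size = STACK_SIZE;
    fiber_a.uctx.uc_link = NULL;
    fiber_a.stack_bottom = stack;
    fiber_a.stack_size = STACK_SIZE;
    makecontext(&fiber_a.uctx, fiber_a_entry, 0);

    switch_to(&main_fiber, &fiber_a); /* A runs until its first yield */
    counter += 100;
    switch_to(&main_fiber, &fiber_a); /* jump back into the old fiber */
    free(stack);                      /* A stays suspended forever */
    return counter;
}
```

Calling run_fiber_demo() from main() returns 111: fiber A adds 1 before its first yield, main adds 100, and A adds 10 after being resumed. The second switch_to call is the "jump back to an old fiber" case, where the saved fake_stack of fiber A is handed back to the finish call.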
I wonder if the above comment can be added to the Wiki?
We also encountered the problem: libasan hangs in pthread_create() and never returns (it sometimes hangs, but not always).
Stack trace with symbols:
Thread 2 (LWP 1236467):
#0 __sanitizer::atomic_exchange<__sanitizer::atomic_uint32_t> (mo=__sanitizer::memory_order_acquire, v=2, a=0x640000001b00)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_atomic_clang.h:61
#1 __sanitizer::BlockingMutex::Lock (this=this@entry=0x640000001b00) at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_linux.cc:618
#2 0x00007f0c9958f37d in __sanitizer::GenericScopedLock<__sanitizer::BlockingMutex>::GenericScopedLock (mu=0x640000001b00, this=<synthetic pointer>)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_mutex.h:183
#3 __sanitizer::SizeClassAllocator64<__asan::AP64>::GetFromAllocator (this=this@entry=0x7f0c996c7e40 <__asan::instance>, stat=stat@entry=0x7f0c7c67bc40, class_id=class_id@entry=36,
chunks=chunks@entry=0x7f0c7c677330, n_chunks=n_chunks@entry=8)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_allocator_primary64.h:126
#4 0x00007f0c9958f4ac in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__asan::AP64> >::Refill (this=this@entry=0x7f0c7c66e0e0, c=c@entry=0x7f0c7c677320,
allocator=allocator@entry=0x7f0c996c7e40 <__asan::instance>, class_id=class_id@entry=36)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_allocator_local_cache.h:105
#5 0x00007f0c99593a78 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__asan::AP64> >::Allocate (class_id=36, allocator=0x7f0c996c7e40 <__asan::instance>,
this=0x7f0c7c66e0e0) at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_common.h:439
#6 __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__asan::AP64>, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator64<__asan::AP64> >, __sanitizer::LargeMmapAllocator<__asan::AsanMapUnmapCallback, __sanitizer::ReturnNullOrDieOnFailure> >::Allocate (alignment=1, size=8192, cache=0x7f0c7c66e0e0, this=0x7f0c996c7e40 <__asan::instance>)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_allocator_combined.h:60
#7 __asan::QuarantineCallback::Allocate (size=8192, this=<synthetic pointer>) at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_allocator.cc:163
#8 __sanitizer::QuarantineCache<__asan::QuarantineCallback>::Enqueue (size=32, ptr=0x60300013a7f0, cb=..., this=0x7f0c7c66e060)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_quarantine.h:212
#9 __sanitizer::Quarantine<__asan::QuarantineCallback, __asan::AsanChunk>::Put (size=32, ptr=0x60300013a7f0, cb=..., c=0x7f0c7c66e060, this=0x7f0c998c80b8 <__asan::instance+2097784>)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_quarantine.h:102
#10 __asan::Allocator::QuarantineChunk (stack=0x60300013a800, ptr=0x60300013a800, m=0x60300013a7f0, this=0x7f0c996c7e40 <__asan::instance>)
at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_allocator.cc:564
#11 __asan::Allocator::Deallocate (this=this@entry=0x7f0c996c7e40 <__asan::instance>, ptr=ptr@entry=0x60300013a800, delete_size=delete_size@entry=0, stack=stack@entry=0x7f0c7ce83c70,
alloc_type=alloc_type@entry=__asan::FROM_MALLOC) at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_allocator.cc:609
#12 0x00007f0c9958e657 in __asan::asan_free (ptr=ptr@entry=0x60300013a800, stack=stack@entry=0x7f0c7ce83c70, alloc_type=alloc_type@entry=__asan::FROM_MALLOC)
at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_allocator.cc:803
#13 0x00007f0c9964ffdb in __interceptor_free (ptr=0x60300013a800) at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_malloc_linux.cc:69
#14 0x00007f0c98d254bd in __pthread_attr_destroy (attr=attr@entry=0x7f0c7ce84510) at pthread_attr_destroy.c:38
#15 0x00007f0c9966c892 in __sanitizer::GetThreadStackTopAndBottom (at_initialization=at_initialization@entry=false, stack_top=stack_top@entry=0x7f0c7ce845a0,
stack_bottom=stack_bottom@entry=0x7f0c7ce845a8) at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc:110
#16 0x00007f0c9966cbf3 in __sanitizer::GetThreadStackAndTls (main=<optimized out>, stk_addr=stk_addr@entry=0x7f0c7c66e020, stk_size=stk_size@entry=0x7f0c7ce845f8,
tls_addr=tls_addr@entry=0x7f0c7c66e040, tls_size=tls_size@entry=0x7f0c7ce845f0)
at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc:415
#17 0x00007f0c9965e7bf in __asan::AsanThread::SetThreadStackAndTls (this=this@entry=0x7f0c7c66e000, options=<optimized out>)
at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_thread.h:80
#18 0x00007f0c9965ea31 in __asan::AsanThread::Init (this=this@entry=0x7f0c7c66e000, options=options@entry=0x0)
at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_thread.cc:224
#19 0x00007f0c9965ee34 in __asan::AsanThread::ThreadStart (this=0x7f0c7c66e000, os_id=1236467, signal_thread_is_registered=0x7f0c830bfda8)
at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_thread.cc:241
#20 0x00007f0c98d23c79 in start_thread (arg=0x7f0c7ce8a700) at pthread_create.c:486
#21 0x00007f0c986d7a4f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 1 (LWP 1236463):
#0 __sanitizer::internal_sched_yield () at /gcc8_x86_64/src/gcc/libsanitizer/sanitizer_common/sanitizer_syscall_linux_x86_64.inc:18
#1 0x00007f0c995b7f45 in __interceptor_pthread_create (thread=thread@entry=0x60f0000000e0, attr=<optimized out>, attr@entry=0x0,
start_routine=start_routine@entry=0x3502c00 <bvar::detail::SamplerCollector::sampling_thread(void*)>, arg=arg@entry=0x60f000000040)
at /gcc8_x86_64/src/gcc/libsanitizer/asan/asan_interceptors.cc:242
#2 0x00000000035021d3 in bvar::detail::SamplerCollector::create_sampling_thread (this=0x60f000000040)
at code/third_party/submodule/brpc/src/bvar/detail/sampler.cpp:104
#3 bvar::detail::SamplerCollector::after_forked_as_child (this=0x60f000000040) at code/third_party/submodule/brpc/src/bvar/detail/sampler.cpp:104
#4 bvar::detail::SamplerCollector::child_callback_atfork () at code/third_party/submodule/brpc/src/bvar/detail/sampler.cpp:86
#5 0x00007f0c986e4cf8 in __run_fork_handlers (who=who@entry=atfork_run_child) at register-atfork.c:134
#6 0x00007f0c986a672d in __libc_fork () at ../sysdeps/nptl/fork.c:137
#7 0x00007f0c98652824 in _IO_new_proc_open (fp=fp@entry=0x6110000c1100, command=command@entry=0x603000131380 "/usr/bin/chronyc tracking 2>&1", mode=<optimized out>,
mode@entry=0x4414e60 "r") at iopopen.c:122
#8 0x00007f0c98652ac8 in _IO_new_popen (command=0x603000131380 "/usr/bin/chronyc tracking 2>&1", mode=0x4414e60 "r") at iopopen.c:203
#9 0x0000000002e24af2 in bytebase::common::CommandRunner::Exec (command=...) at code/src/common/command_runner.cc:13
#18 0x0000000002d5aa2f in make_pcontext ()
#19 0x0000000000000000 in ?? ()
Looks similar to this issue: https://github.com/google/sanitizers/issues/945
It seems like these days detect_stack_use_after_return breaks fiber switching.
When detect_stack_use_after_return is enabled, asan malloc's the "fake" stack and puts the locals there together with redzones. That process is (partially) documented here:
https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterReturn
That very same stack ptr is put into the first argument of __sanitizer_start_switch_fiber().
If detect_stack_use_after_return is disabled, then the "fake stack" machinery is not used, so __sanitizer_start_switch_fiber() always puts NULL into its first arg.
Now the problem is, the fake-stack is per-thread, not per-fiber. When some fiber exits, we put NULL into the first arg of __sanitizer_start_switch_fiber(), and that unmaps the entire per-thread fake-stack:
https://gnu.googlesource.com/gcc/+/refs/heads/trunk/libsanitizer/asan/asan_thread.cpp#166
Which, as noted above, contains current locals and redzones. So everything crashes.
Probably __sanitizer_start_switch_fiber() should allocate and free its own fake stacks, and not touch the per-thread one?
I opened #1760 for that problem.
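Since the fake-stack machinery is only used when detect_stack_use_after_return is enabled, one possible stopgap (an assumption on my side, not something proposed in this thread) is to switch that check off for the whole process. Setting ASAN_OPTIONS=detect_stack_use_after_return=0 in the environment does that; the sketch below bakes the same default into the binary through the __asan_default_options hook that the ASan runtime looks up at startup. The cost is that stack use-after-return bugs are no longer detected at all.

```c
/* Sketch: default ASan options compiled into the program.
 * The ASan runtime calls this hook, if the program defines it, before
 * parsing ASAN_OPTIONS; values in the environment still take precedence.
 * In a build without ASan this is just an unused ordinary function. */
const char *__asan_default_options(void) {
    return "detect_stack_use_after_return=0";
}
```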
Originally reported on Google Code with ID 189
Reported by
konstantin.s.serebryany
on 2013-05-22 07:40:59