apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

fdb_c_unit_test failed on ARM #6753

Closed by sfc-gh-xwang 2 years ago

sfc-gh-xwang commented 2 years ago

commit: 26e34a947f9ab85c53c4b3be43d0564b6fbf182b
compiler: clang13
Error info:

../bindings/c/test/unit/unit_tests.cpp:2401: FATAL ERROR: test case CRASHED: SIGSEGV - Segmentation violation signal

@sfc-gh-anoyes has investigated this bug. ~The thing is that malloc is returning null, which results in the segmentation fault~ [1]. According to GDB, it's segfaulting on this instruction:

=> 0x0000ffff952038a8 <+76>:    ldr     x9, [x20, #24]

[1]: sfc-gh-anoyes: I was mistaken about this part actually

sfc-gh-anoyes commented 2 years ago

#0  estimatedTotalSize (this=<optimized out>) at /home/anoyes/foundationdb/flow/Arena.cpp:205
#1  makeReference (this=<optimized out>, next=<optimized out>) at /home/anoyes/foundationdb/flow/Arena.cpp:254
#2  ArenaBlock::create(int, Reference<ArenaBlock>&) (dataSize=<optimized out>, dataSize@entry=8, next=...) at /home/anoyes/foundationdb/flow/Arena.cpp:398
#3  0x0000ffff9b59935c in ArenaBlock::allocate(Reference<ArenaBlock>&, int) (self=..., bytes=8) at /home/anoyes/foundationdb/flow/Arena.cpp:293
#4  0x0000ffff9b3ae344 in operator new[] (size=18446462597048025212, p=...) at /home/anoyes/foundationdb/flow/Arena.h:202
#5  StringRef (p=..., toCopy=..., this=<optimized out>) at /home/anoyes/foundationdb/flow/Arena.h:440
#6  ReadYourWritesTransaction::set(StringRef const&, StringRef const&) (this=0xffff8c0008c0, key=..., value=...) at /home/anoyes/foundationdb/fdbclient/ReadYourWrites.actor.cpp:2228
#7  0x0000ffff9b4a3bd4 in operator() (this=0xffff99421528) at /home/anoyes/foundationdb/fdbclient/ThreadSafeTransaction.cpp:377
#8  a_body1cont1 (this=0xffff99421520, loopDepth=0, _=...) at /home/anoyes/foundationdb/flow/ThreadHelper.actor.h:47
#9  internal_thread_helper::DoOnMainThreadVoidActorState<ThreadSafeTransaction::set(StringRef const&, StringRef const&)::$_27, internal_thread_helper::DoOnMainThreadVoidActor<ThreadSafeTransaction::set(StringRef const&, StringRef const&)::$_27> >::a_body1when1(Void const&, int) (this=this@entry=0xffff99421520, _=..., loopDepth=0) at flow/ThreadHelper.actor.g.h:150
#10 0x0000ffff9b4a3928 in a_callback_fire (this=0xffff99421520, value=...) at flow/ThreadHelper.actor.g.h:171
#11 ActorCallback<internal_thread_helper::DoOnMainThreadVoidActor<ThreadSafeTransaction::set(StringRef const&, StringRef const&)::$_27>, 0, Void>::fire(Void const&) (this=0xffff99421500, value=...) at /home/anoyes/foundationdb/flow/flow.h:1318
#12 0x0000ffff9ae0febc in SAV<Void>::send<Void>(Void&&) (this=0xffff994045c0, value=<optimized out>) at /home/anoyes/foundationdb/flow/flow.h:660
#13 0x0000ffff9b5e5bac in send<Void> (this=0xffff994061c8, value=<unknown type in /home/anoyes/build/lib/libfdb_c.so, CU 0x30f877e, DIE 0x31efd40>) at /home/anoyes/foundationdb/flow/flow.h:906
#14 N2::PromiseTask::operator()() (this=0xffff994061c0) at /home/anoyes/foundationdb/flow/Net2.actor.cpp:1190
#15 0x0000ffff9b5d275c in N2::Net2::run() (this=0x200cb40) at /home/anoyes/foundationdb/flow/Net2.actor.cpp:1533
#16 0x0000ffff9aec7990 in runNetwork() () at /home/anoyes/foundationdb/fdbclient/NativeAPI.actor.cpp:2447
#17 0x0000ffff9b49426c in ThreadSafeApi::runNetwork() (this=0x1ffcd80) at /home/anoyes/foundationdb/fdbclient/ThreadSafeTransaction.cpp:531
#18 0x0000ffff9ae2464c in MultiVersionApi::runNetwork() (this=0x1fdb060) at /home/anoyes/foundationdb/fdbclient/MultiVersionTransaction.actor.cpp:2149
#19 0x0000ffff9add1434 in fdb_run_network () at /home/anoyes/foundationdb/bindings/c/fdb_c.cpp:159
#20 0x00000000003abe5c in __invoke<int (*)()> (__f=<unknown type in /home/anoyes/build/bin/fdb_c_unit_tests, CU 0x0, DIE 0x43799>) at /usr/local/bin/../include/c++/v1/type_traits:3918
#21 __thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, int (*)()> (__t=...) at /usr/local/bin/../include/c++/v1/thread:280
#22 std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, int (*)()> >(void*) (__vp=0x1feaa90) at /usr/local/bin/../include/c++/v1/thread:291
#23 0x0000ffff9a780d38 in start_thread () from /lib64/libpthread.so.0
#24 0x0000ffff9a6d2690 in thread_start () from /lib64/libc.so.6

sfc-gh-anoyes commented 2 years ago

It appears that the offending read is 8 bytes past the end of a FastAlloc<16> allocation.

Edit: I discovered this by using strace -k -f -e %memory bin/fdb_c_unit_tests ... to trace all the calls to mmap, mprotect, etc., and then noticing that the offending read is 8 bytes past the end of a block mmap'd by FastAlloc<16>.

sfc-gh-anoyes commented 2 years ago

I think I understand. x8 is the address of an ArenaBlock of size 16.

   0x0000ffffaac81a04 <+872>:   ldrb    w9, [x8,#4]
=> 0x0000ffffaac81a08 <+876>:   ldr     x8, [x8,#24]
   0x0000ffffaac81a0c <+880>:   ldr     x10, [x20,#24]
   0x0000ffffaac81a10 <+884>:   cmp     w9, #0xff

Corresponding source code:

size_t ArenaBlock::estimatedTotalSize() const {
    if (isTiny()) {
        return size();
    }
    return totalSizeEstimate;
}

bool ArenaBlock::isTiny() const {
    return tinySize != NOT_TINY;
}

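The generated code reads totalSizeEstimate before ~reading~ branching on the value of tinySize. I guess the compiler assumes that, since sizeof(ArenaBlock) is 32 bytes, it's free to read any byte within [x8, x8 + 32), which seems fair. For a tiny block that only has 16 bytes backing it, though, the load at offset 24 lands 8 bytes past the end of the allocation.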
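
To make the offset arithmetic concrete, here is a minimal sketch. It is not the real ArenaBlock definition: BlockSketch, readIsInBounds and bytesPastEnd are hypothetical names, and the reserved fields are placeholders. Only the offsets 4 and 24 (from the disassembly above), the 32-byte sizeof, and the 16-byte tiny allocation are taken from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ArenaBlock. Only the positions of tinySize
// (offset 4) and totalSizeEstimate (offset 24) mirror the disassembly;
// the reserved arrays are filler, not the real members.
struct BlockSketch {
    unsigned char reserved0[4];   // placeholder for whatever precedes tinySize
    std::uint8_t tinySize;        // read by `ldrb w9, [x8,#4]`
    unsigned char reserved1[19];  // placeholder up to offset 24
    std::uint64_t totalSizeEstimate;  // read by `ldr x8, [x8,#24]`
};

static_assert(offsetof(BlockSketch, tinySize) == 4, "tinySize at offset 4");
static_assert(offsetof(BlockSketch, totalSizeEstimate) == 24,
              "totalSizeEstimate at offset 24");
static_assert(sizeof(BlockSketch) == 32, "full object is 32 bytes");

// True if a read of `width` bytes at `offset` stays inside an allocation
// of `allocBytes` bytes.
constexpr bool readIsInBounds(std::size_t offset, std::size_t width,
                              std::size_t allocBytes) {
    return offset + width <= allocBytes;
}

// How far past the end of the allocation a read at `offset` starts
// (0 if it starts inside the allocation).
constexpr std::size_t bytesPastEnd(std::size_t offset, std::size_t allocBytes) {
    return offset >= allocBytes ? offset - allocBytes : 0;
}

// Size of the tiny allocation discussed in the thread (FastAlloc<16>).
constexpr std::size_t kTinyAllocBytes = 16;

// The tinySize read is fine for a 16-byte block...
static_assert(readIsInBounds(4, 1, kTinyAllocBytes), "tinySize is in bounds");
// ...but the hoisted totalSizeEstimate read is not: it starts 8 bytes past the end.
static_assert(!readIsInBounds(24, 8, kTinyAllocBytes), "load is out of bounds");
static_assert(bytesPastEnd(24, kTinyAllocBytes) == 8, "8 bytes past the end");
```

The sketch only checks the arithmetic at compile time and never performs the out-of-bounds read itself. With a full 32-byte block the same load is in bounds, which matches the observation that the crash only shows up when the 16-byte block sits at the end of a mapped region.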