facebook / folly

An open-source C++ library developed and used at Facebook.
https://groups.google.com/forum/?fromgroups#!forum/facebook-folly
Apache License 2.0

Coredump when running global_executor_test #1252

Open dangleptr opened 5 years ago

dangleptr commented 5 years ago
[heng@host02-cluster _build]$ ./global_executor_test
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from GlobalExecutorTest
[ RUN      ] GlobalExecutorTest.GlobalCPUExecutor
[       OK ] GlobalExecutorTest.GlobalCPUExecutor (0 ms)
[ RUN      ] GlobalExecutorTest.GlobalIOExecutor
[       OK ] GlobalExecutorTest.GlobalIOExecutor (1 ms)
[----------] 2 tests from GlobalExecutorTest (1 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (2 ms total)
[  PASSED  ] 2 tests.
*** Aborted at 1572509134 (Unix time, try 'date -d @1572509134') ***
*** Signal 11 (SIGSEGV) (0xe0) received by PID 30858 (pthread TID 0x7f5b86094700) (linux TID 30860) (code: address not mapped to object), stack trace: ***
    @ 000000000086f5af _ZN5folly10symbolizer12_GLOBAL__N_118innerSignalHandlerEiP9siginfo_tPv
                       /home/heng/workspace/folly/folly/experimental/symbolizer/SignalHandler.cpp:431
    @ 000000000086f654 _ZN5folly10symbolizer12_GLOBAL__N_113signalHandlerEiP9siginfo_tPv
                       /home/heng/workspace/folly/folly/experimental/symbolizer/SignalHandler.cpp:446
    @ 00007f5b86e685ef (unknown)
    @ 00000000007a02e7 _ZNK5folly14ThreadLocalPtrINS_20SingletonThreadLocalISt10shared_ptrINS_14RequestContextEENS_6detail10DefaultTagENS5_11DefaultMakeIS4_EEvE7WrapperEvvE3getEv
                       /home/heng/workspace/folly/folly/ThreadLocal.h:159
                       -> /home/heng/workspace/folly/folly/io/async/Request.cpp
    @ 000000000079a797 _ZNK5folly11ThreadLocalINS_20SingletonThreadLocalISt10shared_ptrINS_14RequestContextEENS_6detail10DefaultTagENS5_11DefaultMakeIS4_EEvE7WrapperEvvEdeEv
                       /home/heng/workspace/folly/folly/ThreadLocal.h:68
                       -> /home/heng/workspace/folly/folly/io/async/Request.cpp
    @ 00000000007980c0 _ZN5folly20SingletonThreadLocalISt10shared_ptrINS_14RequestContextEENS_6detail10DefaultTagENS4_11DefaultMakeIS3_EEvE10getWrapperEv
                       /home/heng/workspace/folly/folly/SingletonThreadLocal.h:148
                       -> /home/heng/workspace/folly/folly/io/async/Request.cpp
    @ 00000000007980d4 _ZN5folly20SingletonThreadLocalISt10shared_ptrINS_14RequestContextEENS_6detail10DefaultTagENS4_11DefaultMakeIS3_EEvE13LocalLifetimeD1Ev
                       /home/heng/workspace/folly/folly/SingletonThreadLocal.h:120
                       -> /home/heng/workspace/folly/folly/io/async/Request.cpp
    @ 00007f5b8761eeb5 _ZN12_GLOBAL__N_13runEPv
                       /home/heng/workspace/gcc-8.3.0/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/../../.././libstdc++-v3/libsupc++/atexit_thread.cc:75
    @ 00007f5b86e60c61 __nptl_deallocate_tsd
    @ 00007f5b86e60e72 start_thread
    @ 00007f5b86b8988c clone
Segmentation fault (core dumped)

version: v2019.10.14.00

dangleptr commented 5 years ago
(gdb) bt
#0  0x000000000079cf5b in folly::ThreadLocalPtr<folly::SingletonThreadLocal<std::shared_ptr<folly::RequestContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<std::shared_ptr<folly::RequestContext> >, void>::Wrapper, void, void>::get (this=0x7f4ad8001760) at /home/heng/workspace/folly/folly/ThreadLocal.h:159
#1  0x000000000079740c in folly::ThreadLocal<folly::SingletonThreadLocal<std::shared_ptr<folly::RequestContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<std::shared_ptr<folly::RequestContext> >, void>::Wrapper, void, void>::get (this=0x7f4ad8001760) at /home/heng/workspace/folly/folly/ThreadLocal.h:68
#2  folly::ThreadLocal<folly::SingletonThreadLocal<std::shared_ptr<folly::RequestContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<std::shared_ptr<folly::RequestContext> >, void>::Wrapper, void, void>::operator* (this=0x7f4ad8001760) at /home/heng/workspace/folly/folly/ThreadLocal.h:77
#3  0x0000000000794d35 in folly::SingletonThreadLocal<std::shared_ptr<folly::RequestContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<std::shared_ptr<folly::RequestContext> >, void>::getWrapper
    () at /home/heng/workspace/folly/folly/SingletonThreadLocal.h:148
#4  0x0000000000794d49 in folly::SingletonThreadLocal<std::shared_ptr<folly::RequestContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<std::shared_ptr<folly::RequestContext> >, void>::LocalLifetime::~LocalLifetime (this=0x7f4add6def78, __in_chrg=<optimized out>) at /home/heng/workspace/folly/folly/SingletonThreadLocal.h:120
#5  0x00007f4adec70eb6 in (anonymous namespace)::run (p=<optimized out>) at ../../.././libstdc++-v3/libsupc++/atexit_thread.cc:75
#6  0x00007f4ade4b2c62 in __nptl_deallocate_tsd () from /lib64/libpthread.so.0
#7  0x00007f4ade4b2e73 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f4ade1db88d in clone () from /lib64/libc.so.6
dangleptr commented 5 years ago

It seems that in IOThreadPool, the thread-local variable has already been released by the time it is accessed. Any thoughts?

boringuy commented 4 years ago

@dangleptr What OS are you running? I ran into a similar crash with CentOS 7.4 + devtoolset-7. I tried the same thing in an Ubuntu 19.10 Docker container and did not see the crash.

dangleptr commented 4 years ago

CentOS 7.5 + GCC 8.3.0 @boringuy

tobz commented 4 years ago

Anecdotally, I am observing this in mcrouter as well, which depends on folly; also on CentOS 7.5, with GCC 5.x.

boringuy commented 4 years ago

I have not tried all the versions in between, but I confirmed that the unit test core dumps with v2019.11.11.00 and does not with v2019.12.30.00 or the latest, v2020.01.13.00.

tobz commented 4 years ago

Woohoo! I can confirm that v2019.12.30.00 also fixes my issue. I too did not try bisecting, but I can say that v2019.12.02.00 still segfaulted for me.

dangleptr commented 4 years ago

thread_pool_executor_test segfaults on v2020.4.20 with a similar error:

#0  0x00000000009cac61 in folly::ThreadLocalPtr<folly::SingletonThreadLocal<folly::Optional<folly::BlockingContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<folly::Optional<folly::BlockingContext> >, void>::Wrapper, void, void>::get (this=0x7f6e94000af0) at /home/vesoft/data/chenheng/3rd/folly/folly/ThreadLocal.h:160
#1  0x00000000009ca0dc in folly::ThreadLocal<folly::SingletonThreadLocal<folly::Optional<folly::BlockingContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<folly::Optional<folly::BlockingContext> >, void>::Wrapper, void, void>::get (this=0x7f6e94000af0) at /home/vesoft/data/chenheng/3rd/folly/folly/ThreadLocal.h:69
#2  folly::ThreadLocal<folly::SingletonThreadLocal<folly::Optional<folly::BlockingContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<folly::Optional<folly::BlockingContext> >, void>::Wrapper, void, void>::operator* (
    this=0x7f6e94000af0) at /home/vesoft/data/chenheng/3rd/folly/folly/ThreadLocal.h:78
#3  0x00000000009c9e89 in folly::SingletonThreadLocal<folly::Optional<folly::BlockingContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<folly::Optional<folly::BlockingContext> >, void>::getWrapper ()
    at /home/vesoft/data/chenheng/3rd/folly/folly/SingletonThreadLocal.h:149
#4  0x00000000009c9e9d in folly::SingletonThreadLocal<folly::Optional<folly::BlockingContext>, folly::detail::DefaultTag, folly::detail::DefaultMake<folly::Optional<folly::BlockingContext> >, void>::LocalLifetime::~LocalLifetime (
    this=0x7f6e8effd508, __in_chrg=<optimized out>) at /home/vesoft/data/chenheng/3rd/folly/folly/SingletonThreadLocal.h:121
#5  0x00007f6ea01773b6 in (anonymous namespace)::run (p=<optimized out>) at ../../.././libstdc++-v3/libsupc++/atexit_thread.cc:75
#6  0x00007f6ea0474c62 in __nptl_deallocate_tsd () from /lib64/libpthread.so.0
#7  0x00007f6ea0474e73 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f6e9f90188d in clone () from /lib64/libc.so.6

ThreadPoolExecutorTest.EDFBasic

lrita commented 1 year ago

The folly-v2023.02.20.00 unit test ./synchronized_test crashed. It was compiled with devtoolset-8 (GCC 8.3) on CentOS 7.8 with kernel 3.10.0-957.21.3.el7:

#0  0x00000000005c3a99 in folly::ThreadLocalPtr<folly::SingletonThreadLocal<folly::ThreadLocalPRNG::operator()()::Wrapper, folly::(anonymous namespace)::RandomTag>::Wrapper, folly::(anonymous namespace)::RandomTag, void>::get (this=0x7fffa4000930)
    at ../folly/detail/ThreadLocalDetail.h:322
#1  folly::ThreadLocal<folly::SingletonThreadLocal<folly::ThreadLocalPRNG::operator()()::Wrapper, folly::(anonymous namespace)::RandomTag>::Wrapper, folly::(anonymous namespace)::RandomTag, void>::get (this=0x7fffa4000930)
    at ../folly/ThreadLocal.h:69
#2  folly::ThreadLocal<folly::SingletonThreadLocal<folly::ThreadLocalPRNG::operator()()::Wrapper, folly::(anonymous namespace)::RandomTag>::Wrapper, folly::(anonymous namespace)::RandomTag, void>::operator* (this=0x7fffa4000930)
    at ../folly/ThreadLocal.h:78
#3  folly::SingletonThreadLocal<folly::ThreadLocalPRNG::operator()()::Wrapper, folly::(anonymous namespace)::RandomTag, folly::detail::DefaultMake<folly::ThreadLocalPRNG::operator()()::Wrapper>, folly::(anonymous namespace)::RandomTag>::getWrapper(void) () at ../folly/SingletonThreadLocal.h:138
#4  0x00000000005c3af9 in folly::SingletonThreadLocal<folly::ThreadLocalPRNG::operator()()::Wrapper, folly::(anonymous namespace)::RandomTag, folly::detail::DefaultMake<folly::ThreadLocalPRNG::operator()()::Wrapper>, folly::(anonymous namespace)::RandomTag>::LocalLifetime::~LocalLifetime(void) (this=0x7fffc765f4e0, __in_chrg=<optimized out>)
    at ../folly/SingletonThreadLocal.h:128
#5  0x00000000006cc626 in (anonymous namespace)::run(void*) ()
#6  0x00007ffff74aaca2 in __nptl_deallocate_tsd () from /usr/lib64/libpthread.so.0
#7  0x00007ffff74aaeb3 in start_thread () from /usr/lib64/libpthread.so.0
#8  0x00007ffff71d39fd in clone () from /usr/lib64/libc.so.6
lrita commented 1 year ago

CentOS 7.4 uses glibc 2.17, which has no support for __cxa_thread_atexit or __cxa_thread_atexit_impl, so libstdc++ falls back to TSD (pthread_key_create) to invoke the destructors of thread_local variables (shown as (anonymous namespace)::run(void*) () in the backtrace).

Unfortunately, glibc does not invoke TSD destructor callbacks in the reverse of the order in which they were added. As a result, ~LocalLifetime is invoked later than StaticMetaBase::onThreadExit.
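
The ordering hazard described here can be observed in isolation. The following is a minimal sketch, assuming a POSIX system; it is not folly code. The keys, the two destructor functions, and `recordTsdDestructorOrder` are hypothetical stand-ins: "registry" plays the role of the per-thread state torn down by StaticMetaBase::onThreadExit, and "lifetime" plays the role of ~LocalLifetime. POSIX leaves the order of TSD destructor calls unspecified when a thread has more than one, so the sketch records whatever order it observes rather than assuming one.

```cpp
#include <pthread.h>

#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only (not folly code).
// g_order is written only by the exiting worker thread and read after join.
static std::vector<std::string> g_order;
static pthread_key_t g_registryKey;
static pthread_key_t g_lifetimeKey;

// Each TSD destructor just records that it ran.
static void destroyRegistry(void*) { g_order.push_back("registry"); }
static void destroyLifetime(void*) { g_order.push_back("lifetime"); }

static void* worker(void*) {
  static int marker = 0;
  // A key's destructor runs at thread exit only if the thread stored a
  // non-null value for that key.
  pthread_setspecific(g_registryKey, &marker);
  pthread_setspecific(g_lifetimeKey, &marker);
  return nullptr;
}

// Runs one short-lived thread and returns the order in which the two
// TSD destructors were invoked when it exited.
std::vector<std::string> recordTsdDestructorOrder() {
  g_order.clear();
  int rc = pthread_key_create(&g_registryKey, destroyRegistry);
  assert(rc == 0);
  rc = pthread_key_create(&g_lifetimeKey, destroyLifetime);
  assert(rc == 0);

  pthread_t thread;
  rc = pthread_create(&thread, nullptr, worker, nullptr);
  assert(rc == 0);
  (void)rc;
  // Joining guarantees the thread, including its TSD destructors, is done.
  pthread_join(thread, nullptr);

  pthread_key_delete(g_lifetimeKey);
  pthread_key_delete(g_registryKey);
  return g_order;
}
```

Calling `recordTsdDestructorOrder()` returns the two labels in whichever order the C library chose. Code that is only correct when "lifetime" runs before "registry" is relying on an ordering that POSIX does not promise, which is the shape of the crash in the backtraces in this thread.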

HolyLow commented 6 months ago

This bug still seems to exist. I ran into the same core dump with folly-2024.04.01. My environment is AliOS-7 (based on CentOS, I believe) with glibc 2.32:

$ uname -a
Linux 2a24bff1ef3d 4.9.168-019.ali3000.alios7.x86_64 #1 SMP Thu Dec 31 20:20:01 CST 2020 x86_64 x86_64 x86_64 GNU/Linux
$ /lib64/libc.so.6
GNU C Library (GNU libc) release release version 2.32.
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 10.2.1 20200825 (Alibaba 10.2.1-3 2.17).
libc ABIs: UNIQUE IFUNC ABSOLUTE
For bug reporting instructions, please see:
<https://www.gnu.org/software/libc/bugs.html>.
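
The same version information can be read from inside a program, which helps when checking which glibc a test binary actually runs against. This is a minimal sketch, assuming a glibc-based Linux system; `GlibcVersion`, `parseGlibcVersion`, `hasThreadAtexitImpl`, and `runtimeGlibcVersion` are hypothetical helper names, and only `gnu_get_libc_version()` is a real glibc call. The 2.18 threshold for `__cxa_thread_atexit_impl` comes from glibc's release history, not from this thread. A new enough glibc is not conclusive on its own: the crash reported in this comment occurred with glibc 2.32.

```cpp
#include <gnu/libc-version.h>

#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type for illustration.
struct GlibcVersion {
  int major;
  int minor;
};

// Parses a "MAJOR.MINOR" string such as "2.32"; fields stay 0 on failure.
GlibcVersion parseGlibcVersion(const std::string& text) {
  GlibcVersion v{0, 0};
  std::sscanf(text.c_str(), "%d.%d", &v.major, &v.minor);
  return v;
}

// glibc gained __cxa_thread_atexit_impl in 2.18 (per glibc release history).
// This says nothing about how libstdc++ itself was configured.
bool hasThreadAtexitImpl(GlibcVersion v) {
  return v.major > 2 || (v.major == 2 && v.minor >= 18);
}

// Version of the glibc the process is running against.
GlibcVersion runtimeGlibcVersion() {
  return parseGlibcVersion(gnu_get_libc_version());
}
```

For example, `hasThreadAtexitImpl(parseGlibcVersion("2.17"))` is false for the CentOS 7 glibc mentioned earlier in the thread, while the 2.32 reported above passes the check.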

And the stack trace is:

*** Signal 11 (SIGSEGV) (0x7f59dde2a6b9) received by PID 3634169 (pthread TID 0x7f5e2cf06640) (linux TID 3634170) (code: address not mapped to object), stack trace: ***
    @ 000000000058190c folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /code/thirdparty/src/folly-2024.04.01.00/folly/experimental/symbolizer/SignalHandler.cpp:449
    @ 00000000000140ff (unknown)
    @ 000000000050b96f folly::SingletonThreadLocal<folly::RequestContext::StaticContext, folly::RequestContext, folly::detail::DefaultMake<folly::RequestContext::StaticContext>, folly::RequestContext>::getWrapper()
                       /code/thirdparty/src/folly-2024.04.01.00/folly/ThreadLocal.h:151
                       -> /code/thirdparty/src/folly-2024.04.01.00/folly/io/async/Request.cpp
    @ 000000000050bac8 folly::SingletonThreadLocal<folly::RequestContext::StaticContext, folly::RequestContext, folly::detail::DefaultMake<folly::RequestContext::StaticContext>, folly::RequestContext>::LocalLifetime::~LocalLifetime()
                       /code/thirdparty/src/folly-2024.04.01.00/folly/SingletonThreadLocal.h:127
                       -> /code/thirdparty/src/folly-2024.04.01.00/folly/io/async/Request.cpp
    @ 00000000000a4a35 (unknown)
    @ 0000000000009210 __nptl_deallocate_tsd
    @ 000000000000940b start_thread
    @ 0000000000100302 clone

@lrita do you have any idea how to fix this? Thanks a lot.

HolyLow commented 6 months ago

Following the instructions in README.md, I tried to compile and run all of folly's test cases on my dev machine, and a large number of them failed with a segfault.

The following tests FAILED:
        336 - atomic_shared_ptr_test.AtomicSharedPtr.DeterministicTest (SEGFAULT)
        341 - cache_locality_test.CacheLocality.LinuxActual (Failed)
        354 - core_cached_shared_ptr_test.CoreCachedSharedPtr.AtomicCoreCachedSharedPtr (SEGFAULT)
        403 - dynamic_bounded_queue_test.DynamicBoundedQueue.enqDeq (SEGFAULT)
        420 - unbounded_queue_test.UnboundedQueue.enqDeq (SEGFAULT)
        474 - executor_test.ManualExecutor.getViaDoesNotDeadlock (SEGFAULT)
        484 - fiber_io_executor_test.FiberIOExecutorTest.event_base (SEGFAULT)
        487 - global_executor_test.GlobalExecutorTest.GlobalImmutableIOExecutor (SEGFAULT)
        489 - global_executor_test.GlobalExecutorTest.GlobalIOExecutor (SEGFAULT)
        503 - threaded_executor_test.ThreadedExecutorTest.exception (SEGFAULT)
        516 - priority_unbounded_blocking_queue_test.PriorityUnboundedBlockingQueueTest.concurrent_push_pop (SEGFAULT)
        522 - unbounded_blocking_queue_test.UnboundedBlockingQueue.concurrentPushPop (SEGFAULT)
        581 - lock_free_ring_buffer_test.LockFreeRingBuffer.writesNeverFail (SEGFAULT)
        775 - callback_lifetime_test.CallbackLifetimeTest.thenReturnsValue (SEGFAULT)
        776 - callback_lifetime_test.CallbackLifetimeTest.thenReturnsValueThrows (SEGFAULT)
        777 - callback_lifetime_test.CallbackLifetimeTest.thenReturnsFuture (SEGFAULT)
        778 - callback_lifetime_test.CallbackLifetimeTest.thenReturnsFutureThrows (SEGFAULT)
        779 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsValueMatch (SEGFAULT)
        780 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsValueMatchThrows (SEGFAULT)
        781 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsValueWrong (SEGFAULT)
        782 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsValueWrongThrows (SEGFAULT)
        783 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsFutureMatch (SEGFAULT)
        784 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsFutureMatchThrows (SEGFAULT)
        785 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsFutureWrong (SEGFAULT)
        786 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesExnReturnsFutureWrongThrows (SEGFAULT)
        787 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesWrapReturnsValue (SEGFAULT)
        788 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesWrapReturnsValueThrows (SEGFAULT)
        789 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesWrapReturnsFuture (SEGFAULT)
        790 - callback_lifetime_test.CallbackLifetimeTest.thenErrorTakesWrapReturnsFutureThrows (SEGFAULT)
        904 - interrupt_test.Interrupt.futureWithinTimedOut (SEGFAULT)
        905 - interrupt_test.Interrupt.semiFutureWithinTimedOut (SEGFAULT)
        967 - retrying_test.RetryingTest.policyCappedJitteredExponentialBackoff (SEGFAULT)
        968 - retrying_test.RetryingTest.policyCappedJitteredExponentialBackoffUnsafe (SEGFAULT)
        971 - retrying_test.RetryingTest.policySleepDefaults (SEGFAULT)
        972 - retrying_test.RetryingTest.largeRetries (SEGFAULT)
        1031 - via_test.ViaFixture.threadHops (SEGFAULT)
        1032 - via_test.ViaFixture.chainVias (SEGFAULT)
        1034 - via_test.ViaFixture.viaAssignment (SEGFAULT)
        1038 - via_test.Via.viaThenGetWasRacy (SEGFAULT)
        1045 - via_test.Via.viaRaces (SEGFAULT)
        1063 - wait_test.Wait.waitWithDuration (SEGFAULT)
        1064 - wait_test.Wait.multipleWait (SEGFAULT)
        1065 - wait_test.Wait.WaitPlusThen (SEGFAULT)
        1066 - wait_test.Wait.cancelAfterWait (SEGFAULT)
        1080 - window_test.Window.parallel (SEGFAULT)
        1081 - window_test.Window.parallelWithError (SEGFAULT)
        1082 - window_test.Window.allParallelWithError (SEGFAULT)
        1283 - AsyncUDPSocketTest.AsyncSocketIntegrationTest.PingPong (SEGFAULT)
        1284 - AsyncUDPSocketTest.AsyncSocketIntegrationTest.PingPongNotify (SEGFAULT)
        1285 - AsyncUDPSocketTest.AsyncSocketIntegrationTest.PingPongNotifyMmsg (SEGFAULT)
        1286 - AsyncUDPSocketTest.AsyncSocketIntegrationTest.PingPongRecvTosDisabled (SEGFAULT)
        1287 - AsyncUDPSocketTest.AsyncSocketIntegrationTest.PingPongRecvTos (SEGFAULT)
        1288 - AsyncUDPSocketTest.*/ConnectedAsyncSocketIntegrationTest.ConnectedPingPong/* (SEGFAULT)
        1289 - AsyncUDPSocketTest.AsyncSocketIntegrationTest.PingPongPauseResumeListening (SEGFAULT)
        1292 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestErrToNonExistentServer (SEGFAULT)
        1293 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestUnsetErrCallback (SEGFAULT)
        1294 - AsyncUDPSocketTest.AsyncUDPSocketTest.CloseInErrorCallback (Timeout)
        1295 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestNonExistentServerNoErrCb (SEGFAULT)
        1300 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestDetachAttach (SEGFAULT)
        1301 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWriteCmsg (SEGFAULT)
        1302 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWriteDynamicCmsg (SEGFAULT)
        1304 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWritemCmsg (SEGFAULT)
        1305 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWritemDynamicCmsg (SEGFAULT)
        1307 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWritemNontrivialCmsgs (SEGFAULT)
        1321 - EventBaseLocalTest.EventBaseLocalTest.DestructorStressTest (SEGFAULT)
        1341 - NotificationQueueTest.NotificationQueueTest.ConsumeUntilDrained (SEGFAULT)
        1342 - NotificationQueueTest.NotificationQueueTest.ConsumeUntilDrainedStress (SEGFAULT)
        1343 - NotificationQueueTest.NotificationQueueTest.SendOneEventFD (SEGFAULT)
        1344 - NotificationQueueTest.NotificationQueueTest.PutMessagesEventFD (SEGFAULT)
        1345 - NotificationQueueTest.NotificationQueueTest.MultiConsumerEventFD (SEGFAULT)
        1349 - NotificationQueueTest.NotificationQueueTest.SendOnePipe (SEGFAULT)
        1350 - NotificationQueueTest.NotificationQueueTest.PutMessagesPipe (SEGFAULT)
        1351 - NotificationQueueTest.NotificationQueueTest.MultiConsumerPipe (SEGFAULT)
        1355 - NotificationQueueTest.NotificationQueueTest.UseAfterFork (SEGFAULT)
        1374 - RequestContextTest.RequestContextTest.AccessAllThreadsDestructionGuard (SEGFAULT)
        1376 - RequestContextTest.RequestContextTryGetTest.TryGetTest (SEGFAULT)
        1379 - ScopedEventBaseThreadTest.ScopedEventBaseThreadTest.example (SEGFAULT)
        1380 - ScopedEventBaseThreadTest.ScopedEventBaseThreadTest.named_example (SEGFAULT)
        1381 - ScopedEventBaseThreadTest.ScopedEventBaseThreadTest.default_manager (SEGFAULT)
        1382 - ScopedEventBaseThreadTest.ScopedEventBaseThreadTest.custom_manager (SEGFAULT)
        1383 - ScopedEventBaseThreadTest.ScopedEventBaseThreadTest.eb_dtor_in_io_thread (SEGFAULT)
        1384 - ScopedEventBaseThreadTest.ScopedEventBaseThreadTest.keepalive (SEGFAULT)
        1387 - writechain_test.WriteChainAsyncTransportWrapperTest.TestSimpleIov (SEGFAULT)
        1388 - writechain_test.WriteChainAsyncTransportWrapperTest.TestChainedIov (SEGFAULT)
        1389 - writechain_test.WriteChainAsyncTransportWrapperTest.TestSimpleBuf (SEGFAULT)
        1696 - baton_test.Baton.pingpongBlocking (SEGFAULT)
        1697 - baton_test.Baton.pingpongNonblocking (SEGFAULT)
        1700 - baton_test.Baton.timedWaitTimeoutSystemClockBlocking (SEGFAULT)
        1701 - baton_test.Baton.timedWaitTimeoutSystemClockNonblocking (SEGFAULT)
        1702 - baton_test.Baton.timedWaitSystemClockBlocking (SEGFAULT)
        1703 - baton_test.Baton.timedWaitSystemClockNonblocking (SEGFAULT)
        1706 - baton_test.Baton.timedWaitTimeoutSteadyClockBlocking (SEGFAULT)
        1707 - baton_test.Baton.timedWaitTimeoutSteadyClockNonblocking (SEGFAULT)
        1708 - baton_test.Baton.timedWaitSteadyClockBlocking (SEGFAULT)
        1709 - baton_test.Baton.timedWaitSteadyClockNonblocking (SEGFAULT)
        1721 - lifo_sem_test.LifoSemTest.pingpong (SEGFAULT)
        1722 - lifo_sem_test.LifoSemTest.mutex (SEGFAULT)
        1726 - lifo_sem_test.LifoSemTest.shutdown_multi (SEGFAULT)
        1822 - small_locks_test.SmallLocks.SpinLockCorrectness (SEGFAULT)
        2008 - deterministic_schedule_test.DeterministicSchedule.buggyAdd (SEGFAULT)
        2009 - deterministic_schedule_test.DeterministicSchedule.globalInvariants (SEGFAULT)
        2178 - file_util_test.WriteFileAtomic.directoryPermissions (Failed)
        2268 - futex_test.Futex.basicDeterministic (SEGFAULT)
        2301 - locks_test.SpinLock.Correctness (SEGFAULT)
        2346 - memory_idler_test.MemoryIdler.futexWaitValueChangedEarly (Timeout)
        2347 - memory_idler_test.MemoryIdler.futexWaitValueChangedLate (Timeout)
        2348 - memory_idler_test.MemoryIdler.futexWaitAwokenEarly (Timeout)
        2349 - memory_idler_test.MemoryIdler.futexWaitAwokenLate (Timeout)
        2350 - memory_idler_test.MemoryIdler.futexWaitImmediateFlush (Timeout)
        2351 - memory_idler_test.MemoryIdler.futexWaitNeverFlush (Timeout)
        2443 - random_test.Random.MultiThreaded (SEGFAULT)
        2722 - synchronized_test.SynchronizedTimedTest/*.Timed (SEGFAULT)
        2723 - synchronized_test.SynchronizedTimedWithConstTest/*.TimedShared (SEGFAULT)

And the error messages are basically like "signal 11 (SIGSEGV) (0x7fa7220d8e18) received by PID 3988029 (pthread TID 0x7fa0e901c640) (linux TID 3988056) (code: address not mapped to object)".

I don't know what's wrong and how to fix this...

HolyLow commented 6 months ago

I modified the patch https://github.com/facebook/hhvm/commit/2469b030d57f64a9b39f86b80744237186edea29 so that it is also turned on for glibc 2.32, and many test cases now pass. The failure list shrinks to:

The following tests FAILED:
        341 - cache_locality_test.CacheLocality.LinuxActual (Failed)
        1292 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestErrToNonExistentServer (SEGFAULT)
        1293 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestUnsetErrCallback (SEGFAULT)
        1294 - AsyncUDPSocketTest.AsyncUDPSocketTest.CloseInErrorCallback (Timeout)
        1295 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestNonExistentServerNoErrCb (SEGFAULT)
        1300 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestDetachAttach (SEGFAULT)
        1301 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWriteCmsg (SEGFAULT)
        1302 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWriteDynamicCmsg (SEGFAULT)
        1304 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWritemCmsg (SEGFAULT)
        1305 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWritemDynamicCmsg (SEGFAULT)
        1307 - AsyncUDPSocketTest.AsyncUDPSocketTest.TestWritemNontrivialCmsgs (SEGFAULT)
        1387 - writechain_test.WriteChainAsyncTransportWrapperTest.TestSimpleIov (SEGFAULT)
        1388 - writechain_test.WriteChainAsyncTransportWrapperTest.TestChainedIov (SEGFAULT)
        1389 - writechain_test.WriteChainAsyncTransportWrapperTest.TestSimpleBuf (SEGFAULT)
        2178 - file_util_test.WriteFileAtomic.directoryPermissions (Failed)
        2346 - memory_idler_test.MemoryIdler.futexWaitValueChangedEarly (Timeout)
        2347 - memory_idler_test.MemoryIdler.futexWaitValueChangedLate (Timeout)
        2348 - memory_idler_test.MemoryIdler.futexWaitAwokenEarly (Timeout)
        2349 - memory_idler_test.MemoryIdler.futexWaitAwokenLate (Timeout)
        2350 - memory_idler_test.MemoryIdler.futexWaitImmediateFlush (Timeout)
        2351 - memory_idler_test.MemoryIdler.futexWaitNeverFlush (Timeout)
HolyLow commented 5 months ago

UPDATE: The latest main branch of folly has enabled the patch for all GLIBC cases, so the problem should be solved. The patch is https://github.com/facebook/folly/commit/f141935cfc62b2147af1a5cc50e3692037710173#diff-be0ee3867901cddc207db866334f0820d4d77e333c583471f55ba517866c84cc