facebook / mvfst

An implementation of the QUIC transport protocol.
MIT License

Some tests segfault on arm64 #363

Open sin-ack opened 4 days ago

sin-ack commented 4 days ago

While running the tests inside a QEMU arm64 chroot, I got the following segfaults:

 # ctest --rerun-failed --output-on-failure
Test project /var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build
    Start 889: TokenlessPacerTest.RateCalculator
1/4 Test #889: TokenlessPacerTest.RateCalculator ................................   Passed    0.12 sec
    Start 892: TokenlessPacerTest.NextWriteTime
2/4 Test #892: TokenlessPacerTest.NextWriteTime .................................   Passed    0.12 sec
    Start 932: */QuicClientTransportIntegrationTest.ResetClient/*
3/4 Test #932: */QuicClientTransportIntegrationTest.ResetClient/* ...............***Exception: SegFault 30.42 sec
Note: Google Test filter = */QuicClientTransportIntegrationTest.ResetClient/*
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from QuicClientTransportIntegrationTests/QuicClientTransportIntegrationTest
[ RUN      ] QuicClientTransportIntegrationTests/QuicClientTransportIntegrationTest.ResetClient/0
W20241112 12:40:21.974237  9672 QuicServerWorker.cpp:1223] HostId is already set to 0
W20241112 12:40:22.005918  9672 QuicServerWorker.cpp:1223] HostId is already set to 0
I20241112 12:40:22.170296  9675 EchoHandler.h:38] Got bidirectional stream id=0
I20241112 12:40:22.171144  9675 EchoHandler.h:102] read available for stream id=0
I20241112 12:40:22.171793  9675 EchoHandler.h:116] Got len=5 eof=1 total=5 data=hello
I20241112 12:40:22.172093  9675 EchoHandler.h:124] uninstalling read callback
I20241112 12:40:22.175437  9672 QuicClientTransportTest.cpp:304] Client received data=echo hello on stream=0 read=10 sent=5
E20241112 12:40:33.894991  9675 EchoHandler.h:97] Socket error=Connection abandoned Exceeded max PTO
unknown file: Failure
C++ exception with description "Timed out" thrown in the test body.
................   Passed    0.26 sec
*** Aborted at 1731415252 (Unix time, try 'date -d @1731415252') ***
*** Signal 11 (SIGSEGV) (0x635d0) received by PID 9672 (pthread TID 0x7f0a9ac075c0) (linux TID 9678) (code: address not mapped to object), stack trace: ***
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer17getStackTraceSafeEPmm+0x8b) [0x7f0aa8e8936b]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer21SafeStackTracePrinter15printStackTraceEb+0x4b) [0x7f0aa8e8b84b]
/usr/lib64/libfolly.so.0.58.0-dev(+0x28889f) [0x7f0aa8e8889f]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0) [0x7f0aab4014ec]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/api/libmvfst_transport.so.0(_ZN4quic21QuicTransportBaseLite21cancelAllAppCallbacksERKNS_9QuicErrorE+0x8ab) [0x7f0aa9a9b6cb]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/api/libmvfst_transport.so.0(_ZN4quic21QuicTransportBaseLite9closeImplEN5folly8OptionalINS_9QuicErrorEEEbb+0x42b) [0x7f0aa9a9658b]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/client/libmvfst_client.so.0(_ZN4quic19QuicClientTransportD2Ev+0x18f) [0x7f0aaa499f9f]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/fizz/client/test/QuicClientTransportTest(_ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE24_M_release_last_use_coldEv+0x1b) [0x7f0aaa84095b]
/usr/lib64/libfolly.so.0.58.0-dev(+0x3cee7f) [0x7f0aa8fcee7f]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly23AtomicNotificationQueueINS_8FunctionIFvvEEEE5driveIRNS_9EventBase10FuncRunnerEEEbOT_+0x16b) [0x7f0aa8fd792b]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly32EventBaseAtomicNotificationQueueINS_8FunctionIFvvEEENS_9EventBase10FuncRunnerEE7executeEv+0x3b) [0x7f0aa8fd94eb]
/usr/lib64/libfolly.so.0.58.0-dev(_ZThn40_N5folly32EventBaseAtomicNotificationQueueINS_8FunctionIFvvEEENS_9EventBase10FuncRunnerEE12handlerReadyEt+0xf) [0x7f0aa8fd95c3]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly12EventHandler16libeventCallbackEisPv+0x77) [0x7f0aa8fe6097]
/usr/lib64/libevent-2.1.so.7(+0x1f9eb) [0x7f0aa807f9eb]
/usr/lib64/libevent-2.1.so.7(event_base_loop+0x447) [0x7f0aa80805f7]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase8loopMainEiNS0_11LoopOptionsE+0xdf) [0x7f0aa8fd124f]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase8loopBodyEiNS0_11LoopOptionsE+0x47) [0x7f0aa8fd1cf7]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase4loopEv+0x5f) [0x7f0aa8fd1e8f]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase11loopForeverEv+0x3f) [0x7f0aa8fd439f]
/usr/lib/gcc/aarch64-unknown-linux-gnu/13/libstdc++.so.6(+0xdb23b) [0x7f0aa88db23b]
/usr/lib64/libc.so.6(+0x8453b) [0x7f0aa86d453b]
(safe mode, symbolizer not available)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
1275/1924 Test  #671: */QuicPacketBuilderTest.ShortHeaderBytesCounting/* ............................................
    Start 933: */QuicClientTransportIntegrationTest.TestStatelessResetToken/*
4/4 Test #933: */QuicClientTransportIntegrationTest.TestStatelessResetToken/* ...***Exception: SegFault 30.43 sec
Note: Google Test filter = */QuicClientTransportIntegrationTest.TestStatelessResetToken/*
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from QuicClientTransportIntegrationTests/QuicClientTransportIntegrationTest
[ RUN      ] QuicClientTransportIntegrationTests/QuicClientTransportIntegrationTest.TestStatelessResetToken/0
W20241112 12:40:52.395910  9679 QuicServerWorker.cpp:1223] HostId is already set to 0
W20241112 12:40:52.426216  9679 QuicServerWorker.cpp:1223] HostId is already set to 0
I20241112 12:40:52.594803  9682 EchoHandler.h:38] Got bidirectional stream id=0
I20241112 12:40:52.595780  9682 EchoHandler.h:102] read available for stream id=0
I20241112 12:40:52.596454  9682 EchoHandler.h:116] Got len=5 eof=1 total=5 data=hello
I20241112 12:40:52.596779  9682 EchoHandler.h:124] uninstalling read callback
I20241112 12:40:52.599825  9679 QuicClientTransportTest.cpp:304] Client received data=echo hello on stream=0 read=10 sent=5
E20241112 12:41:04.609943  9682 EchoHandler.h:97] Socket error=Connection abandoned Exceeded max PTO
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00/quic/fizz/client/test/QuicClientTransportTest.cpp:975: Failure
Value of: resetRecvd
  Actual: false
Expected: true

/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00/quic/fizz/client/test/QuicClientTransportTest.cpp:977: Failure
Value of: token2.has_value()
  Actual: false
Expected: true

unknown file: Failure
C++ exception with description "Empty Optional cannot be unwrapped" thrown in the test body.

*** Aborted at 1731415282 (Unix time, try 'date -d @1731415282') ***
*** Signal 11 (SIGSEGV) (0x7f37f61ff4b0) received by PID 9679 (pthread TID 0x7f37e6a075c0) (linux TID 9685) (code: invalid permissions for mapped object), stack trace: ***
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer17getStackTraceSafeEPmm+0x8b) [0x7f37f4c8936b]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer21SafeStackTracePrinter15printStackTraceEb+0x4b) [0x7f37f4c8b84b]
/usr/lib64/libfolly.so.0.58.0-dev(+0x28889f) [0x7f37f4c8889f]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0) [0x7f37f72064ec]
[0x7f37f61ff4af]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/api/libmvfst_transport.so.0(_ZN4quic21QuicTransportBaseLite21cancelAllAppCallbacksERKNS_9QuicErrorE+0x90f) [0x7f37f589b72f]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/api/libmvfst_transport.so.0(_ZN4quic21QuicTransportBaseLite9closeImplEN5folly8OptionalINS_9QuicErrorEEEbb+0x42b) [0x7f37f589658b]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/client/libmvfst_client.so.0(_ZN4quic19QuicClientTransportD2Ev+0x18f) [0x7f37f6299f9f]
/var/tmp/portage/dev-cpp/mvfst-2024.11.04.00/work/mvfst-2024.11.04.00_build/quic/fizz/client/test/QuicClientTransportTest(_ZNSt16_Sp_counted_baseILN9__gnu_cxx12_Lock_policyE2EE24_M_release_last_use_coldEv+0x1b) [0x7f37f664095b]
/usr/lib64/libfolly.so.0.58.0-dev(+0x3cee7f) [0x7f37f4dcee7f]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly23AtomicNotificationQueueINS_8FunctionIFvvEEEE5driveIRNS_9EventBase10FuncRunnerEEEbOT_+0x16b) [0x7f37f4dd792b]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly32EventBaseAtomicNotificationQueueINS_8FunctionIFvvEEENS_9EventBase10FuncRunnerEE7executeEv+0x3b) [0x7f37f4dd94eb]
/usr/lib64/libfolly.so.0.58.0-dev(_ZThn40_N5folly32EventBaseAtomicNotificationQueueINS_8FunctionIFvvEEENS_9EventBase10FuncRunnerEE12handlerReadyEt+0xf) [0x7f37f4dd95c3]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly12EventHandler16libeventCallbackEisPv+0x77) [0x7f37f4de6097]
/usr/lib64/libevent-2.1.so.7(+0x1f9eb) [0x7f37e79af9eb]
/usr/lib64/libevent-2.1.so.7(event_base_loop+0x447) [0x7f37e79b05f7]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase8loopMainEiNS0_11LoopOptionsE+0xdf) [0x7f37f4dd124f]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase8loopBodyEiNS0_11LoopOptionsE+0x47) [0x7f37f4dd1cf7]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase4loopEv+0x5f) [0x7f37f4dd1e8f]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly9EventBase11loopForeverEv+0x3f) [0x7f37f4dd439f]
/usr/lib/gcc/aarch64-unknown-linux-gnu/13/libstdc++.so.6(+0xdb23b) [0x7f37f46db23b]
/usr/lib64/libc.so.6(+0x8453b) [0x7f37f44d453b]
(safe mode, symbolizer not available)
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
50% tests passed, 2 tests failed out of 4

Total Test time (real) =  61.55 sec

The following tests FAILED:
        932 - */QuicClientTransportIntegrationTest.ResetClient/* (SEGFAULT)
        933 - */QuicClientTransportIntegrationTest.TestStatelessResetToken/* (SEGFAULT)
Errors while running CTest
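
The frames above are printed in mangled form because the symbolizer was not available. As a reading aid, here is a minimal sketch that demangles them with `abi::__cxa_demangle` from `<cxxabi.h>`, assuming a GCC or Clang toolchain that follows the Itanium C++ ABI. The `demangle` helper is a hypothetical name for illustration and is not part of mvfst or folly.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if it is not a mangled C++ name.
std::string demangle(const std::string& mangled) {
  // Only names starting with "_Z" are mangled C++ symbols; plain C symbols
  // such as event_base_loop are returned as-is.
  if (mangled.rfind("_Z", 0) != 0) {
    return mangled;
  }
  int status = 0;
  char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0) {
    return mangled;  // not a valid mangled name, or allocation failed
  }
  assert(out != nullptr);  // non-null whenever status == 0
  std::string result(out);
  std::free(out);  // the buffer is malloc-allocated and owned by the caller
  return result;
}
```

For example, `demangle("_ZN4quic21QuicTransportBaseLite21cancelAllAppCallbacksERKNS_9QuicErrorE")` should yield `quic::QuicTransportBaseLite::cancelAllAppCallbacks(quic::QuicError const&)`, which is the frame where both traces fault. Piping the trace through `c++filt` gives the same result without writing any code.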

jbeshay commented 3 days ago

Both failures seem to be triggered by exceptions unexpectedly firing in the tests:

C++ exception with description "Timed out" thrown in the test body.
C++ exception with description "Empty Optional cannot be unwrapped" thrown in the test body.

These look like legitimate test failures. Are they only failing in this environment?
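
One way an exception in the test body could turn into a segfault during teardown, offered as an assumption rather than a confirmed diagnosis: if the exception unwinds the test's stack while the transport still holds a raw pointer to a stack-allocated callback, a later close (the `~QuicClientTransport`, `closeImpl`, `cancelAllAppCallbacks` path in the traces) would touch a destroyed object. The sketch below models that ordering with hypothetical `Callback` and `Transport` stand-ins, none of which are mvfst types. It records events in a log instead of dereferencing the dangling pointer.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

using EventLog = std::vector<std::string>;

// Hypothetical stand-in for an application callback living on the test's stack.
struct Callback {
  explicit Callback(EventLog& log) : log_(log) {}
  ~Callback() { log_.push_back("callback destroyed"); }
  void onCanceled() { log_.push_back("callback canceled"); }
  EventLog& log_;
};

// Hypothetical stand-in for a transport that keeps a raw callback pointer.
struct Transport {
  explicit Transport(EventLog& log) : log_(log) {}
  void setCallback(Callback* cb) {
    assert(cb != nullptr);
    cb_ = cb;
  }
  void close() {
    if (cb_ != nullptr) {
      cb_->onCanceled();
      cb_ = nullptr;
    }
  }
  ~Transport() {
    // A real transport would invoke the callback here. In this sketch we only
    // log, because cb_ may point at an object that no longer exists.
    if (cb_ != nullptr) {
      log_.push_back("teardown would touch a destroyed callback");
    }
  }
  EventLog& log_;
  Callback* cb_ = nullptr;
};

// Runs the modelled test body and returns the ordered event log.
EventLog runScenario(bool throwInBody) {
  EventLog log;
  auto transport = std::make_shared<Transport>(log);
  try {
    Callback cb(log);
    transport->setCallback(&cb);
    if (throwInBody) {
      throw std::runtime_error("Timed out");  // skips the orderly close()
    }
    transport->close();
  } catch (const std::exception& e) {
    log.push_back(std::string("exception: ") + e.what());
  }
  transport.reset();  // last reference released after the test body unwound
  return log;
}
```

With `runScenario(true)` the log ends with the teardown entry after the callback has already been destroyed. With `runScenario(false)` the callback is canceled before it goes out of scope and teardown has nothing left to touch. If this is what happens in the real tests, the segfaults would be a consequence of the failed assertions rather than a separate arm64 bug, but that remains to be verified.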

sin-ack commented 3 days ago

They pass fine on amd64. I don't have access to arm64 hardware at the moment, but they fail inside the chroot.