apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

ParallelRestoreNewBackupCorrectness timeout #8794

Closed sfc-gh-yiwu closed 1 year ago

sfc-gh-yiwu commented 1 year ago

The test segfaulted because of infinite recursion in

_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()

Hash: b43779fdd57dab6f1c6c38da2744b2e6af19218a
ASAN build
Seed: bin/fdbserver -r simulation -f tests/slow/ParallelRestoreNewBackupCorrectnessAtomicOp.toml -s 1624312275 -b off --crash --trace_format json
Another seed: bin/fdbserver -r simulation -f tests/slow/ParallelRestoreNewBackupCorrectnessMultiCycles.toml -s 805772227 -b on --crash --trace_format json

gdb output:

Program received signal SIGSEGV, Segmentation fault.
0x0000000006b2b05b in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
(gdb) bt
#0  0x0000000006b2b05b in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#1  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#2  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#3  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#4  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#5  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#6  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#7  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#8  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#9  0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#10 0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()
#11 0x0000000006b2b30c in (anonymous namespace)::_parsePartitionedLogFileOnLoaderActorState<(anonymous namespace)::_parsePartitionedLogFileOnLoaderActor>::a_body1cont2loopHead1(int) ()

(and ~8k more lines)
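
A backtrace in which the same loop-head function fills every frame is what one would expect when each loop iteration re-enters the loop head through a nested call instead of returning to it. The sketch below is a generic C++ illustration of that failure mode, not FoundationDB or Flow-generated code. All names (`DepthTracker`, `parseRecursive`, `parseIterative`) are hypothetical, and it makes no claim about how the actual bug was fixed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: none of these names come from FoundationDB.
// Tracks how deep the call nesting gets while "parsing".
struct DepthTracker {
    std::size_t current = 0;
    std::size_t max = 0;
    void enter() { if (++current > max) max = current; }
    void leave() { --current; }
};

// Recursive style: handling one record re-enters the "loop head" via a nested
// call, so stack depth grows linearly with the number of records. With enough
// records this overflows the stack and the process receives SIGSEGV.
std::size_t parseRecursive(const std::vector<int>& records, std::size_t i, DepthTracker& d) {
    d.enter();
    std::size_t parsed = 0;
    if (i < records.size()) {
        parsed = 1 + parseRecursive(records, i + 1, d);
    }
    d.leave();
    return parsed;
}

// Iterative style: control returns to the loop head after each record, so the
// stack depth stays constant no matter how many records there are.
std::size_t parseIterative(const std::vector<int>& records, DepthTracker& d) {
    d.enter();
    std::size_t parsed = 0;
    for (std::size_t i = 0; i < records.size(); ++i) {
        ++parsed;
    }
    d.leave();
    return parsed;
}
```

In the recursive form the maximum depth is the record count plus one, while the iterative form stays at depth one. Instrumented builds such as ASAN typically use larger stack frames, so the recursive form would hit the stack limit sooner there.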

jzhou77 commented 1 year ago

I fixed a stack overflow bug yesterday in https://github.com/apple/foundationdb/pull/8773/commits/80e0810bcc52a15d91c7f752da4163c46061b022 for _parsePartitionedLogFileOnLoader(). Maybe the failure was before the fix?

sfc-gh-yiwu commented 1 year ago

@jzhou77 Indeed, this failure is from before your fix, and it hasn't reappeared since. Closing.