apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

DR simulation failure (6.3). #4995

Closed RenxuanW closed 2 years ago

RenxuanW commented 3 years ago

DR is not used in production, but it is causing a failure in simulation.

Steps to reproduce:

I get:

<BackupAndRestoreCorrectness Severity="40" ErrorKind="Unset" DateTime="2021-06-16T03:15:00Z" Error="restore_destination_not_empty" ErrorDescription="Attempted to restore into a non-empty destination database" ErrorCode="2370" Backtrace="addr2line -e fdbserver.debug -p -C -f -i 0x24e005c 0x24e1080 0x24e1411 0x16228a6 0x1639eeb 0x163be10 0x805859 0x15bbe72 0x15bc314 0x805859 0x1c089ae 0x8933d8 0x1e8d168 0x1f168c5 0x1e8d168 0x1f1c669 0x1f23c7b 0x1f24ec4 0x1f23b95 0x1f2a878 0x1df90a8 0x1e7ddc1 0x1e7ede3 0x1e674d8 0x1e67624 0x16ba688 0x81ac78 0x234a795 0x234ab0b 0xc17b00 0x23a4948 0x23c4800 0x23a4a4d 0x23c498f 0xc17b00 0x2488737 0x23c4b32 0x7b4694 0x7f0c60df8555" LogGroup="default" Roles="TS"/>
<TestFailure Severity="40" ErrorKind="Unset" DateTime="2021-06-16T03:15:00Z" Error="restore_destination_not_empty" ErrorDescription="Attempted to restore into a non-empty destination database" ErrorCode="2370" Reason="Error starting workload" Workload="CycleWorkload;BackupToDBCorrectness;RandomClogging;RollbackWorkload;MachineAttritionWorkload;MachineAttritionWorkload" Backtrace="addr2line -e fdbserver.debug -p -C -f -i 0x24e005c 0x24e1080 0x24e1411 0x136714c 0x1367239 0x805859 0x853679 0x805859 0x805859 0x16228c2 0x1639eeb 0x163be10 0x805859 0x15bbe72 0x15bc314 0x805859 0x1c089ae 0x8933d8 0x1e8d168 0x1f168c5 0x1e8d168 0x1f1c669 0x1f23c7b 0x1f24ec4 0x1f23b95 0x1f2a878 0x1df90a8 0x1e7ddc1 0x1e7ede3 0x1e674d8 0x1e67624 0x16ba688 0x81ac78 0x234a795 0x234ab0b 0xc17b00 0x23a4948 0x23c4800 0x23a4a4d 0x23c498f 0xc17b00 0x2488737 0x23c4b32 0x7b4694 0x7f0c60df8555" LogGroup="default" Roles="TS"/>
<StartFailedForWorkloadBackupAndRestore Severity="40" ErrorKind="Unset" DateTime="2021-06-16T03:15:00Z" Error="operation_failed" ErrorDescription="Operation failed" ErrorCode="1000" Backtrace="addr2line -e fdbserver.debug -p -C -f -i 0x24e005c 0x24e1080 0x24e1411 0x1386f6b 0x136f4aa 0x136f5f0 0x8933d8 0xb13a28 0xb4acd9 0xb4b4fa 0x805859 0x8563f3 0x856602 0x805859 0x81b5c0 0x234a795 0x234ab0b 0xc17b00 0x23a4948 0x23c4800 0x23a4a4d 0x23c498f 0xc17b00 0x2488737 0x23c4b32 0x7b4694 0x7f0c60df8555" LogGroup="default"/>
<RunTests Severity="40" ErrorKind="Unset" DateTime="2021-06-16T03:15:00Z" Error="operation_failed" ErrorDescription="Operation failed" ErrorCode="1000" Backtrace="addr2line -e fdbserver.debug -p -C -f -i 0x24e005c 0x24e1080 0x24e1411 0xb1bf15 0xb1c082 0x805859 0x135fc91 0x805859 0x1362001 0x805859 0x8555a5 0x1346fe9 0x1372b38 0x1372d27 0x134eacd 0x134ede9 0x134e381 0x59ee09 0x8933d8 0xb13a28 0xb4acd9 0xb4b4fa 0x805859 0x8563f3 0x856602 0x805859 0x81b5c0 0x234a795 0x234ab0b 0xc17b00 0x23a4948 0x23c4800 0x23a4a4d 0x23c498f 0xc17b00 0x2488737 0x23c4b32 0x7b4694 0x7f0c60df8555" LogGroup="default"/>
<SetupAndRunError Severity="40" ErrorKind="Unset" DateTime="2021-06-16T03:15:00Z" Error="operation_failed" ErrorDescription="Operation failed" ErrorCode="1000" Backtrace="addr2line -e fdbserver.debug -p -C -f -i 0x24e005c 0x24e1080 0x24e1411 0x1163d6d 0x1163e95 0x805859 0xe590f1 0x805859 0x1352381 0x805859 0xb1bf3c 0xb1c082 0x805859 0x135fc91 0x805859 0x1362001 0x805859 0x8555a5 0x1346fe9 0x1372b38 0x1372d27 0x134eacd 0x134ede9 0x134e381 0x59ee09 0x8933d8 0xb13a28 0xb4acd9 0xb4b4fa 0x805859 0x8563f3 0x856602 0x805859 0x81b5c0 0x234a795 0x234ab0b 0xc17b00 0x23a4948 0x23c4800 0x23a4a4d 0x23c498f 0xc17b00 0x2488737 0x23c4b32 0x7b4694 0x7f0c60df8555" LogGroup="default"/>

This error is thrown at https://github.com/apple/foundationdb/blob/release-6.3/fdbclient/DatabaseBackupAgent.actor.cpp#L2487.
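For anyone triaging a similar run, here is a minimal sketch of how the Severity 40 events in a trace like the one above could be reduced to event name, error, and error code. It assumes each trace event sits on its own line as a self-closing XML element, as in the output above. The names `summarize_trace` and `sample` are hypothetical and not part of FoundationDB, and the sample lines are abbreviated copies of the events above with the backtraces and other attributes left out.

```python
import xml.etree.ElementTree as ET


def summarize_trace(lines):
    """Return (event name, Error, ErrorCode) for every Severity 40 event.

    Hypothetical helper: assumes one self-closing XML element per line.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("<"):
            continue  # skip blank lines and anything that is not an event
        event = ET.fromstring(line)
        if event.get("Severity") == "40":
            rows.append((event.tag, event.get("Error"), event.get("ErrorCode")))
    return rows


# Abbreviated copies of the events in this issue (Backtrace etc. omitted).
sample = [
    '<BackupAndRestoreCorrectness Severity="40" Error="restore_destination_not_empty" ErrorCode="2370"/>',
    '<TestFailure Severity="40" Error="restore_destination_not_empty" ErrorCode="2370" Reason="Error starting workload"/>',
    '<StartFailedForWorkloadBackupAndRestore Severity="40" Error="operation_failed" ErrorCode="1000"/>',
    '<RunTests Severity="40" Error="operation_failed" ErrorCode="1000"/>',
    '<SetupAndRunError Severity="40" Error="operation_failed" ErrorCode="1000"/>',
]

for name, error, code in summarize_trace(sample):
    print(f"{name}: {error} ({code})")
```

Read this way, the first two events carry the underlying `restore_destination_not_empty` error (2370), and the remaining three report the generic `operation_failed` (1000) as the test harness unwinds.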

sfc-gh-ljoswiak commented 3 years ago

We fixed this in 7.0 but never backported it to 6.3. I submitted the fix to the release-6.3 branch and verified that it fixes the specific seed you mentioned.