apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

fdbbackup segfaults on v7.1.63 (works on v7.1.61) #11615

Open funkypenguin opened 2 weeks ago

funkypenguin commented 2 weeks ago

Hey all,

Since upgrading to v7.1.63, our backups have been failing, like this:

bash-4.2$ /usr/bin/fdbbackup abort -t hourly-${HOUR_OF_DAY} -C /etc/foundationdb/fdb.cluster --knob_http_request_aws_v4_header=false
SIGNAL: Segmentation fault (11)
Trace: addr2line -e fdbbackup.debug -p -C -f -i 0x7f46c7a24630 0xb66891 0x6152b0 0x5f8706 0x7f46c7669555
Segmentation fault (core dumped)
bash-4.2$ fdbbackup --version
FoundationDB 7.1 (v7.1.63)
source version 1accd8c90b87c4d193b269a9d71434cf9865f04d
protocol fdb00b071010000
bash-4.2$

However, on v7.1.61:

bash-4.2$ /usr/bin/fdbbackup abort -t hourly-${HOUR_OF_DAY} -C /etc/foundationdb/fdb.cluster --knob_http_request_aws_v4_header=false
ERROR: A backup was not running on tag `hourly-'
Fatal Error: Backup unneeded request
bash-4.2$ fdbbackup --version
FoundationDB 7.1 (v7.1.61)
source version 262c6fe778ad229cbb58bfcd6d337bb0b4227622
protocol fdb00b071010000
bash-4.2$

Could this be a new failure introduced in v7.1.63?

Can I safely downgrade my production cluster from v7.1.63 to v7.1.61, until this is resolved?

Thanks! D

jzhou77 commented 2 weeks ago

Yes, this is a regression in 7.1.63, and we have a fix: https://github.com/apple/foundationdb/pull/11591.

You can safely downgrade to 7.1.61. Since the bug is in fdbbackup, not in fdbserver, replacing fdbbackup (and the other binaries with different names, such as backup_agent and dr_agent) with the 7.1.61 version should solve the problem.
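
To see which of the backup tools on a host are still the affected build, something like the following sketch may help. It parses the first line of the `--version` output shown in the transcripts above ("FoundationDB 7.1 (v7.1.63)"). The function names `fdb_release` and `report_affected` are made up for illustration, the install paths in the example invocation are assumptions, and it assumes backup_agent and dr_agent accept `--version` the same way fdbbackup does.

```shell
#!/bin/sh
# Print the release number (e.g. 7.1.63) parsed from "<binary> --version".
# Relies on an output line of the form: FoundationDB 7.1 (v7.1.63)
fdb_release() {
    "$1" --version 2>/dev/null |
        sed -n 's/^FoundationDB .*(v\([0-9][0-9.]*\)).*/\1/p'
}

# List each given binary with its release and flag the affected one.
# First argument is the affected release; the rest are binary paths.
# Returns non-zero if at least one affected binary was found.
report_affected() {
    affected="$1"; shift
    status=0
    for bin in "$@"; do
        [ -x "$bin" ] || continue   # skip paths that are not installed
        release=$(fdb_release "$bin")
        if [ "$release" = "$affected" ]; then
            echo "$bin: v$release (affected)"
            status=1
        else
            echo "$bin: v${release:-unknown}"
        fi
    done
    return "$status"
}

# Example invocation (paths are assumptions, adjust to your install):
# report_affected 7.1.63 /usr/bin/fdbbackup /usr/bin/dr_agent \
#     /usr/lib/foundationdb/backup_agent/backup_agent
```

Any binary it flags is a candidate for replacement with its 7.1.61 counterpart; fdbserver is deliberately left out, since the regression is not there.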