apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0
14.19k stars 1.29k forks source link

Enable replica consistency check on data movement #11415

Closed sbodagala closed 2 weeks ago

sbodagala commented 1 month ago

Enable replica consistency check on data movement and, randomly, on all reads.

Testing:

Joshua jobs:

20240524-203518-sre-1c1a2dcbba105340 (shows errors in restart tests). Another run had a single failure (on data movement not completing in "QuietDatabase", probably because replica consistency check was failing) but I don't see the problem when run locally (the test uses rocksdb, probably because of that). An update on this failure: This failure (or a similar failure) showed up on a SQLite test which we investigated. This failure was because the test was modifying a knob such that any transaction older than 1M versions would get failed by the resolver with "transaction_too_old" error. So I don't think this failure was caused by this change set.

20240613-224222-sre-b22072de0b076453: Shows a single test failure, and this test failed in a nightly run too.

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

sbodagala commented 1 month ago

We now support the following policies on the number of replicas that the source replica will be compared against:

It appears to me this policy is asymmetric (as there is no way to specify BEST_EFFORT on a specific number of replicas). Probably we need a policy that captures the following:

But this will require us to propagate yet another parameter to the load balancer.

Thoughts?

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang-ide on Linux CentOS 7

sbodagala commented 1 month ago

I am still a bit nervous about throwing an error on not finding the specified number of replicas (by the replica comparison actor) in fetchKeys(), because fetchKeys() would cause the storage server to fail if "fetchKeyCanRetry()" returns false on such an error. Also, blocking data movement (and making it retry) might become an issue in production as it would require a manual intervention and cause us to restart the storage servers in order to make progress on data movement in extreme scenarios (like when a single storage server is available for a given shard). Spreading the knowledge about identifying/resolving data movement not making progress to all on-call people could also become an issue. Thoughts please?

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-macos on macOS Ventura 13.x

foundationdb-ci commented 1 month ago

Result of foundationdb-pr on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang-ide on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-macos on macOS Ventura 13.x

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

sbodagala commented 1 month ago

I am still a bit nervous about throwing an error on not finding the specified number of replicas (by the replica comparison actor) in fetchKeys(), because fetchKeys() would cause the storage server to fail if "fetchKeyCanRetry()" returns false on such an error. Also, blocking data movement (and making it retry) might become an issue in production as it would require a manual intervention and cause us to restart the storage servers in order to make progress on data movement in extreme scenarios (like when a single storage server is available for a given shard). Spreading the knowledge about identifying/resolving data movement not making progress to all on-call people could also become an issue. Thoughts please?

Agree blocking data movement can become an issue. So far my thought is that this is better than moving corrupted data away and accepting the fact that when only one replica is left, data movement will be blocked. So for such a scenario, we should add visibility (trace events?) for oncalls to respond/decide, e.g., deploying a knob change to disable the replica comparison if needed.

I agree with this thought at a high level - we like to block data movement if an inconsistency is detected. But if there isn't another replica available then do we still want to block (because we couldn't validate the results)? Manual intervention in this case will only result in restarting the storage servers by turning the knob off (so data movement can happen without validation), so what are we achieving by blocking the data movement in this case? I guess the question is are there any scenarios where manual intervention/otherwise would cause the failed replicas to become available (even then they might be lagging behind the latest version and so might not help with validation?) so validation could be done?

sbodagala commented 1 month ago

Thinking more about it, I don't think we can guarantee that data movement will get blocked (indefinitely) when an inconsistency is detected. I think it will depend on how many replicas are available and how many of them are corrupted. This is because an inconsistency would cause fetchKeys() to retry and fetchKeys() could then choose a different source. But data movement would block if a corrupted replica gets chosen as either the source or the replica to compare the source against. So it will achieve the goal of not spreading corruption (even though it is not guaranteed to block data movement, or detect data corruption, in all possible cases).

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang-ide on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang-ide on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang-ide on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-macos on macOS Ventura 13.x

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr-clang on Linux CentOS 7

foundationdb-ci commented 1 month ago

Result of foundationdb-pr on Linux CentOS 7

foundationdb-ci commented 3 weeks ago

Result of foundationdb-pr-clang-ide on Linux CentOS 7

foundationdb-ci commented 3 weeks ago

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

foundationdb-ci commented 3 weeks ago

Result of foundationdb-pr-macos on macOS Ventura 13.x

foundationdb-ci commented 3 weeks ago

Result of foundationdb-pr-clang-arm on Linux CentOS 7

foundationdb-ci commented 3 weeks ago

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

foundationdb-ci commented 3 weeks ago

Result of foundationdb-pr-clang on Linux CentOS 7

foundationdb-ci commented 3 weeks ago

Result of foundationdb-pr on Linux CentOS 7

sbodagala commented 2 weeks ago

I think this PR can be merged.