ydb-platform / ydb

YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions
https://ydb.tech
Apache License 2.0

Fix patching problems leading to data loss/corruption #1970

Open alexvru opened 9 months ago

alexvru commented 9 months ago
  1. Implement the patching mechanism in TestShard.
  2. Detect problems by running TestShard against a real cluster with failure injection.
  3. Fix the detected problems.
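
For illustration only, the kind of check described in steps 1 and 2 can be sketched as follows. This is not TestShard code and not the YDB API: `FlakyBlobStore`, `InjectedFailure`, and `run_patch_check` are hypothetical names, and the in-memory store stands in for a real cluster. The idea is to keep a model of the expected blob contents, apply random patches while failures are injected, and count every round where the stored data differs from the model.

```python
import random


class InjectedFailure(Exception):
    """Raised by the hypothetical store when a failure is injected."""


class FlakyBlobStore:
    """Hypothetical in-memory stand-in for blob storage (not the YDB API)."""

    def __init__(self, fail_rate, seed=0):
        self.blobs = {}
        self.fail_rate = fail_rate
        self.rng = random.Random(seed)

    def put(self, key, data):
        self.blobs[key] = bytes(data)

    def get(self, key):
        return self.blobs[key]

    def patch(self, key, offset, diff):
        # A failed patch must leave the blob untouched, so the injected
        # failure is raised before any byte is modified.
        if self.rng.random() < self.fail_rate:
            raise InjectedFailure("injected failure")
        old = self.blobs[key]
        if offset + len(diff) > len(old):
            raise ValueError("patch out of range")
        self.blobs[key] = old[:offset] + diff + old[offset + len(diff):]


def run_patch_check(store, rounds, seed=1):
    """Apply random 4-byte patches and return the number of rounds in
    which the stored blob disagrees with the expected model."""
    rng = random.Random(seed)
    expected = bytes(64)
    store.put("blob", expected)
    mismatches = 0
    for _ in range(rounds):
        offset = rng.randrange(0, 60)
        diff = bytes(rng.randrange(256) for _ in range(4))
        try:
            store.patch("blob", offset, diff)
        except InjectedFailure:
            pass  # patch reported failure: the model stays unchanged
        else:
            expected = expected[:offset] + diff + expected[offset + 4:]
        if store.get("blob") != expected:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print(run_patch_check(FlakyBlobStore(fail_rate=0.3), rounds=200))
```

A store that partially applies a patch before reporting an error would show up here as a non-zero mismatch count, which is the class of corruption this issue is about.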

alexvru commented 9 months ago

https://github.com/ydb-platform/ydb/pull/1699
https://github.com/ydb-platform/ydb/pull/1943
https://github.com/ydb-platform/ydb/pull/1953

alexvru commented 9 months ago

There was also a problem where TEvPatch returned ERROR while the cluster was fully operational (after a recent failure injection). This state persisted until the whole cluster was fully restarted. @kruall