etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0
47.34k stars 9.72k forks source link

Unified test framework #14820

Closed ahrtr closed 1 year ago

ahrtr commented 1 year ago

What would you like to be added?

We reinvented some wheels on test frameworks. Currently there are functional test, e2e test and linearizability test (based on e2e). The Jepsen test is also an option, but it's written in Clojure, and out of the existing maintainers' control, so it isn't in the scope for now.

Each test framework has pros and cons.

Functional test

Pros

  1. The functional test has a well-designed framework, and it's flexible & extensible. Currently it only checks hash (and also has lease related check), but we can easily add more checker.
  2. It supports flexible failure injections via gofail. It lists all the available failpoints automatically, and users just need to configure the commands/terms in failpoint-commands, then it automatically creates all the possible combinations. For example, assuming there are 100 failpoints, and we define 3 commands/terms, then there will be 100*3 cases of failure injection.
  3. It also supports other kinds of failure injection, such as delay, network blackhole etc.

Cons

  1. It reinvented an agent to start/stop etcd clusters. Ideally it should use the same utilities as the e2e test.
  2. It's out of maintenance, and accordingly flaky. This shouldn't be a big deal, as long as we decided to spend more time on it.

Linearizability test

Pros

  1. Linearizability is really important for distributed system. It's good to have a test framework to focus on verifying the Linearizability of etcd.

Crons

  1. It's based on a personal repo porcupine, which isn't actively maintained. When a test case reports failure, I don't think we even have enough confidence that the failure isn't caused by the test framework/utility.
  2. It isn't well-designed, and of course isn't flexible & easily extensible. All the concepts (e.g. traffic player, failure injection, result checker etc.) are basically tightly coupled.
  3. It's doing things that are beyond what the concept of Linearizability can hold. Conceptually, Linearizability test is just part of the functional test, but it's trying to do the same thing as the functional test.

Unified test framework

So we really need to create a unified test framework, otherwise future maintainers may reinvent a new test framework. It also isn't good to add too many code to the existing Linearizability test without having a well-designed framework beforehand.

The high level diagram is as below. Actually the high level structure is basically coming from the existing functional test. The components highlighted with orange are new to functional test. We still need more breakdown on the high level design, but please always keep OCP (Open-Close principle) and DIP (Dependency inversion principle) in mind.

unified_test_framework

Why is this needed?

See above.

cc @mitake @ptabor @serathius @spzala

ahrtr commented 1 year ago

I would expect more contributors can get involved in the design, so as to have deeper understanding on the unified test framework. I am also thinking probably we can have more maintainers to dedicated to the maintenance of the test framework, just like the maintainer (e.g. @tbg ) dedicated to raft.

serathius commented 1 year ago

Maybe I'm little biased (autored one of the framework), but I totally don't understand your point. I have totally different perception on each of the arguments you proposed that I think we need a larger alignment before moving forward.

On functional framwork

I don't agree that it is well design. It's overdesigned as it splits the test runner into multiple binaries and designing an internal rpc just to run tests that can be run locally.

It's not flexible neither extensible as adding a simple test scenario requires code changes through tens of lines and proto. For example for first data inconsistency I just wanted to add a test that does SIGKILL instead of SIGTERM https://github.com/etcd-io/etcd/pull/13924.

Supporting gofail is not a pros. It's just a feature that is already implemented by linearizabilityt tests. Fact that last data durability issue and inconsistency with panic were not discovered by funtional tests is just proof that this doesn't work at all. Both of those issues are easily discoverable via gofailure injection and were discovered/confirmed via linearizability tests.

Supporting network black holing is not a pros, it's a feature.

On linearizability tests

Whole idea about confidence on test failure is the main advantage of linearizability tests over functional tests. Functional tests have very high flakiness vs Linearizability tests have never flaked.

It's hard to me comment on whether linearizabilityt tests are well design, as I'm biased. I prioritized simplicity over everything else. OCP and DIP rules might be nice but I prefer KISS, keep it simple stupid. I though you agreed with that as you personally reviewed the code https://github.com/etcd-io/etcd/pull/14398.

As for concept of linearizability, it's just a name not con.

Overall I agree that we can't develop two frameworks, however I don't agree with proposed direction as I don't see superiority of current functional framework. My reasons to develop new test framework:

This issue is in hard opposition to https://github.com/etcd-io/etcd/issues/14045

serathius commented 1 year ago

Meet with @ahrtr and we agreed on couple of things:

Detail about merger are not set in stone, but I my personal opinion is that best way to proceed is to fill feature gaps in linearizability tests and replace funtional tests with it. After that we can rename linearizability tests to functional.

serathius commented 1 year ago

Superseded by https://github.com/etcd-io/etcd/issues/14820