[Discussion] Reconsider the way we write UNIT test

longfangsong commented 3 years ago

I fell our unit tests in TiKV somehow not satisfing.

For example when reviewing tikv#9514, which changed the rollback collapse logic, I found it breaked some test cases in mvcc::reader.

These test are used to test the behavour of reader and the engine.rollbacks are used only to provide a testing environment. i.e. They are not supposed to really do the "rollback" work here, what we want is just the Modifies which caused by the rollback. But a large part of this test depends on the Modifies caused by the unprotected rollbacks, and if we change the rollback collapse logic, the tests are broken.

Moreover, who can promise that the "rollback" or "cleanup" function we are using here is right? Maybe the tests for "cleanup", but "cleanup"'s correctness is depend on the correctness of reader, and lead us to circular reasoning.

It might be a bad habit to prepare unit testing environment for lower level componets (scanner in this case) with higher level components (txn and actions, or more specific, cleanup in this case), after all, lower level componets should not even know higher level components' existance. But huge amount of tests in our code are written in this way, and many of them are too hard/complex/important to change.

A pre-rfc I had written in the past.

longfangsong commented 3 years ago

IMO for the mvcc tests some DSL might be good, I have seen several interesting things in comments like:

// Initial position: 1 seek_to_first:
//   a_7 b_4 b_3 b_2 b_1 b_0
//   ^cursor
// After get the value, use 1 next to reach next user key:
//   a_7 b_4 b_3 b_2 b_1 b_0
//       ^cursor

Can we formalize this kind of things and make our unit tests clearer?

longfangsong commented 3 years ago

@nrc @youjiali1995 @sticnarf Do you have any ideas?

nrc commented 3 years ago

I agree, I think our tests are hard work to understand and to write. Ultimately, I think the problem is that we are not testing truly minimal units of code. The tests test a single action or feature, but they don't test one unit of code. On the other hand they are not full integration tests because they don't test the whole system. I think that is not necessarily bad, but that we can make the tests cleaner by treating them as integration tests on a large component (e.g, the transaction system). IMO, having a declarative test framework is the way to do this. We can treat tests as data rather than code.

AIUI, we would need ways to do the following:

specify an initial state
the operation to perform and specify anything that should be returned
assert the final state of the system

tikv / sig-transaction

[Discussion] Reconsider the way we write UNIT test #99