etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

Unify testing framework #13637

Open serathius opened 2 years ago

serathius commented 2 years ago

Test flakiness and maintenance cost remain among the larger issues for the etcd project. Proposals like https://github.com/etcd-io/etcd/issues/13167 to track flakiness, and contributions that address individual tests, have definitely helped; however, to fully resolve the problem we also need to address the root cause. The etcd project currently maintains multiple disconnected ways to run tests (unit, integration, e2e, functional) that mostly verify the same scenarios. Testing features at different levels is in itself desirable, as it helps isolate failures and speeds up resolution, but not in the way it is currently done in etcd. Almost no test scenarios or test framework code are reused. This means we have multiplied the lines of test code without any benefit.

Goal:

I propose to unify testing by identifying common test scenarios and test framework operations, and making them accessible in the same way no matter which test method is used (unit, integration, e2e, functional). For example, a simple put & get test scenario goes through the very same stages whether it runs against an integration or an e2e cluster: create cluster, execute put, execute get, compare results, clean up. This test should be written once and executed on different layers, regardless of whether the underlying framework runs a fake gRPC client or starts a whole process. With this we will reduce the number of lines of code, as each test will need to be implemented only once; hide the differences between test types, reducing the knowledge needed to add new tests; and improve the deflaking process by comparing results of the same test from different methods (we can isolate the failure location: if both the e2e and integration tests fail, then the feature is broken; if only one of them fails, then the test framework is at fault).
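As a rough illustration of the "write once, run on every layer" idea, here is a minimal sketch. The `Cluster` and `Runner` interfaces, `memCluster`, and `PutGetScenario` are hypothetical names invented for this example and are not the framework's actual API; an in-memory map stands in for both the integration and e2e backends so the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// Cluster is a hypothetical minimal interface a scenario needs.
// It is an assumption for illustration, not etcd's real test API.
type Cluster interface {
	Put(key, value string) error
	Get(key string) (string, error)
	Close() error
}

// Runner creates clusters for one test layer (integration, e2e, ...).
type Runner interface {
	Name() string
	NewCluster() (Cluster, error)
}

// memCluster is an in-memory stand-in for a real cluster.
type memCluster struct{ data map[string]string }

func (c *memCluster) Put(k, v string) error { c.data[k] = v; return nil }

func (c *memCluster) Get(k string) (string, error) {
	v, ok := c.data[k]
	if !ok {
		return "", errors.New("key not found")
	}
	return v, nil
}

func (c *memCluster) Close() error { return nil }

// memRunner pretends to be a test layer; a real runner would start
// an embedded server or a separate etcd process instead.
type memRunner struct{ name string }

func (r memRunner) Name() string { return r.name }

func (r memRunner) NewCluster() (Cluster, error) {
	return &memCluster{data: map[string]string{}}, nil
}

// PutGetScenario follows the stages from the proposal: create cluster,
// execute put, execute get, compare results, clean up.
func PutGetScenario(r Runner) error {
	c, err := r.NewCluster()
	if err != nil {
		return fmt.Errorf("%s: create cluster: %w", r.Name(), err)
	}
	defer c.Close() // cleanup

	if err := c.Put("foo", "bar"); err != nil {
		return fmt.Errorf("%s: put: %w", r.Name(), err)
	}
	got, err := c.Get("foo")
	if err != nil {
		return fmt.Errorf("%s: get: %w", r.Name(), err)
	}
	if got != "bar" {
		return fmt.Errorf("%s: got %q, want %q", r.Name(), got, "bar")
	}
	return nil
}

func main() {
	runners := []Runner{memRunner{name: "integration"}, memRunner{name: "e2e"}}
	for _, r := range runners {
		if err := PutGetScenario(r); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("PASS:", r.Name())
	}
}
```

The scenario only depends on the interface, so adding another layer would mean adding another `Runner` rather than another copy of the test.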

At this moment it would be most effective to target unifying integration and e2e tests, as they are the most similar in both cluster setup and test scenarios. We should be able to identify a minimal common interface for cluster creation and communication that would allow us to start rewriting tests.
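The deflaking rule mentioned in the goal (both layers fail means the feature is broken, only one fails means the test framework is suspect) can be written down as a small decision function. This is only a sketch; `Verdict` and `Triage` are hypothetical names, not part of any existing tooling.

```go
package main

import "fmt"

// Verdict is a hypothetical label for where a failure most likely lives.
type Verdict string

const (
	NoFailure        Verdict = "no failure"
	FeatureBroken    Verdict = "feature likely broken"
	FrameworkSuspect Verdict = "test framework suspect"
)

// Triage compares the result of the same scenario on two layers.
func Triage(integrationPassed, e2ePassed bool) Verdict {
	switch {
	case integrationPassed && e2ePassed:
		return NoFailure
	case !integrationPassed && !e2ePassed:
		// The same scenario fails everywhere: suspect the feature.
		return FeatureBroken
	default:
		// Only one layer fails: suspect that layer's framework.
		return FrameworkSuspect
	}
}

func main() {
	fmt.Println(Triage(false, false))
	fmt.Println(Triage(true, false))
}
```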

Plan:

cc @ptabor @ahrtr @spzala

ahrtr commented 2 years ago

It's really a good proposal. I am still very new to the existing test framework & cases, but I think the high level architecture would be something like below,

(image: test_framework)

serathius commented 2 years ago

We have managed to merge enough PRs to establish the foundations for the new framework (https://github.com/etcd-io/etcd/pull/13708, https://github.com/etcd-io/etcd/pull/13740, https://github.com/etcd-io/etcd/pull/13753, https://github.com/etcd-io/etcd/pull/13754). I think we can open the effort to more contributors to scale the migration. I also think it would be a good idea to encourage new tests to be implemented in the new framework.

I think we can do the migration file by file, with each file assigned to a single person. As I have already started, I will continue working on e2e/ctl_v3_kv_test.go and e2e/ctl_v3_kv_no_quorum_test.go. I would start by migrating e2e tests, as there are fewer of them and they are easier to migrate. However, we should try to remove integration test cases if the new common tests cover them.

Tests to be migrated:

Note for contributors: Feel free to pick one of the directories and just leave a comment that you want to work on it.

kkkkun commented 2 years ago

I will work on the following tests:

nic-chen commented 2 years ago

hi @serathius I would like to take some

nic-chen commented 2 years ago

I would start with the simple one: e2e/ctl_v3_alarm_test.go

nic-chen commented 2 years ago

will continue with: e2e/ctl_v3_txn_test.go e2e/ctl_v3_watch_test.go e2e/ctl_v3_watch_cov_test.go e2e/ctl_v3_watch_no_cov_test.go

nic-chen commented 2 years ago

Hi @serathius, I tried to migrate ctl_v3_txn_test, but I found it difficult to make it compatible with both e2e and integration.

The CLI commands like version("key") < "0" would need to be converted to and from Txn and Cmp, which is too complicated for testing. We should test the original call rather than a converted one.

So I think maybe it shouldn't be migrated.

What is your suggestion? thanks!
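To make the conversion concern concrete, here is a minimal sketch of what turning a CLI-style compare string into a structured comparison might involve. `Compare` and `ParseCompare` are hypothetical helpers written for this example; they are not the etcdctl parser or the client's Cmp type, and they only accept the long target names with a quoted value.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compare is a hypothetical structured form of a txn comparison.
type Compare struct {
	Target string // "version", "value", "create" or "mod"
	Key    string
	Op     string // "=", "!=", "<" or ">"
	Value  string
}

// compareRe matches expressions such as: version("key") < "0"
var compareRe = regexp.MustCompile(
	`^\s*(version|value|create|mod)\("([^"]*)"\)\s*(=|!=|<|>)\s*"([^"]*)"\s*$`)

// ParseCompare converts the CLI-style string into a Compare.
// A common test would need this (and the reverse direction) for every
// txn expression, which is the complexity raised above.
func ParseCompare(expr string) (Compare, error) {
	m := compareRe.FindStringSubmatch(expr)
	if m == nil {
		return Compare{}, fmt.Errorf("unsupported compare expression: %q", expr)
	}
	return Compare{Target: m[1], Key: m[2], Op: m[3], Value: m[4]}, nil
}

func main() {
	c, err := ParseCompare(`version("key") < "0"`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

Even this reduced form needs its own parsing and error handling, which supports the point that a converted call tests the converter as much as the transaction itself.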

serathius commented 2 years ago

Thanks for looking into this. Please skip the ctl_v3_txn_test for now. I will take a look and see what we can do.

nic-chen commented 2 years ago

got it, thanks.

vimalk78 commented 2 years ago

i am working on

chaochn47 commented 2 years ago

I can take a few more.

Update...

I just realized that the above 3 test files should not be classified as common tests shared by e2e and integration tests. For example, the KeepDataDir, ExecPath and DataDir config is unique to e2e tests. Also, the last release version's etcd binary path is hard-coded, and an integration test with embedded etcd won't have that concept.

I am inclined to skip the above tests and leave them as they are. WDYT? @serathius
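One way to picture why such tests do not fit the common set is to model what each layer can provide. This is a sketch under the assumption stated above that binary-path and data-dir settings exist only for e2e; `Capability`, `Layer`, `Scenario`, and `Supports` are hypothetical names made up for this example.

```go
package main

import "fmt"

// Capability names something a test layer can provide (hypothetical).
type Capability string

const (
	CapExecPath    Capability = "exec-path"     // run a specific etcd binary
	CapKeepDataDir Capability = "keep-data-dir" // keep the data dir across restarts
)

// Layer describes a test layer and what it supports.
type Layer struct {
	Name string
	Caps map[Capability]bool
}

// Scenario lists the capabilities a test needs.
type Scenario struct {
	Name     string
	Requires []Capability
}

// Supports reports whether the layer provides everything the scenario needs.
func Supports(l Layer, s Scenario) bool {
	for _, c := range s.Requires {
		if !l.Caps[c] {
			return false
		}
	}
	return true
}

func main() {
	e2e := Layer{Name: "e2e", Caps: map[Capability]bool{CapExecPath: true, CapKeepDataDir: true}}
	integration := Layer{Name: "integration", Caps: map[Capability]bool{}}

	// A made-up scenario that needs a previous release binary and a kept data dir.
	s := Scenario{Name: "release-upgrade", Requires: []Capability{CapExecPath, CapKeepDataDir}}

	for _, l := range []Layer{e2e, integration} {
		fmt.Printf("%s on %s: supported=%v\n", s.Name, l.Name, Supports(l, s))
	}
}
```

A scenario that only one layer can support gains nothing from being written against the common interface, which matches the suggestion to leave those tests as they are.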

Will continue the following tests tomorrow..

serathius commented 2 years ago

Makes sense, those tests don't make sense for integration tests and will require a different approach. Let's leave them for now. Thanks for pointing this out.

vimalk78 commented 2 years ago

anyone working on

serathius commented 2 years ago

It got closed because one of the merged PRs included the text fixes #issue-id in its top comment, which is a very (nice/bad) GitHub feature that allows anyone to close an issue. I didn't notice it and merged the PR, so I was marked as the actor closing the issue. Sorry.

chaochn47 commented 2 years ago

Oops. I will avoid the fixes #issue-id statement in the following test migration PRs then. I did not realize that functionality ==

clement2026 commented 2 years ago

I'm working on:

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

padlar commented 1 year ago

Going to work on e2e/ctl_v3_grpc_test.go

tayaleelin commented 1 year ago

I'm a new contributor. I will try migrating the test: e2e/discovery_test.go

cdalar commented 1 year ago

New Contributor. Will try to migrate e2e/ctl_v3_lease_test.go