etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

Deflake etcd tests using K8s practices #16225

Open serathius opened 1 year ago

serathius commented 1 year ago

What would you like to be added?

Use practices from the Kubernetes (K8s) project to find flaky etcd tests. Instructions: https://gist.github.com/liggitt/6a3a2217fa5f846b52519acfc0ffece0#running-unit-tests-to-reproduce-flakes

Possibly enhance etcd testing practices.

Why is this needed?

It's fairly hard to reproduce CI flakes locally. We can use K8s practices to improve our own.
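The core idea behind reproducing a flake is to run the same check many times, in parallel, and count how often it fails. Below is a minimal Go sketch of that idea, not the tool from the linked instructions and not etcd's own tooling. The names `runStress`, `stressResult`, and `errSimulated` are hypothetical, and the "flaky" check is a deterministic stand-in so the example is reproducible.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// errSimulated is a hypothetical error used by the stand-in flaky check.
var errSimulated = errors.New("simulated flake")

// stressResult summarizes repeated runs of one check.
type stressResult struct {
	Runs     int
	Failures int
}

// runStress calls fn `runs` times, spread over `parallel` workers,
// and counts how many calls returned an error.
func runStress(runs, parallel int, fn func(i int) error) stressResult {
	if parallel < 1 {
		parallel = 1 // at least one worker, otherwise sending on jobs would block forever
	}
	var failures int64
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < parallel; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				if err := fn(i); err != nil {
					atomic.AddInt64(&failures, 1)
				}
			}
		}()
	}
	for i := 0; i < runs; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return stressResult{Runs: runs, Failures: int(failures)}
}

func main() {
	// Stand-in for a flaky test: fails on every 50th iteration.
	flaky := func(i int) error {
		if i%50 == 49 {
			return errSimulated
		}
		return nil
	}
	res := runStress(1000, 8, flaky)
	fmt.Printf("%d runs, %d failures\n", res.Runs, res.Failures) // prints "1000 runs, 20 failures"
}
```

A real flake would fail nondeterministically, so the failure count would vary between invocations; the useful signal is whether it is ever non-zero.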

fuweid commented 1 year ago

I have been using taskset to simulate the GitHub VM. Good to know there is a tool named stress. Is this going to introduce a new nightly workflow that runs the tests under stress and reports the results?
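For Go code, a related but different technique to pinning CPUs with taskset is lowering `GOMAXPROCS`, which caps how many OS threads execute Go code at once. It does not set CPU affinity, so it only approximates a small CI machine. The sketch below assumes a limit of 2, which is an illustrative value and not a confirmed runner size; `withMaxProcs` and `currentMaxProcs` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentMaxProcs reports the current GOMAXPROCS value without changing it
// (passing 0 to runtime.GOMAXPROCS only queries the setting).
func currentMaxProcs() int {
	return runtime.GOMAXPROCS(0)
}

// withMaxProcs runs fn with GOMAXPROCS temporarily set to n,
// then restores the previous value.
func withMaxProcs(n int, fn func()) {
	prev := runtime.GOMAXPROCS(n)
	defer runtime.GOMAXPROCS(prev)
	fn()
}

func main() {
	before := currentMaxProcs()
	// Assumed limit of 2 to approximate a small CI VM.
	withMaxProcs(2, func() {
		fmt.Println("GOMAXPROCS inside:", currentMaxProcs()) // prints "GOMAXPROCS inside: 2"
	})
	fmt.Println("restored:", currentMaxProcs() == before) // prints "restored: true"
}
```

The `go test` command also accepts a `-cpu` flag that sets `GOMAXPROCS` for the tests it runs, which may be more convenient than changing it in code.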

There are existing flaking cases.

| Timestamp | Test Case Name | Package | Count | URL |
| --- | --- | --- | --- | --- |
| 2023-07-09T00:05:18.4191342Z | TestLeasingPutGetDeleteConcurrent | go.etcd.io/etcd/tests/v3/integration/clientv3/lease | 1 | https://github.com/etcd-io/etcd/actions/runs/5483999136 |
| 2023-07-09T00:05:18.4192094Z | TestV3AuthWithLeaseRevokeWithRootJWT | go.etcd.io/etcd/tests/v3/integration | 1 | https://github.com/etcd-io/etcd/actions/runs/5479149552 |

dejanzele commented 9 months ago

Hi @serathius @jmhbnz, I helped fix some flaky tests in Kubernetes, so I am familiar with how to find and reproduce flaky behaviour locally. If this task is still relevant, I could go through the codebase and try to find some. I am interested in contributing to etcd, so this might be a good opportunity to get to know the codebase.

jmhbnz commented 9 months ago

Hey @dejanzele - Thanks for your interest in contributing to etcd!

We certainly need help ironing out some tests. I would suggest keeping an eye on our issue feed for reported flakes: https://github.com/etcd-io/etcd/issues?q=is%3Aopen+is%3Aissue+label%3Atype%2Fflake+no%3Aassignee

Alternatively, you could take a look at failed workflows in the Actions list and hunt for flakes there. Here is a recent one: https://github.com/etcd-io/etcd/actions/runs/7170857151/job/19524547673

Thanks! 🙏🏻

stale[bot] commented 6 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.