Open serathius opened 9 months ago
Hi, I'd like to take this up!
Thanks, feel free to reach out to me on Slack (get an invite) if you need any help.
/assign @nitishfy
https://github.com/etcd-io/etcd/pull/17423 covered a single node cluster, we should also cover:
- 3 node cluster
- different TLS configurations.
We can raise a new PR for it. WDYT?
SG, thanks
Also, when we are done, we should consider backporting the test.
Still remaining to be completed:
I think I can take this.
/assign
done 🙂 I think this can be closed now.
Awesome, thanks @ghouscht !
It looks like the test doesn't enable TLS by default. Should we also extend the test to support the case of enabling TLS?
Good point, I missed https://github.com/etcd-io/etcd/issues/17329#issuecomment-1943954845.
Oh I missed that as well, sorry! I'll have a look later on and will prepare another PR.
Something like this: https://github.com/etcd-io/etcd/pull/18819? Or did you have something different in mind?
What would you like to be added?
Etcd should not write any error logs during normal operation. We could add a test that starts an etcd cluster, performs some simple operations, and verifies that no error-level logs were written.
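A rough sketch of the log check such a test could run on the captured server output. It assumes the server writes one JSON object per line with a `level` key; `logEntry` and `findErrorLogs` are hypothetical names for illustration, not existing etcd test helpers.

```go
package main

import (
	"encoding/json"
	"strings"
)

// logEntry holds the only field this sketch inspects.
// Assumption: each log line is a JSON object with a "level" key.
type logEntry struct {
	Level string `json:"level"`
}

// findErrorLogs (hypothetical helper) returns the lines whose level is
// "error" or more severe. Lines that are not valid JSON are skipped.
func findErrorLogs(lines []string) []string {
	var found []string
	for _, line := range lines {
		var e logEntry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			continue // not a JSON log line; ignored in this sketch
		}
		switch strings.ToLower(e.Level) {
		case "error", "dpanic", "panic", "fatal":
			found = append(found, line)
		}
	}
	return found
}
```

A test would collect the lines each member printed, call `findErrorLogs`, and fail if the result is non-empty, printing the offending lines.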
This could be further improved by enabling this check in all e2e tests when the cluster/server is being closed. As some tests intentionally inject errors, we could make it a configurable option that is enabled by default and disabled in all tests where we expect errors to be written.
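One possible shape for that opt-out, sketched with functional options. All names here (`clusterConfig`, `Option`, `WithErrorLogsAllowed`, `checkOnClose`) are hypothetical and do not refer to the existing e2e framework API.

```go
package main

import "fmt"

// clusterConfig is a hypothetical stand-in for an e2e cluster configuration.
type clusterConfig struct {
	assertNoErrorLogs bool
}

// Option mutates the configuration when the cluster is created.
type Option func(*clusterConfig)

// WithErrorLogsAllowed disables the check for tests that inject errors on purpose.
func WithErrorLogsAllowed() Option {
	return func(c *clusterConfig) { c.assertNoErrorLogs = false }
}

// newClusterConfig enables the check by default, then applies the overrides.
func newClusterConfig(opts ...Option) clusterConfig {
	cfg := clusterConfig{assertNoErrorLogs: true}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

// checkOnClose would be called when the cluster shuts down. It returns an
// error if the check is enabled and any error-level log lines were collected.
func (c clusterConfig) checkOnClose(errorLogs []string) error {
	if c.assertNoErrorLogs && len(errorLogs) > 0 {
		return fmt.Errorf("unexpected error logs: %d line(s), first: %s", len(errorLogs), errorLogs[0])
	}
	return nil
}
```

With this layout, existing tests pick up the check without changes, and only tests that expect errors need to pass `WithErrorLogsAllowed()`.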
Why is this needed?
Logs are used to provide useful debug information to users. However, if they are mislabeled, they can cause users to lose trust and ignore errors. For error logs to be useful, they should only be emitted when something is actually wrong.
We should prevent issues like https://github.com/etcd-io/etcd/issues/17245 where a change in code started generating a large number of error logs.
Proposed in https://github.com/etcd-io/etcd/pull/17249#issuecomment-1893399469 with agreement between maintainers.