Closed coffeebe4code closed 1 year ago
Hi @coffeebe4code Thanks for the report
Could you please clarify why the error log points to a compaction or snapshot bug, or just describe the failure symptoms from your point of view?
From the pasted log, f9fd was the leader and was removed from the cluster membership. Then f06b and 7c24 elected f06b as the new leader. This behavior looks like it is working as expected.
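To confirm the membership change and the current leader described above, you can query the cluster directly. A minimal sketch, assuming `etcdctl` v3 is available inside one of the pods (endpoint addresses are illustrative):

```shell
# List current cluster members; a removed member (e.g. f9fd) will no
# longer appear here.
etcdctl member list -w table

# Show per-endpoint status across the cluster; the IS LEADER column
# identifies which member won the election.
etcdctl endpoint status --cluster -w table
```

Comparing this output before and after the restarts would show whether the membership changes in the log were expected operations or a symptom of the failure.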
I don't see any etcd issue, and there is a lot of unrelated info.
I would suggest raising an issue with the Bitnami community and triaging it there first.
Please feel free to raise an issue with etcd related log and configuration if you see any etcd issue.
I'm sorry, but I don't understand why this should be pushed to Bitnami. There should be nothing that causes etcd to cycle itself every other day.
I believe I have adequately described the failure symptoms, and provided logs.
If there had truly been a connection loss among the peers, or a configuration issue (with which I would still want your help), it wouldn't cycle all 3 pods so methodically, nor produce ample log info and warnings for several minutes leading up to the cycle.
If you need more information, please let me know what it is and how I can obtain it.
Please feel free to raise a new issue with complete etcd logs instead of just a couple of screenshots.
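One way to gather the complete logs requested above is to pull them from each pod, including the previous (crashed) container instance. A minimal sketch, assuming a 3-node StatefulSet with illustrative pod names:

```shell
# Collect full logs from each etcd pod; --previous captures the log of
# the container instance that crashed, which is usually the most useful.
for pod in etcd-0 etcd-1 etcd-2; do
  kubectl logs "$pod" > "$pod.log"
  kubectl logs "$pod" --previous > "$pod.previous.log" 2>/dev/null
done
```

Attaching these full log files (rather than screenshots) makes it possible to correlate the timing of events across all three members.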
What happened?
Over the course of several minutes, errors of increasing severity appear, until there is essentially a total crash.
"switched to configuraiton voters" log is the first log in over an hour. So I assume this is where the problem starts. Another error slightly after this one signifies the beginning of the end.
A new leader candidate has emerged.
The end error signifies one pod spinning up. Here is the order over the course of the next 3 minutes, where A, B, C are the existing pods and X, Y, Z are the new ones:

1. A, B, C are alive
2. X comes up, A dies
3. Y comes up, B dies
4. Z comes up, C dies
As far as which one is the leader, I don't know. This is an extremely basic setup from my understanding.
What did you expect to happen?
The cluster should not crash.
How can we reproduce it (as minimally and precisely as possible)?
Providing Helm information below.
Anything else we need to know?
No response
Etcd version (please run commands below)
Used the Helm chart; etcd version is 3.5.4.
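The issue template's version commands can also be run inside one of the pods to confirm the exact binary versions the chart deployed:

```shell
# Report the etcd server and client versions.
etcd --version
etcdctl version
```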
Etcd configuration (command line flags or environment variables)
Helm chart values.
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
Everything else seems standard.
Here are the overrides.