Open kvaps opened 2 months ago
I would definitely like to drop these steps altogether.
- Check cluster-state configmap
  - if configmap exists and initial-cluster-members defined
  - if there are any hostnames defined in initial-cluster-members
  - take the hostname of pod with highest number and +1
  - save value into guessed variable
This seems redundant, as we already have this info from checking the Endpoints object:
- read pods that fall under StatefulSet label selector
  - if there are any pods
  - take the pod name with highest number and +1
  - if value is greater than value in guessed, save value into guessed variable
I don't like this step at all:
if value is greater than value in guessed, save value into guessed variable
IMO, if we found a value from a reliable source, such as member list, we should never fall back to a less reliable source, such as "number of endpoints". Only if the more reliable source is unavailable (e.g. we cannot get member list due to lack of quorum) should we try guessing the right number of replicas from Endpoints or PVCs.
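To make the suggestion concrete, here is a minimal sketch of "first reliable source wins". All names (`guess_replicas`, `from_member_list`, and so on) are hypothetical stand-ins, and the returned numbers are made up for illustration.

```python
from typing import Callable, Optional, Sequence

# A source returns a replica count, or None when it cannot answer
# (for example, the member list is unreachable due to lack of quorum).
Source = Callable[[], Optional[int]]


def guess_replicas(sources: Sequence[Source]) -> int:
    """Return the answer of the first source that has one.

    `sources` must be ordered from most to least reliable.
    """
    for source in sources:
        value = source()
        if value is not None:
            return value
    return 0  # nothing is known at all


# Hypothetical stand-ins for the real lookups.
def from_member_list() -> Optional[int]:
    return 3


def from_endpoints() -> Optional[int]:
    return 5


def no_quorum() -> Optional[int]:
    return None


print(guess_replicas([from_member_list, from_endpoints]))  # member list wins
print(guess_replicas([no_quorum, from_endpoints]))  # falls back to endpoints
```

Unlike a "keep the greater value" rule, a larger number from a less reliable source never overrides the member list here.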
@lllamnyp
I would definitely like to drop these steps:
Check cluster-state configmap
It is created at initialization and exists from then on. It should always contain correct information unless someone removes it, so why not use it?
read pods that fall under StatefulSet label selector
This seems redundant, as we already have this info from checking the Endpoints object
Do all our pods always get into the service endpoints? If so, this step can be omitted. Also, is there any chance that the service and endpoints do not exist when this check runs?
If we consider member list as a reliable source, then you're right, let's return it directly.
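v2:
- guessed=0
- [...] member list [...]
  - if value is greater than value in guessed, save value into guessed variable
- Check cluster-state configmap
  - if configmap exists and initial-cluster-members defined
  - if there are any hostnames defined in initial-cluster-members
  - take the hostname of pod with highest number and +1
  - save value into guessed variable
- [...]
  - if value is greater than value in guessed, save value into guessed variable
- return guessed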
The etcd-headless service will always have endpoints: it doesn't rely on readiness probes, so all created pods with IP addresses will be in the headless service. This service is ensured at the very beginning, so it must exist.
I personally do not like checking the cluster-state configmap, because in the past we agreed that it is a kind of cache and that it would be nice to get this info from the etcd PVCs. So the number of PVCs is, in my opinion, a more reliable source than the cluster-state cm. The cm can be checked, but only as a last resort.
Okay, it seems the cluster-state configmap check makes no sense, so removed:
v3:
- guessed=0
- [...] member list [...]
  - if value is greater than value in guessed, save value into guessed variable
- [...]
  - if value is greater than value in guessed, save value into guessed variable
- return guessed
LGTM
According to the latest meeting (2024-06-18 MINUTES), we decided that we need a function that guesses the needed number of etcd replicas. It can be used for recovering a non-existing STS object and also for scaling from 0.
Design ref: https://github.com/aenix-io/etcd-operator/pull/181
Proposal:
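- guessed=0
- Check cluster-state configmap
  - if configmap exists and initial-cluster-members defined
  - if there are any hostnames defined in initial-cluster-members
  - take the hostname of pod with highest number and +1
  - save value into guessed variable
- [...] member list [...]
  - if value is greater than value in guessed, save value into guessed variable
- [...]
  - if value is greater than value in guessed, save value into guessed variable
- [...]
  - if value is greater than value in guessed, save value into guessed variable
- [...]
  - if value is greater than value in guessed, save value into guessed variable
- return guessed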
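
The accumulation rule used throughout this proposal (start from guessed=0, and let each source replace guessed only when its value is greater) might be sketched as follows. The function name `guess_replicas_max` and the use of None for an unavailable source are assumptions made for illustration.

```python
from typing import Iterable, Optional


def guess_replicas_max(candidates: Iterable[Optional[int]]) -> int:
    """Keep the greatest value seen across all sources.

    Each candidate is the replica count reported by one source,
    or None when that source is unavailable.
    """
    guessed = 0
    for value in candidates:
        # "if value is greater than value in guessed, save value into guessed"
        if value is not None and value > guessed:
            guessed = value
    return guessed


# One source reports 3, one is unavailable, one reports 5.
print(guess_replicas_max([3, None, 5]))
```

With this rule the order of the sources does not matter: the largest reported value always wins.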