Closed kvaps closed 6 months ago
I don't like this idea, nor the podSpec field.
I think a simple implementation with a flat list of parameters inside spec would be easier to understand and use.
Some good examples in my opinion:
The proposed design closely matches the Kubernetes spec, but it looks like a lot of operators prefer a different, simpler approach.
I suggest building a few use cases, like:
Then we can look at both approaches and compare them more accurately.
Also, I think we need to be aware of the landscape around Kubernetes. Let's imagine this use case:
I think a simple implementation with a flat list of parameters inside spec would be easier to understand and use.
It makes sense; I just don't like the idea of parameters repeating themselves. It's better to have an extendable spec without sugar than sugar without the opportunity to extend :)
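To make the "extendable" point concrete, here is a minimal sketch of how a user-supplied podTemplate could be overlaid on operator defaults. The function name, the default values, and the plain recursive merge are all assumptions for illustration; this is not the operator's actual merge logic, and it is simpler than a Kubernetes strategic merge patch (lists are replaced, not merged by name).

```python
from copy import deepcopy


def merge_pod_template(defaults: dict, override: dict) -> dict:
    """Overlay `override` on `defaults`: nested dicts merge, other values are replaced."""
    merged = deepcopy(defaults)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_pod_template(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical operator defaults and a hypothetical user podTemplate fragment.
operator_defaults = {
    "metadata": {"labels": {"app": "etcd"}},
    "spec": {"terminationGracePeriodSeconds": 30, "nodeSelector": {}},
}
user_pod_template = {
    "metadata": {"labels": {"team": "storage"}},
    "spec": {"nodeSelector": {"disk": "ssd"}},
}

result = merge_pod_template(operator_defaults, user_pod_template)
```

With this shape, any pod field can be overridden without the operator adding a dedicated parameter for it, at the cost of a more verbose spec.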
The proposed design closely matches the Kubernetes spec, but it looks like a lot of operators prefer a different, simpler approach.
I would also mention the Elasticsearch operator and piraeus-operator v2 (the second major version, reworked based on the mistakes of v1); both use podTemplate to define default fields for pods:
I suggest building a few use cases, like:
- Simple cluster with a few replicas
- Cluster with custom image
- Cluster with persistent storage
- Cluster with resources
- Cluster with all fields specified
Then we can look at both approaches and compare them more accurately.
Great idea, I will prepare a PR with such examples and we can move the discussion into review.
I'm happy with the spec as of the time of this comment.
One thing I would like to note: a number of folks I have talked to who run etcd clusters run them purely on nodes with local disk storage. Note that this storage can be wiped when nodes are updated (and therefore reinitialized to an empty directory). Currently, I see that done as etcd pods with an emptyDir running in clustered mode.
The operator handles looking at the state of the cluster (etcd pods). If a pod dies or is removed, the operator calls into the etcd cluster to remove the member, then calls back into the cluster to add the member again and schedules the pod associated with it. Effectively, this allows pods to move between nodes, tolerate failures, and recover automatically.
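The recovery flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Action` and `plan_recovery` are hypothetical names, members and pods are assumed to share names, and a real controller would also have to check quorum and replace one member at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str  # "remove_member", "add_member" or "create_pod"
    name: str


def plan_recovery(members: set[str], running_pods: set[str]) -> list[Action]:
    """Return the ordered actions needed for every member that has no running pod."""
    actions: list[Action] = []
    for name in sorted(members - running_pods):
        # The member's data is gone (emptyDir), so drop the stale membership first,
        actions.append(Action("remove_member", name))
        # then register it again as a fresh member,
        actions.append(Action("add_member", name))
        # and finally schedule the pod that will join with an empty data directory.
        actions.append(Action("create_pod", name))
    return actions


plan = plan_recovery({"etcd-0", "etcd-1", "etcd-2"}, {"etcd-0", "etcd-2"})
```

Here `plan` contains the three actions for `etcd-1` only; a fully healthy cluster yields an empty plan.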
In the spec as it stands, I see we are using StatefulSets only. I don't think using a StatefulSet necessarily precludes this mode, but we need to be able to use local volumes. Additionally, the etcd controller then has to be smart enough to know when a node that an etcd pod/volume was scheduled to has been fully removed, and to recreate the PV/PVC for the local volume so that it can go on a different node.
More of a question: although this is not implemented in the controller backend yet (which I think can be extended), does the design support that use case?
Additionally, is there a mode where users can specify their own PKI (CA, certs, secrets) for the etcd cluster?
is there a mode where users can specify their own PKI (CA, certs, secrets) for the etcd cluster?
Yeah, right now there is a suggestion from @lllamnyp to add an additional security section for that, but not in this iteration:
spec:
  security:
    serverTLSSecretRef: # secretRef
      name: server-tls-secret
    clientCertAuth: true # bool
    trustedCAFile: # secretRef
      name: trusted-tls-secret # Client certificates may be signed by a different CA than server certificates
    peerTLSSecretRef: # secretRef
      name: peer-tls-secret # However, the CA for intra-cluster certificates is the same for both incoming and outgoing requests
    peerClientCertAuth: true # bool
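As a rough illustration of what such a section could translate to, here is a sketch that maps the suggested fields to etcd's TLS command-line flags. The etcd flags themselves are real; everything else is an assumption: the function name, the mount paths, and the secret keys (`tls.crt`, `tls.key`, `ca.crt`) are hypothetical and not taken from the operator.

```python
def etcd_tls_flags(security: dict, mount_root: str = "/etc/etcd/pki") -> list[str]:
    """Build etcd TLS flags from a `security` section (mount layout is assumed)."""
    flags: list[str] = []
    if "serverTLSSecretRef" in security:
        base = f"{mount_root}/server"
        flags += [f"--cert-file={base}/tls.crt", f"--key-file={base}/tls.key"]
    if security.get("clientCertAuth"):
        flags.append("--client-cert-auth=true")
    if "trustedCAFile" in security:
        # Client certificates may be signed by a different CA than the server's.
        flags.append(f"--trusted-ca-file={mount_root}/client-ca/ca.crt")
    if "peerTLSSecretRef" in security:
        base = f"{mount_root}/peer"
        # The peer CA is read from the same secret as the peer certificate.
        flags += [
            f"--peer-cert-file={base}/tls.crt",
            f"--peer-key-file={base}/tls.key",
            f"--peer-trusted-ca-file={base}/ca.crt",
        ]
    if security.get("peerClientCertAuth"):
        flags.append("--peer-client-cert-auth=true")
    return flags


security = {
    "serverTLSSecretRef": {"name": "server-tls-secret"},
    "clientCertAuth": True,
    "trustedCAFile": {"name": "trusted-tls-secret"},
    "peerTLSSecretRef": {"name": "peer-tls-secret"},
    "peerClientCertAuth": True,
}
flags = etcd_tls_flags(security)
```

An empty `security` section produces no flags, which would leave etcd serving plain HTTP as it does by default.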
The implementation design is discussed here: https://github.com/aenix-io/etcd-operator/pull/87
For now, most of the information on the auth and security topic is in this issue: https://github.com/aenix-io/etcd-operator/issues/76. The PR referred to in the previous comment will include the results of the discussion and the implementation.
We are going to release an MVP (v0.1.0), and we need a stable spec. Here is the corrected version of the spec from this proposal: https://github.com/aenix-io/etcd-operator/issues/62 (original author: @sergeyshevch).
I'm going to use this meta-issue to link all parts for implementing this: