Closed: fsalazarh closed this pull request 2 years ago.
Thanks for this contribution, it makes sense. At some point we'd want to find a way to feed the pod topology information back into CouchDB's node metadata so we can be sure that the actual shard replicas are distributed to different placement zones when we have more cluster nodes than the defined replica level. But that can come later.
Can you generate a new chart version with `make publish` and update the PR? It might also be good to update couchdb/README.md, documenting the new parameter.
Hi @kocolosk, I updated the chart version and added the parameter to the docs.

> At some point we'd want to find a way to feed the pod topology information back into CouchDB's node metadata

I've worked on automating this during cluster initialization, following the approach of running the procedure from a Helm post-install job. I can open a new PR for this feature if you consider it useful.
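A minimal sketch of what such a post-install hook could look like, assuming the approach of writing a zone attribute into each node's document in the _nodes database; the Job name, service host, credentials, node name, and zone value are all placeholders rather than the contributor's actual implementation:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: couchdb-zone-metadata            # placeholder name
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: tag-zones
          image: curlimages/curl:8.5.0
          env:
            - name: COUCHDB_URL
              # Placeholder: in a chart this would be templated from the release
              # name and pull credentials from the admin secret.
              value: "http://admin:password@couchdb-svc-couchdb:5984"
          command:
            - sh
            - -c
            - |
              # Write a "zone" attribute into one node's document in the _nodes
              # database. A real job would loop over every StatefulSet member and
              # derive ZONE from the topology.kubernetes.io/zone label of the node
              # each pod landed on; both values are hard-coded here for illustration.
              NODE="couchdb@couchdb-0.couchdb.default.svc.cluster.local"
              ZONE="us-east-1a"
              REV=$(curl -sf "$COUCHDB_URL/_node/_local/_nodes/$NODE" \
                    | sed -n 's/.*"_rev":"\([^"]*\)".*/\1/p')
              curl -sf -X PUT "$COUCHDB_URL/_node/_local/_nodes/$NODE" \
                -H "Content-Type: application/json" \
                -d "{\"_rev\": \"$REV\", \"zone\": \"$ZONE\"}"
```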
Thanks @fsalazarh, this looks good. Just had a question about your comment in values.yaml regarding EKS integration.
Thanks @kocolosk, I updated the comment in values.yaml to avoid confusion.
Thanks! If you wanted to investigate a deeper integration into CouchDB's shard placement logic, I think there are basically two steps:

- Add a `[cluster] placement` config entry that says to put a replica in each labeled zone
- Give each node a `"zone"` attribute recording the zone where it's been scheduled

The second step is certainly the trickier part, as it needs to be done e.g. in a sidecar post-scheduling (and will Kubernetes reschedule StatefulSet members into different zones over time?).
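For reference, the first step could be expressed through the chart's couchdbConfig block, assuming it renders directly into the generated CouchDB ini file; the zone names below are placeholders and would have to match the zone attributes written into the node documents:

```yaml
couchdbConfig:
  cluster:
    # Keep one shard replica in each labeled zone.
    placement: "us-east-1a:1,us-east-1b:1,us-east-1c:1"
```

The second step has no declarative equivalent, which is why a sidecar or hook running after scheduling was suggested above.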
What this PR does / why we need it:
It extends the Helm chart parameters to support the Kubernetes Pod Topology Spread Constraints feature (topologySpreadConstraints). This allows us to distribute CouchDB nodes (pods) evenly across, for example, the different AWS availability zones.
The above is not as straightforward to achieve with `podAntiAffinity`, so `topologySpreadConstraints` is the best way to achieve this today. From the docs:
> You can use topology spread constraints to control how Pods are spread across your cluster among failure-domains such as regions, zones, nodes, and other user-defined topology domains. This can help to achieve high availability as well as efficient resource utilization.
More detailed information about `topologySpreadConstraints` is available here. This is important because, combined with the ability to configure the placement of the CouchDB shards, it gives us control over how shards are distributed across zones, achieving high availability for each shard.
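For illustration, the new parameter might be set along these lines in values.yaml; the key name and the label selector shown here are assumptions about the chart rather than its documented interface, while topology.kubernetes.io/zone is the standard zone label:

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone   # spread pods across availability zones
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: couchdb   # assumption: matches the labels the chart puts on its pods
```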
Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged):

Special notes for your reviewer:
Checklist
[Place an '[x]' (no spaces) in all applicable fields. Please remove unrelated fields.]