Altinity / clickhouse-operator

Altinity Kubernetes Operator for ClickHouse creates, configures and manages ClickHouse clusters running on Kubernetes
https://altinity.com
Apache License 2.0

Why creating service and statefulset for each shard and replica? #1082

Closed lazywei closed 1 year ago

lazywei commented 1 year ago

Hi, I'm trying out the operator by running an example such as https://github.com/Altinity/clickhouse-operator/blob/7ec06b79eb70430d546f9a085c92a5339c8f3e4a/docs/chi-examples/03-persistent-volume-01-default-volume.yaml

My questions after creating the cluster:

  1. It seems the operator creates a Service and a StatefulSet for each shard and replica: for example, I have the services chi-clickhouse-replicated-0-0, chi-clickhouse-replicated-1-0, chi-clickhouse-replicated-0-1, and chi-clickhouse-replicated-1-1, and I also have four StatefulSets. Is that intended?
  2. Is there a way to disable these Services if I don't need them? Couldn't we just connect to each pod by its pod name?
  3. Why is there always a trailing -0 in the pod name? I saw that this is specified in podNamePattern in https://github.com/Altinity/clickhouse-operator/blob/77025ca8d774b46d8381f5b45b92275d28fb2d45/pkg/model/namer.go#L91, but I'm not quite sure what its purpose is.

Thanks!

Slach commented 1 year ago

> It seems we create a service and a statefulset for each of the shard and replica (for example I have services chi-clickhouse-replicated-0-0, chi-clickhouse-replicated-1-0, chi-clickhouse-replicated-0-1, chi-clickhouse-replicated-1-1, and I also have four statefulsets)?

Yes, this is a core design decision. It lets us manage each StatefulSet flexibly and roll out updates in a controlled way. The Services' DNS names are used when the operator generates the ClickHouse configuration, which lets servers reconnect to each other after one of the clickhouse-server instances restarts, something that happens regularly in the dynamic Kubernetes world.

> Is there a way to disable them if I don't need these services? Couldn't we just connect to each of the pod via their pod name?

That approach doesn't make sense. You don't know the pod's current IP (it can change), and you need a Service if you want to use a DNS name for the pod.
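To illustrate why the Service names matter, here is a minimal sketch of the stable DNS names that Kubernetes gives to the per-replica Services listed in the question. The `<service>.<namespace>.svc.<cluster-domain>` form is standard Kubernetes DNS; the namespace `default`, the cluster domain `cluster.local`, and the helper name `service_fqdn` are assumptions for illustration, not something taken from the operator's code.

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the cluster-internal DNS name of a Kubernetes Service.

    The name stays the same when the backing pod restarts and gets a new IP.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Service names quoted in this thread; the namespace "default" is an assumption.
services = [
    "chi-clickhouse-replicated-0-0",
    "chi-clickhouse-replicated-0-1",
    "chi-clickhouse-replicated-1-0",
    "chi-clickhouse-replicated-1-1",
]

hosts = [service_fqdn(name, "default") for name in services]

for host in hosts:
    print(host)
```

A generated ClickHouse config can refer to hostnames like these instead of pod IPs, so a restarted pod is reachable again under the same name.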

> Why there is always a trailing -0 in the pod name? I saw this is specified in the podNamePattern in

clickhouse-operator doesn't manage pods directly. Instead, the operator manages StatefulSets with replicas: 1. Each StatefulSet generates a pod and its PVC, and these are managed by the Kubernetes controller-manager.

The -0 suffix is the standard naming convention for pods inside a StatefulSet: pods are named <statefulset-name>-<ordinal>, and with replicas: 1 the only ordinal is 0. We just follow this convention.
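The naming can be sketched as follows. The `<statefulset-name>-<ordinal>` pod name is standard Kubernetes behaviour; the `chi-<chi>-<cluster>-<shard>-<replica>` pattern is only inferred from the names quoted in this thread (assuming the installation is called `clickhouse` and the cluster `replicated`), and both helper functions are hypothetical, not the operator's namer.

```python
def statefulset_name(chi: str, cluster: str, shard: int, replica: int) -> str:
    """Per-host StatefulSet name; the pattern is inferred from this thread (assumption)."""
    return f"chi-{chi}-{cluster}-{shard}-{replica}"


def pod_name(sts_name: str, ordinal: int = 0) -> str:
    """Kubernetes names StatefulSet pods <statefulset-name>-<ordinal>.

    With replicas: 1 the only ordinal is 0, hence the trailing -0.
    """
    return f"{sts_name}-{ordinal}"


for shard in range(2):
    for replica in range(2):
        sts = statefulset_name("clickhouse", "replicated", shard, replica)
        print(sts, "->", pod_name(sts))
```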

lazywei commented 1 year ago

I see, thanks. Is the reason for creating one StatefulSet per pod, instead of a single StatefulSet with replica=N, that we want more fine-grained control over things like pod/node affinity?

Slach commented 1 year ago

The main reason is flexibility. The operator controls changes to each StatefulSet (each StatefulSet runs only one clickhouse-server instance), instead of leaving the Kubernetes controller-manager to control every pod inside a single StatefulSet. For example, this allows us to implement pod spreading across AZs or hosts (look

The design decisions are described in a Russian-language article: https://habr.com/ru/post/523378/

https://translated.turbopages.org/proxy_u/ru-en.ru.e8f90585-63d2d05e-f29b7fdb-74722d776562/https/habr.com/ru/post/523378/
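The one-StatefulSet-per-host layout discussed in this thread can be sketched roughly as below. This is an illustration under stated assumptions, not the operator's actual reconcile logic: the manifest fields (`apps/v1`, `replicas`, `serviceName`, `nodeSelector`, the `topology.kubernetes.io/zone` label) are standard Kubernetes, while the zone names, the `app` label, the container image choice, the replica-to-zone assignment rule, and both function names are hypothetical.

```python
def host_statefulset(name: str, zone: str) -> dict:
    """Build a minimal StatefulSet manifest that runs exactly one pod."""
    labels = {"app": name}  # hypothetical label, for illustration only
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,  # one clickhouse-server instance per StatefulSet
            "serviceName": name,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Each host can carry its own placement rule.
                    "nodeSelector": {"topology.kubernetes.io/zone": zone},
                    "containers": [
                        {"name": "clickhouse", "image": "clickhouse/clickhouse-server"}
                    ],
                },
            },
        },
    }


def build_manifests(shards: int, replicas: int, zones: list) -> list:
    """One manifest per (shard, replica); replicas of a shard go to different zones."""
    manifests = []
    for shard in range(shards):
        for replica in range(replicas):
            name = f"chi-clickhouse-replicated-{shard}-{replica}"
            zone = zones[replica % len(zones)]  # hypothetical assignment rule
            manifests.append(host_statefulset(name, zone))
    return manifests


manifests = build_manifests(shards=2, replicas=2, zones=["zone-a", "zone-b"])
for m in manifests:
    zone = m["spec"]["template"]["spec"]["nodeSelector"]["topology.kubernetes.io/zone"]
    print(m["metadata"]["name"], zone)
```

Because every host has its own manifest, placement and updates can differ per host, which a single StatefulSet with replicas: N and one shared pod template does not allow.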