Altinity / clickhouse-operator

Altinity Kubernetes Operator for ClickHouse creates, configures and manages ClickHouse clusters running on Kubernetes
https://altinity.com
Apache License 2.0

How to deploy clickhouse along with clickhouse-keeper using the clickhouse operator? #959

Open kant777 opened 2 years ago

kant777 commented 2 years ago

Currently the docs still point to ZooKeeper and say that ZooKeeper is required, while ClickHouse 22.3 declares ClickHouse Keeper production ready. I do see some Kubernetes manifests for ClickHouse Keeper, but they imply running a separate ClickHouse Keeper cluster, just like ZooKeeper.

I am looking to run a ClickHouse cluster with ClickHouse Keeper embedded in some of the ClickHouse nodes. It would be great to have an example, either using Kubernetes directly or through the clickhouse-operator.

alexvanolst commented 2 years ago

It's a bit heavy on configuration, but it is possible. I'm hoping the operator will get support for this pretty soon, but for now it can be added manually.

Here's what I did to get it working:

rgarcia89 commented 2 years ago

@alexvanolst Is there an example somewhere? I can't follow what to do in the last step.

davidspek commented 1 year ago

I haven't fully tested this yet, but initially it seems to work. Sorry for the bad formatting here, but I thought I'd share it quickly so others can at least play around with it. The value for the ZooKeeper servers is set as below:

servers:
    - host: service-posthog-clickhouse-0-0
      port: 9181
    - host: service-posthog-clickhouse-0-1
      port: 9181
    - host: service-posthog-clickhouse-0-2
      port: 9181

These hosts are generated from the replica serviceTemplate (the one with generateName: service-{chi}-{shard}-{replica}).
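
For reference, in the chart's values file these entries presumably live under clickhouse.externalZookeeper, which is the key the zookeeper section of the template below reads; a minimal sketch using the same hosts and keeper client port as above:

clickhouse:
  externalZookeeper:
    servers:
      - host: service-posthog-clickhouse-0-0
        port: 9181
      - host: service-posthog-clickhouse-0-1
        port: 9181
      - host: service-posthog-clickhouse-0-2
        port: 9181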

I based this on the example yaml from here. It could use some cleanup (like creating the config file in an init container with a shared volume instead), but I haven't gotten to that yet.

{{- if .Values.clickhouse.enabled }}
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: {{ template "posthog-plural.name" . }}-clickhouse
spec:
  defaults:
    templates:
      serviceTemplate: service-template
      replicaServiceTemplate: replica-service-template
  configuration:
    users:
      {{- template "clickhouse.passwordValue" . }}
      {{ .Values.clickhouse.user }}/networks/ip:
        {{- range $.Values.clickhouse.allowedNetworkIps }}
        - {{ . | quote }}
        {{- end }}
      {{ .Values.clickhouse.user }}/profile: default
      {{ .Values.clickhouse.user }}/quota: default
      {{- if .Values.clickhouse.backup.enabled }}
      {{ .Values.clickhouse.backup.backup_user }}/networks/ip: "0.0.0.0/0"
      {{ template "clickhouse.backupPasswordValue" . }}
      {{- end}}
      {{- if .Values.clickhouse.additionalUsersConfig }}
      {{- .Values.clickhouse.additionalUsersConfig | toYaml | nindent 6 }}
      {{- end}}
    profiles:
      {{- merge dict .Values.clickhouse.profiles .Values.clickhouse.defaultProfiles | toYaml | nindent 6 }}

    clusters:
      - name: {{ .Values.clickhouse.cluster | quote }}
        templates:
          podTemplate: pod-template
          clusterServiceTemplate: cluster-service-template
          {{- if and (.Values.clickhouse.persistence.enabled) (not .Values.clickhouse.persistence.existingClaim) }}
          dataVolumeClaimTemplate: data-volumeclaim-template
          {{- end }}
        layout:
          {{- toYaml .Values.clickhouse.layout | nindent 10 }}

    settings:
      {{- merge dict .Values.clickhouse.settings .Values.clickhouse.defaultSettings | toYaml | nindent 6 }}

    files:
      events.proto: |
        syntax = "proto3";
        message Event {
          string uuid = 1;
          string event = 2;
          string properties = 3;
          string timestamp = 4;
          uint64 team_id = 5;
          string distinct_id = 6;
          string created_at = 7;
          string elements_chain = 8;
        }

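    # with an embedded keeper, these "zookeeper" nodes point at the keeper client
    # port (9181) exposed on the ClickHouse replica services themselves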
    zookeeper:
      nodes:
      {{- if .Values.clickhouse.externalZookeeper }}
        {{- toYaml .Values.clickhouse.externalZookeeper.servers | nindent 8 }}
      {{- end }}

  templates:
    podTemplates:
      - name: pod-template
        {{- if .Values.clickhouse.podAnnotations }}
        metadata:
          annotations: {{ toYaml .Values.clickhouse.podAnnotations | nindent 12 }}
        {{- end }}
        {{- if .Values.clickhouse.podDistribution }}
        podDistribution: {{ toYaml .Values.clickhouse.podDistribution | nindent 12 }}
        {{- end}}
        spec:
          {{- if .Values.clickhouse.affinity }}
          affinity: {{ toYaml .Values.clickhouse.affinity | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.tolerations }}
          tolerations: {{ toYaml .Values.clickhouse.tolerations | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.nodeSelector }}
          nodeSelector: {{ toYaml .Values.clickhouse.nodeSelector | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.topologySpreadConstraints }}
          topologySpreadConstraints: {{ toYaml .Values.clickhouse.topologySpreadConstraints | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.securityContext.enabled }}
          securityContext: {{- omit .Values.clickhouse.securityContext "enabled" | toYaml | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.image.pullSecrets }}
          imagePullSecrets:
            {{- range .Values.clickhouse.image.pullSecrets }}
            - name: {{ . }}
            {{- end }}
          {{- end }}

          containers:
            - name: clickhouse
              image: {{ template "posthog.clickhouse.image" . }}
              command:
                - /bin/bash
                - -c
                - /usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml
              ports:
                - name: http
                  containerPort: 8123
                - name: client
                  containerPort: 9000
                - name: interserver
                  containerPort: 9009
              {{- if .Values.clickhouse.persistence.enabled }}
              volumeMounts:
              {{- if .Values.clickhouse.persistence.existingClaim }}
                - name: existing-volumeclaim
              {{- else }}
                - name: data-volumeclaim-template
              {{- end }}
                  mountPath: /var/lib/clickhouse
              {{- end }}

              {{- if .Values.clickhouse.resources }}
              resources: {{ toYaml .Values.clickhouse.resources | nindent 16 }}
              {{- end }}
            {{- if .Values.clickhouse.backup.enabled }}
            - name: clickhouse-backup
              image: {{ template "posthog_backup.clickhouse.image" . }}
              imagePullPolicy: {{ .Values.clickhouse.backup.image.pullPolicy }}
              command:
                - /bin/bash
                - -c
                - /bin/clickhouse-backup server
              {{- with .Values.clickhouse.backup.env }}
              env:
                {{- toYaml . | nindent 16 }}
              {{- end}}
              ports:
                - name: backup-rest
                  containerPort: 7171
            {{- end }}
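      # pod template for the replicas that also run an embedded ClickHouse Keeper;
      # the startup command generates the keeper raft configuration from the pod
      # ordinal before starting clickhouse-server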
      - name: pod-template-clickhouse-keeper
        {{- if .Values.clickhouse.podAnnotations }}
        metadata:
          annotations: {{ toYaml .Values.clickhouse.podAnnotations | nindent 12 }}
        {{- end }}
        {{- if .Values.clickhouse.podDistribution }}
        podDistribution: {{ toYaml .Values.clickhouse.podDistribution | nindent 12 }}
        {{- end}}
        spec:
          {{- if .Values.clickhouse.affinity }}
          affinity: {{ toYaml .Values.clickhouse.affinity | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.tolerations }}
          tolerations: {{ toYaml .Values.clickhouse.tolerations | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.nodeSelector }}
          nodeSelector: {{ toYaml .Values.clickhouse.nodeSelector | nindent 12 }}
          {{- end }}
          {{- if .Values.clickhouse.topologySpreadConstraints }}
          topologySpreadConstraints: {{ toYaml .Values.clickhouse.topologySpreadConstraints | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.securityContext.enabled }}
          securityContext: {{- omit .Values.clickhouse.securityContext "enabled" | toYaml | nindent 12 }}
          {{- end }}

          {{- if .Values.clickhouse.image.pullSecrets }}
          imagePullSecrets:
            {{- range .Values.clickhouse.image.pullSecrets }}
            - name: {{ . }}
            {{- end }}
          {{- end }}

          containers:
            - name: clickhouse
              image: {{ template "posthog.clickhouse.image" . }}
              env:
              - name: KEEPER_SERVERS
                value: {{ .Values.clickhouse.layout.replicasCount | quote }}
              - name: RAFT_PORT
                value: "9444"
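              # the script below derives the keeper server_id from the pod ordinal,
              # writes a raft_configuration listing all $KEEPER_SERVERS peers into
              # /tmp/clickhouse-keeper/config.d/, and then starts clickhouse-server
              # (note: the domain regex assumes the "posthog" namespace)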
              command:
                - /bin/bash
                - -c
                - |
                  HOST=`hostname -s` &&
                  DOMAIN=`hostname -d` &&
                  if [[ $HOST =~ (.*)-([0-9]+)-([0-9]+)$ ]]; then
                      NAME=${BASH_REMATCH[1]}
                      ORD=${BASH_REMATCH[2]}
                      SUFFIX=${BASH_REMATCH[3]}
                  else
                      echo "Failed to parse name and ordinal of Pod"
                      exit 1
                  fi &&
                  if [[ $DOMAIN =~ (.*)-([0-9]+)(.posthog.svc.cluster.local)$ ]]; then
                      DOMAIN_NAME=${BASH_REMATCH[1]}
                      DOMAIN_ORD=${BASH_REMATCH[2]}
                      DOMAIN_SUFFIX=${BASH_REMATCH[3]}
                  else
                      echo "Failed to parse name and ordinal of Pod"
                      exit 1
                  fi &&
                  export MY_ID=$((ORD+1)) &&
                  mkdir -p /tmp/clickhouse-keeper/config.d/ &&
                  {
                    echo "<yandex><keeper_server>"
                    echo "<server_id>${MY_ID}</server_id>"
                    echo "<raft_configuration>"
                    for (( i=1; i<=$KEEPER_SERVERS; i++ )); do
                        echo "<server><id>${i}</id><hostname>$NAME-$((i-1))-${SUFFIX}.${DOMAIN_NAME}-$((i-1))${DOMAIN_SUFFIX}</hostname><port>${RAFT_PORT}</port></server>"
                    done
                    echo "</raft_configuration>"
                    echo "</keeper_server></yandex>"
                  } > /tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml &&
                  cat /tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml &&
                  /usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml

              ports:
                - name: http
                  containerPort: 8123
                - name: client
                  containerPort: 9000
                - name: interserver
                  containerPort: 9009
                - name: raft
                  containerPort: 9444
                - name: ch-keeper
                  containerPort: 9181
              {{- if .Values.clickhouse.persistence.enabled }}
              volumeMounts:
              {{- if .Values.clickhouse.persistence.existingClaim }}
                - name: existing-volumeclaim
              {{- else }}
                - name: data-volumeclaim-template
              {{- end }}
                  mountPath: /var/lib/clickhouse
              {{- end }}
              # configures probes for clickhouse keeper
              # without this, traffic is not sent through the service and clickhouse keeper cannot start
              readinessProbe:
                tcpSocket:
                  port: 9444
                initialDelaySeconds: 10
                timeoutSeconds: 5
                periodSeconds: 10
                failureThreshold: 3
              livenessProbe:
                tcpSocket:
                  port: 9181
                initialDelaySeconds: 30
                timeoutSeconds: 5
                periodSeconds: 10

              {{- if .Values.clickhouse.resources }}
              resources: {{ toYaml .Values.clickhouse.resources | nindent 16 }}
              {{- end }}
            {{- if .Values.clickhouse.backup.enabled }}
            - name: clickhouse-backup
              image: {{ template "posthog_backup.clickhouse.image" . }}
              imagePullPolicy: {{ .Values.clickhouse.backup.image.pullPolicy }}
              command:
                - /bin/bash
                - -c
                - /bin/clickhouse-backup server
              {{- with .Values.clickhouse.backup.env }}
              env:
                {{- toYaml . | nindent 16 }}
              {{- end}}
              ports:
                - name: backup-rest
                  containerPort: 7171
            {{- end }}

    serviceTemplates:
      - name: service-template
        generateName: service-{chi}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
            - name: clickhouse-keeper
              port: 9181
          type: {{ .Values.clickhouse.serviceType }}
      - name: cluster-service-template
        generateName: service-{chi}-{cluster}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
          type: ClusterIP
          clusterIP: None
      - name: replica-service-template
        generateName: service-{chi}-{shard}-{replica}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
            - name: interserver
              port: 9009
          type: ClusterIP
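      # per-replica service for keeper-enabled replicas: additionally exposes the
      # keeper client port (9181) and the raft port (9444)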
      - name: replica-service-template-clickhouse-keeper
        generateName: service-{chi}-{shard}-{replica}
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
            - name: interserver
              port: 9009
            - name: clickhouse-keeper
              port: 9181
            - name: raft
              port: 9444
          type: ClusterIP

    {{- if and (.Values.clickhouse.persistence.enabled) (not .Values.clickhouse.persistence.existingClaim) }}
    volumeClaimTemplates:
      - name: data-volumeclaim-template
        spec:
          {{- if .Values.clickhouse.persistence.storageClass }}
          storageClassName: {{ .Values.clickhouse.persistence.storageClass }}
          {{- end }}
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: {{ .Values.clickhouse.persistence.size | quote }}
    {{- end }}

{{- end }}
spoofedpacket commented 1 year ago

@DavidSpek That looks interesting. Is it part of a larger helm chart you could share? Thanks!

davidspek commented 1 year ago

@spoofedpacket Along the way we discovered some issues with the original solution I posted above, but I have now edited it. It's part of our helm chart for deploying PostHog using Plural. The current setup is working well and we are using it to run our PostHog production deployment. The template file for the ClickHouse instance can be found here, with the values being set here.

Some interesting things to note: to allow the number of shards to be increased for scaling, we are using dedicated pod and service templates for the ClickHouse Keeper replicas (see here and here), and the ClickHouse Keeper configuration file, along with those special templates, is applied to the 3 replicas of the first shard here.
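
For context, here is a rough sketch of what such a layout could look like in the chart values. This is my reading of the approach rather than the exact values file: the template names come from the ClickHouseInstallation template above, while the shard/replica names and counts are illustrative.

layout:
  replicasCount: 3
  shards:
    - name: "0"
      replicas:
        - name: "0"
          templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
        - name: "1"
          templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
        - name: "2"
          templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
    - name: "1"
      replicasCount: 3

Only the replicas of the first shard carry the keeper pod and service templates; additional shards fall back to the default pod-template, so adding shards does not change the keeper ensemble.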

davidspek commented 1 year ago

@spoofedpacket @alexvanolst I've since made a small helm chart that makes it easy to deploy ClickHouse using the operator, with support for the built-in ClickHouse Keeper. See https://github.com/pluralsh/module-library/tree/main/helm/clickhouse

Slach commented 1 year ago

@DavidSpek according to https://github.com/pluralsh/module-library/blob/main/helm/clickhouse/templates/clickhouse_instance.yaml#L282

is your embedded clickhouse-keeper installation stable now? Did you compare the performance of ZooKeeper vs clickhouse-keeper?

davidspek commented 1 year ago

@Slach It seems to be working correctly and we haven't had any issues with it. However, I'm not a ClickHouse expert, nor have I had the time to compare performance with ZooKeeper. I'm assuming upstream tests and performance evaluations will still be valid for this configuration. I do welcome any help with testing this setup from more experienced ClickHouse users. Last night I actually thought about how this could be handled by the operator, so when I have the time I might look into implementing it.

Slach commented 1 year ago

@DavidSpek roger that. Anyway, thank you for your efforts! ;-)

davidspek commented 1 year ago

@Slach Would you be open to a contribution that applies a similar configuration in the operator? There it would be possible to implement some more advanced logic for distributing the keeper instances across the nodes.

Slach commented 1 year ago

Yes, we are open to contributions.

kind: ClickHouseKeeper should be implemented as a separate CRD. Each instance of clickhouse-keeper shall be deployed as a separate StatefulSet (to allow it to be managed separately) and could be linked from inside the kind: ClickHouseInstallation and kind: ClickHouseInstallationTemplate CRDs.

davidspek commented 1 year ago

@Slach Sorry for not getting back to you sooner about this. While I see the value of having a separate CRD for ClickHouseKeeper, would you also be open to adding functionality for running an embedded ClickHouse Keeper within a ClickHouse installation? For me, the main benefit of an embedded ClickHouse Keeper is less maintenance and resource overhead, since it doesn't require dedicated pods.

Slach commented 1 year ago

@DavidSpek An embedded clickhouse-keeper will restrict your scalability, because you need fewer clickhouse-keeper instances than clickhouse-server instances.

For example, a typical clickhouse-server installation runs two replicas per shard, and the replicas could be in different DCs, while a usual clickhouse-keeper installation is 1 or 3 instances (an odd number of keeper instances per datacenter) so that a quorum is reached quickly and split-brain situations are avoided.
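
For comparison, with a separate keeper ensemble the ClickHouseInstallation would simply point its zookeeper section at that ensemble. A minimal sketch, with hypothetical hostnames for a 3-node clickhouse-keeper StatefulSet behind a headless service:

zookeeper:
  nodes:
    - host: clickhouse-keeper-0.clickhouse-keeper-headless
      port: 9181
    - host: clickhouse-keeper-1.clickhouse-keeper-headless
      port: 9181
    - host: clickhouse-keeper-2.clickhouse-keeper-headless
      port: 9181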

davidspek commented 1 year ago

@Slach I am aware of that, and also of the fact that in almost all cases you wouldn't want to scale Raft past 5 or 7 nodes due to performance issues. However, the operator could implement some smart logic for how many clickhouse-keeper nodes to run and how they are spread across the regular ClickHouse nodes. As a quick example:

If nodes < 3, run 1 keeper. If nodes >= 3 and <= 5, run 3 keepers.

Further checks could then determine whether the keepers can be split nicely across shards or replicas. It could even take the affinity rules into account so that the nodes running keeper end up in separate AZs.