kant777 opened this issue 2 years ago
It's a bit heavy on configuration, but it is possible. I'm hoping the operator will get support for this pretty soon, but for now it can be set up manually.
Here's what I did to get it working:
- `replicaServiceTemplate`: you'll need to expose ports for zookeeper (2181) as well as raft (9444)
- `serviceTemplate`: you'll just need to also expose 2181
- a `<keeper_server>` configuration that has the correct `<server_id>` for that pod, as well as the raft configuration pointing to all the other instances that will be running clickhouse keeper (roughly the shape sketched below)
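For reference, the per-pod file ends up with roughly this shape (a minimal, untested sketch: the hostnames are placeholders for whatever names resolve to the keeper pods, 2181/9444 are the ports mentioned above, and each pod must get its own server_id):

<yandex>
    <keeper_server>
        <tcp_port>2181</tcp_port>
        <server_id>1</server_id> <!-- unique per pod -->
        <raft_configuration>
            <server><id>1</id><hostname>clickhouse-0-0.example.svc</hostname><port>9444</port></server>
            <server><id>2</id><hostname>clickhouse-0-1.example.svc</hostname><port>9444</port></server>
            <server><id>3</id><hostname>clickhouse-0-2.example.svc</hostname><port>9444</port></server>
        </raft_configuration>
    </keeper_server>
</yandex>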

@alexvanolst Is there somewhere an example? I cannot follow on what to do in the last step.
I haven't fully tested this yet, but initially it seems to work. Sorry for the bad formatting here, but thought I'd share it quickly so others can at least play around with this. The value for the Zookeeper server is set as below:
servers:
- host: service-posthog-clickhouse-0-0
port: 9181
- host: service-posthog-clickhouse-0-1
port: 9181
- host: service-posthog-clickhouse-0-2
port: 9181
This host is generated from the serviceTemplate `service-template` (the one with `generateName: service-{chi}`).
I based this off of the example yaml from here. It can use some cleanup (like creating the config file in an init container instead with a shared volume) but I haven't gotten to that yet.
{{- if .Values.clickhouse.enabled }}
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: {{ template "posthog-plural.name" . }}-clickhouse
spec:
defaults:
templates:
serviceTemplate: service-template
replicaServiceTemplate: replica-service-template
configuration:
users:
{{- template "clickhouse.passwordValue" . }}
{{ .Values.clickhouse.user }}/networks/ip:
{{- range $.Values.clickhouse.allowedNetworkIps }}
- {{ . | quote }}
{{- end }}
{{ .Values.clickhouse.user }}/profile: default
{{ .Values.clickhouse.user }}/quota: default
{{- if .Values.clickhouse.backup.enabled }}
{{ .Values.clickhouse.backup.backup_user }}/networks/ip: "0.0.0.0/0"
{{ template "clickhouse.backupPasswordValue" . }}
{{- end}}
{{- if .Values.clickhouse.additionalUsersConfig }}
{{- .Values.clickhouse.additionalUsersConfig | toYaml | nindent 6 }}
{{- end}}
profiles:
{{- merge dict .Values.clickhouse.profiles .Values.clickhouse.defaultProfiles | toYaml | nindent 6 }}
clusters:
- name: {{ .Values.clickhouse.cluster | quote }}
templates:
podTemplate: pod-template
clusterServiceTemplate: cluster-service-template
{{- if and (.Values.clickhouse.persistence.enabled) (not .Values.clickhouse.persistence.existingClaim) }}
dataVolumeClaimTemplate: data-volumeclaim-template
{{- end }}
layout:
{{- toYaml .Values.clickhouse.layout | nindent 10 }}
settings:
{{- merge dict .Values.clickhouse.settings .Values.clickhouse.defaultSettings | toYaml | nindent 6 }}
files:
events.proto: |
syntax = "proto3";
message Event {
string uuid = 1;
string event = 2;
string properties = 3;
string timestamp = 4;
uint64 team_id = 5;
string distinct_id = 6;
string created_at = 7;
string elements_chain = 8;
}
zookeeper:
nodes:
{{- if .Values.clickhouse.externalZookeeper }}
{{- toYaml .Values.clickhouse.externalZookeeper.servers | nindent 8 }}
{{- end }}
templates:
podTemplates:
- name: pod-template
{{- if .Values.clickhouse.podAnnotations }}
metadata:
annotations: {{ toYaml .Values.clickhouse.podAnnotations | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.podDistribution }}
podDistribution: {{ toYaml .Values.clickhouse.podDistribution | nindent 12 }}
{{- end}}
spec:
{{- if .Values.clickhouse.affinity }}
affinity: {{ toYaml .Values.clickhouse.affinity | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.tolerations }}
tolerations: {{ toYaml .Values.clickhouse.tolerations | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.nodeSelector }}
nodeSelector: {{ toYaml .Values.clickhouse.nodeSelector | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.topologySpreadConstraints }}
topologySpreadConstraints: {{ toYaml .Values.clickhouse.topologySpreadConstraints | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.securityContext.enabled }}
securityContext: {{- omit .Values.clickhouse.securityContext "enabled" | toYaml | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.image.pullSecrets }}
imagePullSecrets:
{{- range .Values.clickhouse.image.pullSecrets }}
- name: {{ . }}
{{- end }}
{{- end }}
containers:
- name: clickhouse
image: {{ template "posthog.clickhouse.image" . }}
env:
command:
- /bin/bash
- -c
- /usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml
ports:
- name: http
containerPort: 8123
- name: client
containerPort: 9000
- name: interserver
containerPort: 9009
{{- if .Values.clickhouse.persistence.enabled }}
volumeMounts:
{{- if .Values.clickhouse.persistence.existingClaim }}
- name: existing-volumeclaim
{{- else }}
- name: data-volumeclaim-template
{{- end }}
mountPath: /var/lib/clickhouse
{{- end }}
{{- if .Values.clickhouse.resources }}
resources: {{ toYaml .Values.clickhouse.resources | nindent 16 }}
{{- end }}
{{- if .Values.clickhouse.backup.enabled }}
- name: clickhouse-backup
image: {{ template "posthog_backup.clickhouse.image" . }}
imagePullPolicy: {{ .Values.clickhouse.backup.image.pullPolicy }}
command:
- /bin/bash
- -c
- /bin/clickhouse-backup server
{{- with .Values.clickhouse.backup.env }}
env:
{{- toYaml . | nindent 16 }}
{{- end}}
ports:
- name: backup-rest
containerPort: 7171
{{- end }}
- name: pod-template-clickhouse-keeper
{{- if .Values.clickhouse.podAnnotations }}
metadata:
annotations: {{ toYaml .Values.clickhouse.podAnnotations | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.podDistribution }}
podDistribution: {{ toYaml .Values.clickhouse.podDistribution | nindent 12 }}
{{- end}}
spec:
{{- if .Values.clickhouse.affinity }}
affinity: {{ toYaml .Values.clickhouse.affinity | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.tolerations }}
tolerations: {{ toYaml .Values.clickhouse.tolerations | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.nodeSelector }}
nodeSelector: {{ toYaml .Values.clickhouse.nodeSelector | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.topologySpreadConstraints }}
topologySpreadConstraints: {{ toYaml .Values.clickhouse.topologySpreadConstraints | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.securityContext.enabled }}
securityContext: {{- omit .Values.clickhouse.securityContext "enabled" | toYaml | nindent 12 }}
{{- end }}
{{- if .Values.clickhouse.image.pullSecrets }}
imagePullSecrets:
{{- range .Values.clickhouse.image.pullSecrets }}
- name: {{ . }}
{{- end }}
{{- end }}
containers:
- name: clickhouse
image: {{ template "posthog.clickhouse.image" . }}
env:
- name: KEEPER_SERVERS
value: {{ .Values.clickhouse.layout.replicasCount | quote }}
- name: RAFT_PORT
value: "9444"
command:
- /bin/bash
- -c
- |
HOST=`hostname -s` &&
DOMAIN=`hostname -d` &&
if [[ $HOST =~ (.*)-([0-9]+)-([0-9]+)$ ]]; then
NAME=${BASH_REMATCH[1]}
ORD=${BASH_REMATCH[2]}
SUFFIX=${BASH_REMATCH[3]}
else
echo "Failed to parse name and ordinal of Pod"
exit 1
fi &&
if [[ $DOMAIN =~ (.*)-([0-9]+)(.posthog.svc.cluster.local)$ ]]; then
DOMAIN_NAME=${BASH_REMATCH[1]}
DOMAIN_ORD=${BASH_REMATCH[2]}
DOMAIN_SUFFIX=${BASH_REMATCH[3]}
else
echo "Failed to parse name and ordinal of Pod"
exit 1
fi &&
export MY_ID=$((ORD+1)) &&
mkdir -p /tmp/clickhouse-keeper/config.d/ &&
{
echo "<yandex><keeper_server>"
echo "<server_id>${MY_ID}</server_id>"
echo "<raft_configuration>"
for (( i=1; i<=$KEEPER_SERVERS; i++ )); do
echo "<server><id>${i}</id><hostname>$NAME-$((i-1))-${SUFFIX}.${DOMAIN_NAME}-$((i-1))${DOMAIN_SUFFIX}</hostname><port>${RAFT_PORT}</port></server>"
done
echo "</raft_configuration>"
echo "</keeper_server></yandex>"
} > /tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml &&
cat /tmp/clickhouse-keeper/config.d/generated-keeper-settings.xml &&
/usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml
ports:
- name: http
containerPort: 8123
- name: client
containerPort: 9000
- name: interserver
containerPort: 9009
- name: raft
containerPort: 9444
- name: ch-keeper
containerPort: 9181
{{- if .Values.clickhouse.persistence.enabled }}
volumeMounts:
{{- if .Values.clickhouse.persistence.existingClaim }}
- name: existing-volumeclaim
{{- else }}
- name: data-volumeclaim-template
{{- end }}
mountPath: /var/lib/clickhouse
{{- end }}
# configures probes for clickhouse keeper
# without this, traffic is not sent through the service and clickhouse keeper cannot start
readinessProbe:
tcpSocket:
port: 9444
initialDelaySeconds: 10
timeoutSeconds: 5
periodSeconds: 10
failureThreshold: 3
livenessProbe:
tcpSocket:
port: 9181
initialDelaySeconds: 30
timeoutSeconds: 5
periodSeconds: 10
{{- if .Values.clickhouse.resources }}
resources: {{ toYaml .Values.clickhouse.resources | nindent 16 }}
{{- end }}
{{- if .Values.clickhouse.backup.enabled }}
- name: clickhouse-backup
image: {{ template "posthog_backup.clickhouse.image" . }}
imagePullPolicy: {{ .Values.clickhouse.backup.image.pullPolicy }}
command:
- /bin/bash
- -c
- /bin/clickhouse-backup server
{{- with .Values.clickhouse.backup.env }}
env:
{{- toYaml . | nindent 16 }}
{{- end}}
ports:
- name: backup-rest
containerPort: 7171
{{- end }}
serviceTemplates:
- name: service-template
generateName: service-{chi}
spec:
ports:
- name: http
port: 8123
- name: tcp
port: 9000
- name: clickhouse-keeper
port: 9181
type: {{ .Values.clickhouse.serviceType }}
- name: cluster-service-template
generateName: service-{chi}-{cluster}
spec:
ports:
- name: http
port: 8123
- name: tcp
port: 9000
type: ClusterIP
clusterIP: None
- name: replica-service-template
generateName: service-{chi}-{shard}-{replica}
spec:
ports:
- name: http
port: 8123
- name: tcp
port: 9000
- name: interserver
port: 9009
type: ClusterIP
- name: replica-service-template-clickhouse-keeper
generateName: service-{chi}-{shard}-{replica}
spec:
ports:
- name: http
port: 8123
- name: tcp
port: 9000
- name: interserver
port: 9009
- name: clickhouse-keeper
port: 9181
- name: raft
port: 9444
type: ClusterIP
{{- if and (.Values.clickhouse.persistence.enabled) (not .Values.clickhouse.persistence.existingClaim) }}
volumeClaimTemplates:
- name: data-volumeclaim-template
spec:
{{- if .Values.clickhouse.persistence.storageClass }}
storageClassName: {{ .Values.clickhouse.persistence.storageClass }}
{{- end }}
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.clickhouse.persistence.size | quote }}
{{- end }}
{{- end }}
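For anyone trying to adapt this, the values the template above references are shaped roughly like this (an illustrative and incomplete subset: the key names come from the template, the values are placeholders, and the real chart also defines profiles, settings, securityContext, image, backup and a few other keys the template assumes are present):

clickhouse:
  enabled: true
  cluster: posthog
  user: admin
  allowedNetworkIps:
    - 10.0.0.0/8
  serviceType: ClusterIP
  layout:
    shardsCount: 1
    replicasCount: 3
  externalZookeeper:
    servers:
      - host: service-posthog-clickhouse-0-0
        port: 9181
      - host: service-posthog-clickhouse-0-1
        port: 9181
      - host: service-posthog-clickhouse-0-2
        port: 9181
  persistence:
    enabled: true
    size: 20Gi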
@DavidSpek That looks interesting. Is it part of a larger helm chart you could share? Thanks!
@spoofedpacket Along the way we discovered some issues with the original solution I posted above, but I have now edited it. It's part of our helm chart for deploying PostHog using Plural. The current setup is working well and we are using it to run our PostHog production deployment. The template file for the ClickHouse instance can be found here, with the values being set here.
Some interesting things to note: to allow the number of shards to be increased for scaling, we are using dedicated pod and service templates for the ClickHouse Keeper replicas (see here and here), and the ClickHouse Keeper configuration file, along with those special templates, is set for the 3 replicas of the first shard here.
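To make that more concrete, the layout values end up shaped roughly like this (a hand-written sketch using the template names from the YAML above; see the linked values file for the authoritative version and the exact keys the operator accepts at the replica level):

layout:
  shardsCount: 2
  replicasCount: 3
  shards:
    - name: "0"
      replicas:
        - name: "0"
          templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
        - name: "1"
          templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
        - name: "2"
          templates:
            podTemplate: pod-template-clickhouse-keeper
            replicaServiceTemplate: replica-service-template-clickhouse-keeper
    # replicas of the remaining shards fall back to the default pod-template / replica-service-template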
@spoofedpacket @alexvanolst I've since made a small helm chart that allows you to easily deploy clickhouse using the operator with support for using the built-in clickhouse keeper. See https://github.com/pluralsh/module-library/tree/main/helm/clickhouse
@DavidSpek according to https://github.com/pluralsh/module-library/blob/main/helm/clickhouse/templates/clickhouse_instance.yaml#L282,
is your embedded clickhouse-keeper installation stable now? Did you compare performance for zookeeper vs clickhouse-keeper?
@Slach It seems to be working correctly and we haven't had any issues with it. However, I'm not a clickhouse expert, nor have I had the time to compare performance with Zookeeper. I'm assuming upstream tests and performance evaluations will still be valid for this configuration. I do welcome any help with testing this setup from more experienced clickhouse users. Last night I actually thought about how this could be handled by the operator, so when I have the time I might look into implementing this.
@DavidSpek roger that. Anyway, thank you for your efforts! ;-)
@Slach Would you be open to a contribution that applies a similar configuration in the operator? There it would be possible to implement some more advanced logic for distributing the keeper instances across the nodes.
Yes, we are open to contributions. `kind: ClickHouseKeeper` should be implemented as a separate CRD. Each instance of clickhouse-keeper should be deployed as a separate StatefulSet (to allow it to be managed separately) and could be linked inside the `kind: ClickHouseInstallation` and `kind: ClickHouseInstallationTemplate` CRDs.
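Purely to illustrate that shape (completely hypothetical: neither the CRD nor any of its fields exist in the operator today), it could look something like:

# hypothetical sketch of the proposed CRD
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseKeeper"
metadata:
  name: keeper
spec:
  replicas: 3   # hypothetical field; one StatefulSet per keeper instance
---
# a ClickHouseInstallation could then reference it through the existing zookeeper section
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: demo
spec:
  configuration:
    zookeeper:
      nodes:
        - host: keeper   # hypothetical client service created for the ClickHouseKeeper above
          port: 9181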
@Slach Sorry for not getting back to you quicker about this. While I see the value of having a separate CRD for ClickHouseKeeper, would you also be open to adding the functionality for running an embedded ClickHouse Keeper within a ClickHouse installation? For me the main benefit of ClickHouse Keeper is less maintenance and resource overhead since it doesn't require dedicated pods.
@DavidSpek An embedded clickhouse-keeper will restrict your scalability, because you need fewer clickhouse-keeper instances than clickhouse-server instances. For example, a typical clickhouse-server installation has two replicas per shard (the replicas could be in different DCs), while a usual clickhouse-keeper installation has 1 or 3 instances so that quorum is reached quickly: an odd number of keeper instances per datacenter, to avoid a split-brain situation.
@Slach I am aware of that, and also of the fact that in almost all cases you wouldn't want to scale Raft past 5 or 7 nodes due to performance issues. However, the operator could implement some smart logic for how many clickhouse-keeper nodes to run and how those are spread across the regular clickhouse nodes. As a quick example:
- If nodes < 3, run 1 keeper
- If nodes >= 3 and <= 5, run 3 keepers
And then some more comparisons can be done in terms of whether the keepers can be split nicely across shards or replicas. It could even take the affinity rules into account so the nodes running keeper end up in separate AZs (see the sketch below).
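As a rough illustration of the AZ part (illustrative only: clickhouse.altinity.com/chi is one of the labels the operator puts on the pods it manages, and the value here is just an example CHI name), the keeper pod template could carry something like:

podTemplates:
  - name: pod-template-clickhouse-keeper
    spec:
      # ... containers as in the keeper pod template shown earlier ...
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: topology.kubernetes.io/zone
                labelSelector:
                  matchLabels:
                    clickhouse.altinity.com/chi: posthog-clickhouse   # example CHI name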
Currently the docs still point to Zookeeper and also say Zookeeper is required, while ClickHouse 22.3 says ClickHouse Keeper is production ready! I do see some k8s files for ClickHouse Keeper, but those sort of imply running a ClickHouse Keeper cluster separately, just like ZK.
I am looking to run a ClickHouse cluster with ClickHouse Keeper embedded into some of the ClickHouse nodes in the cluster. It would be great to have an example of this, either using k8s directly or through the clickhouse-operator.