Closed vigodeltoro closed 1 year ago
I think it's probably that with #1051 , the operator doesn't have permissions to create the secret, so it doesn't actually reconfigure clickhouse. I manually fixed the RBAC permissions in my environment so the operator can create secrets, and it's successfully configuring the nodes with the internode secret:
...
spec:
containers:
- env:
- name: CLICKHOUSE_INTERNODE_CLUSTER_SECRET
valueFrom:
secretKeyRef:
key: secret
name: hubble-timescape-hubble-data-auto-secret
...
(⎈|kind-kind:default) ~/p/w/kind-cilium-ce-helm-install ❯❯❯ k exec -it -n hubble-timescape chi-hubble-timescape-hubble-data-0-0-0 bash
bash-5.1# echo $CLICKHOUSE_INTERNODE_CLUSTER_SECRET
gNUPoNMpIjE
bash-5.1# grep -R CLICKHOUSE_INTERNODE_CLUSTER_SECRET /etc/clickhouse-server/
/etc/clickhouse-server/config.d/chop-generated-remote_servers.xml: <secret from_env="CLICKHOUSE_INTERNODE_CLUSTER_SECRET" />
/etc/clickhouse-server/config.d/..2022_11_21_17_04_29.1261097742/chop-generated-remote_servers.xml: <secret from_env="CLICKHOUSE_INTERNODE_CLUSTER_SECRET" />
/etc/clickhouse-server/config.d/..data/chop-generated-remote_servers.xml: <secret from_env="CLICKHOUSE_INTERNODE_CLUSTER_SECRET" />
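For anyone hitting the same wall: the RBAC fix boiled down to letting the operator's ClusterRole write Secrets, not just read them. A minimal sketch (the exact rule layout in your install may differ; the added verbs are the point):

```yaml
# Hypothetical excerpt of the clickhouse-operator ClusterRole.
# get/list alone is not enough for `secret: auto` -- the operator
# also has to create (and later update) the generated Secret.
rules:
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - get
      - list
      - create
      - update
```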
Hi chancez, thanks a lot. I will have a look at that and try it out.
But I would expect it to work if I create the secret manually and use the secret reference function mentioned in
@vigodeltoro I'm not sure, because the operator is likely programmed to look up the secret when the cluster.secret
options are present, and if the lookup fails on RBAC, it gets wedged there retrying the lookup forever.
Hi chancez, ah, that could be it, thanks. I will look into that and come back to you :)
Hi chancez, it took me some time to test because our Kubernetes admin was on holiday. The operator role does have the ability to get and list secrets:
kubectl get clusterrole clickhouse-operator-clickhouse -o yaml
--- snap ---
resources:
- secrets
verbs:
- get
- list
--- snap ---
I tried it again with a cluster deployment using a plaintext secret, so I would guess no Kubernetes Secret lookup should even be necessary, but no success:
Config of cluster:
--- snap----
clusters:
- name: "deployment-pv"
secret:
value: "plaintext"
layout:
shardsCount: 2
replicasCount: 2
--- snap----
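For reference, the `secret` field in the cluster spec supports three variants. This sketch is based on my reading of the clickhouse-operator v0.20 docs, so treat the exact field names as an assumption to verify against your CRD version:

```yaml
clusters:
  - name: "deployment-pv"
    secret:
      # Variant 1: operator generates a random secret itself
      auto: "true"
      # Variant 2: inline plaintext value (what is used above)
      # value: "plaintext"
      # Variant 3: reference an existing Kubernetes Secret
      # valueFrom:
      #   secretKeyRef:
      #     name: my-internode-secret   # hypothetical Secret name
      #     key: secret
    layout:
      shardsCount: 2
      replicasCount: 2
```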
Config of the pod:
<remote_servers>
<!-- User-specified clusters -->
<deployment-pv>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>chi-pv-log-deployment-pv-0-0</host>
<port>9000</port>
</replica>
<replica>
<host>chi-pv-log-deployment-pv-0-1</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>chi-pv-log-deployment-pv-1-0</host>
<port>9000</port>
</replica>
<replica>
<host>chi-pv-log-deployment-pv-1-1</host>
<port>9000</port>
</replica>
</shard>
</deployment-pv>
This is what I would expect:
<remote_servers>
<!-- User-specified clusters -->
<deployment-pv>
<secret>plaintext</secret>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>chi-pv-log-deployment-pv-0-0</host>
<port>9000</port>
</replica>
<replica>
<host>chi-pv-log-deployment-pv-0-1</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>chi-pv-log-deployment-pv-1-0</host>
<port>9000</port>
</replica>
<replica>
<host>chi-pv-log-deployment-pv-1-1</host>
<port>9000</port>
</replica>
</shard>
</deployment-pv>
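For comparison, when `secret: auto` (or a secretKeyRef) is used, the operator does not inline the value; it renders an env-var reference instead, as seen earlier in the thread. A sketch of that rendered config, abbreviated to one shard:

```xml
<remote_servers>
    <!-- User-specified clusters -->
    <deployment-pv>
        <!-- With auto/secretKeyRef, the value comes from the pod's
             environment variable rather than appearing inline. -->
        <secret from_env="CLICKHOUSE_INTERNODE_CLUSTER_SECRET" />
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chi-pv-log-deployment-pv-0-0</host>
                <port>9000</port>
            </replica>
        </shard>
    </deployment-pv>
</remote_servers>
```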
Do you have any idea? It would be very helpful if we could fix that. At the moment I'm injecting the secret via a ConfigMap and restarting the pods, but that is very messy :(
Best, and thanks :)
I found the issue. In my first tests I deployed the new clickhouse-operator version, but I didn't look at the newly running pod. Because it was my last idea, today I did, and for reasons I don't know it had respawned an old version (0.19.3, maybe from a cluster cache).
Now it's working. Thanks for your help, and sorry for wasting your time :/
Hi, I followed your examples to roll out a secret for inter-server communication, but it isn't rolled out:
Clickhouse Operator v0.20
That's my manifest:
If I log in to a pod I see:
Am I configuring something wrong? None of the variants (plaintext, secret, auto) worked for me. Can anybody help me out?
Thanks a lot, best regards