Altinity / clickhouse-operator

Altinity Kubernetes Operator for ClickHouse creates, configures and manages ClickHouse clusters running on Kubernetes
https://altinity.com
Apache License 2.0

Cluster service disappeared after yaml config change #1210

Closed shenzhu closed 1 year ago

shenzhu commented 1 year ago

Hey team, we are working on a PoC to run ClickHouse clusters with this operator; our Kubernetes cluster is hosted on AWS EKS.

The following YAML config was used for the ClickHouse cluster:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "small"
spec:
  defaults:
    templates:
      podTemplate: clickhouse
      dataVolumeClaimTemplate: data-volume-clickhouse
      logVolumeClaimTemplate: data-volume-clickhouse
  configuration:
    files:
      config.d/override_config.xml: |-
        <clickhouse>
          <logger>
            <level>information</level>
          </logger>
        </clickhouse>
    clusters:
      - name: "small-cluster"
        layout:
          shardsCount: 2
          replicasCount: 2
    zookeeper:
      nodes:
        - host: zookeeper.ch-zookeeper
          port: 2181
    users:
      default/networks/ip: "::/0"
      test/password: password
      test/networks/ip: "::/0"
      test/grants/query:
        - "GRANT SHOW ON *.*"
        - "GRANT CREATE ON *.* WITH GRANT OPTION"
        - "GRANT SELECT ON system.*"
-    admin/password: admin
+    admin/password: password
      admin/networks/ip: "::/0"
      admin/grants/query:
        - "GRANT SHOW ON *.*"
        - "GRANT CREATE ON *.* WITH GRANT OPTION"
        - "GRANT SELECT ON system.*"
  templates:
    podTemplates:
      - name: clickhouse
        spec:
          securityContext:
            runAsUser: 101
            runAsGroup: 101
            fsGroup: 101
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:23.3.8
              ports:
                - name: http
                  containerPort: 8123
                - name: client
                  containerPort: 9000
                - name: interserver
                  containerPort: 9009
              volumeMounts:
                - name: data-volume-clickhouse
                  mountPath: /var/lib/clickhouse
                - name: data-volume-clickhouse
                  mountPath: /var/log/clickhouse-server
          nodeSelector:
            node.kubernetes.io/instance-type: m5.xlarge
    volumeClaimTemplates:
      - name: data-volume-clickhouse
        reclaimPolicy: Retain
        spec:
          storageClassName: ebs-csi
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi

After a small change to the password of the admin account (see the diff markers above), the Service for this ClickHouse cluster disappeared:
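For reference, the operator does not keep plaintext passwords: the ActionPlan in the log below shows the new admin password converted into `admin/password_sha256_hex`. A quick way to confirm the digest in the log matches the password that was set (assuming a standard coreutils environment):

```shell
# Compute the SHA-256 hex digest of the new admin password.
# This should match the admin/password_sha256_hex value the
# operator writes into the ActionPlan.
echo -n "password" | sha256sum
# → 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8  -
```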

➜ kubectl get all -n clickhouse-operator
NAME                                       READY   STATUS    RESTARTS   AGE
pod/chi-small-small-cluster-0-0-0          2/2     Running   0          44h
pod/chi-small-small-cluster-0-1-0          2/2     Running   0          44h
pod/chi-small-small-cluster-1-0-0          2/2     Running   0          44h
pod/chi-small-small-cluster-1-1-0          2/2     Running   0          44h
pod/clickhouse-operator-7b7fb5dc7b-qqfrv   2/2     Running   0          12d

NAME                                  TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
service/chi-small-small-cluster-0-0   ClusterIP   None           <none>        9000/TCP,8123/TCP,9009/TCP   3d3h
service/chi-small-small-cluster-0-1   ClusterIP   None           <none>        9000/TCP,8123/TCP,9009/TCP   3d3h
service/chi-small-small-cluster-1-0   ClusterIP   None           <none>        9000/TCP,8123/TCP,9009/TCP   3d3h
service/chi-small-small-cluster-1-1   ClusterIP   None           <none>        9000/TCP,8123/TCP,9009/TCP   3d3h
service/clickhouse-operator-metrics   ClusterIP   192.168.0.78   <none>        8888/TCP                     13d

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/clickhouse-operator   1/1     1            1           13d

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/clickhouse-operator-7b7fb5dc7b   1         1         1       13d

NAME                                           READY   AGE
statefulset.apps/chi-small-small-cluster-0-0   1/1     3d3h
statefulset.apps/chi-small-small-cluster-0-1   1/1     3d3h
statefulset.apps/chi-small-small-cluster-1-0   1/1     3d3h
statefulset.apps/chi-small-small-cluster-1-1   1/1     3d3h

I also tried fetching logs from the operator pod with the command kubectl logs clickhouse-operator-7b7fb5dc7b-qqfrv -n clickhouse-operator --since=30m:

I0731 00:27:38.972777       1 controller.go:565] ENQUEUE new ReconcileCHI cmd=update for clickhouse-operator/small
I0731 00:27:38.984358       1 worker.go:379] worker.go:379:updateCHI():start:clickhouse-operator/small
I0731 00:27:38.996889       1 worker.go:421] Operator IPs. Previous: 10.221.187.195 Cur: 10.221.187.195
I0731 00:27:38.996928       1 worker.go:430] Operator IPs are the same. It is restart on the same IP
I0731 00:27:38.998391       1 worker.go:438] Operator is not just started. May not be clean restart
I0731 00:27:39.025882       1 worker-reconciler.go:51] worker-reconciler.go:51:reconcileCHI():start:clickhouse-operator/small
I0731 00:27:39.025943       1 worker-reconciler.go:55] reconcileCHI():clickhouse-operator/small:has ancestor, use it as a base for reconcile
I0731 00:27:39.076150       1 worker.go:289] clickhouse-operator/small/a6b301b0-0e38-45e5-90a2-1a151b0243ba:IPs of the CHI clickhouse-operator/small: [10.221.139.87 10.221.141.101 10.221.152.36 10.221.134.147]
I0731 00:27:39.114031       1 worker.go:289] clickhouse-operator/small/53db23f8-4240-461b-9375-4bd840a39118:IPs of the CHI clickhouse-operator/small: [10.221.139.87 10.221.141.101 10.221.152.36 10.221.134.147]
I0731 00:27:39.137849       1 worker.go:519] ActionPlan start---------------------------------------------:
AP item start -------------------------
removed spec items: 4
ap item path [0]:'.Configuration.Users["default/networks/ip"].vector[8]'
ap item value[0]:'"10.221.146.0"'
ap item path [1]:'.Configuration.Users["default/networks/ip"].vector[9]'
ap item value[1]:'"10.221.146.165"'
ap item path [2]:'.Configuration.Users["default/networks/ip"].vector[10]'
ap item value[2]:'"::/0"'
ap item path [3]:'.Configuration.Users["default/networks/ip"].vector[7]'
ap item value[3]:'"10.221.168.93"'
AP item end -------------------------AP item start -------------------------
modified spec items: 3
ap item path [0]:'.Configuration.Users["admin/password_sha256_hex"].scalar'
ap item value[0]:'"5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8"'
ap item path [1]:'.TaskID'
ap item value[1]:'"04a5fa26-4934-4bf9-a93f-b1b0364db208"'
ap item path [2]:'.Configuration.Users["default/networks/ip"].vector[6]'
ap item value[2]:'"::/0"'
AP item end -------------------------
ActionPlan end---------------------------------------------
I0731 00:27:39.137958       1 worker-reconciler.go:69] reconcileCHI():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:ActionPlan has actions - continue reconcile
I0731 00:27:39.481431       1 worker.go:597] markReconcileStart():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:reconcile started, task id: 04a5fa26-4934-4bf9-a93f-b1b0364db208
I0731 00:27:41.261306       1 worker.go:742] FOUND host: ns:clickhouse-operator|chi:small|clu:small-cluster|sha:0|rep:0|host:0-0
I0731 00:27:41.261336       1 worker.go:742] FOUND host: ns:clickhouse-operator|chi:small|clu:small-cluster|sha:0|rep:1|host:0-1
I0731 00:27:41.261350       1 worker.go:742] FOUND host: ns:clickhouse-operator|chi:small|clu:small-cluster|sha:1|rep:0|host:1-0
I0731 00:27:41.261364       1 worker.go:742] FOUND host: ns:clickhouse-operator|chi:small|clu:small-cluster|sha:1|rep:1|host:1-1
I0731 00:27:41.261424       1 worker.go:770] RemoteServersGeneratorOptions: exclude hosts: [], attributes: status: , add: true, remove: false, modify: false, found: false, excluded host(s): 
I0731 00:27:41.451998       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-common-configd
I0731 00:27:42.657332       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-common-usersd
I0731 00:27:43.688741       1 creator.go:113] CreateServiceCluster():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/cluster-small-small-cluster
I0731 00:27:43.702621       1 worker-reconciler.go:566] PDB updated clickhouse-operator/small-small-cluster
I0731 00:27:43.702716       1 creator.go:136] CreateServiceShard():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/shard-small-small-cluster-0
I0731 00:27:43.702772       1 worker-reconciler.go:498] reconcileHost():Reconcile Host 0-0 started
I0731 00:27:44.680842       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-0-0 use custom template: clickhouse
I0731 00:27:44.681342       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-0-0
W0731 00:27:44.681371       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:27:44.681449       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:27:44.681530       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:27:44.685384       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-0-0:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-0-0
I0731 00:27:44.685492       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 0-0
I0731 00:27:44.685555       1 worker.go:976] shouldExcludeHost():The same host would not be updated host 0 shard 0 cluster small-cluster
I0731 00:27:44.851510       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-deploy-confd-small-cluster-0-0
I0731 00:27:47.654714       1 cluster.go:84] Run query on: chi-small-small-cluster-0-0.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-0-0.clickhouse-operator.svc.cluster.local]
I0731 00:27:47.677452       1 worker-reconciler.go:292] reconcileHostStatefulSet():Reconcile host 0-0. ClickHouse version: 23.3.8.21
I0731 00:27:47.677622       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 0-0
I0731 00:27:47.677651       1 worker-reconciler.go:304] reconcileHostStatefulSet():Reconcile host 0-0. Reconcile StatefulSet
I0731 00:27:47.677752       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-0-0 use custom template: clickhouse
I0731 00:27:47.677868       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-0-0
W0731 00:27:47.677885       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:27:47.677895       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:27:47.677913       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:27:47.678999       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-0-0:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-0-0
I0731 00:27:49.452987       1 creator.go:160] CreateServiceHost():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/chi-small-small-cluster-0-0 for Set chi-small-small-cluster-0-0
I0731 00:27:49.653751       1 worker.go:1201] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update Service clickhouse-operator/chi-small-small-cluster-0-0
I0731 00:27:50.683030       1 worker.go:800] migrateTables():No need to add tables on host 0 to shard 0 in cluster small-cluster
I0731 00:27:50.683096       1 cluster.go:84] Run query on: chi-small-small-cluster-0-0.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-0-0.clickhouse-operator.svc.cluster.local]
I0731 00:27:50.691506       1 worker.go:885] includeHost():Include into cluster host 0 shard 0 cluster small-cluster
I0731 00:27:50.691593       1 worker.go:770] RemoteServersGeneratorOptions: exclude hosts: [], attributes: status: , add: true, remove: false, modify: false, found: false, excluded host(s): 
I0731 00:27:50.851412       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-common-configd
I0731 00:27:52.052760       1 worker-reconciler.go:537] reconcileHost():Reconcile Host 0-0 completed. ClickHouse version running: 23.3.8.21
I0731 00:27:53.093114       1 worker-reconciler.go:553] reconcileHost():ProgressHostsCompleted: 1 of 4
I0731 00:27:54.078428       1 creator.go:58] CreateServiceCHI():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/clickhouse-small
I0731 00:27:54.078773       1 worker.go:1164] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:reuse Port 8123 values
I0731 00:27:54.078802       1 worker.go:1164] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:reuse Port 9000 values
E0731 00:27:54.258470       1 worker.go:1207] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update Service clickhouse-operator/clickhouse-small failed with error Service "clickhouse-small" is invalid: spec.loadBalancerClass: Invalid value: "null": may not change once set
I0731 00:27:55.660814       1 deleter.go:327] clickhouse-operator/clickhouse-small:OK delete Service clickhouse-operator/clickhouse-small
E0731 00:27:55.858509       1 worker.go:1232] createService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Create Service clickhouse-operator/clickhouse-small failed with error Service "clickhouse-small" is invalid: spec.clusterIPs: Invalid value: []string{"192.168.25.164"}: failed to allocate IP 192.168.25.164: provided IP is already allocated
E0731 00:27:56.875412       1 worker-reconciler.go:661] reconcileService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:FAILED to reconcile Service: clickhouse-small CHI: small 
I0731 00:27:57.884052       1 worker-reconciler.go:498] reconcileHost():Reconcile Host 0-1 started
I0731 00:27:58.883456       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-0-1 use custom template: clickhouse
I0731 00:27:58.883619       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-0-1
W0731 00:27:58.883643       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:27:58.883741       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:27:58.883775       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:27:58.885180       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-0-1:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-0-1
I0731 00:27:58.885289       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 0-1
I0731 00:27:58.885326       1 worker.go:976] shouldExcludeHost():The same host would not be updated host 1 shard 0 cluster small-cluster
I0731 00:27:59.049373       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-deploy-confd-small-cluster-0-1
I0731 00:28:01.856959       1 cluster.go:84] Run query on: chi-small-small-cluster-0-1.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-0-1.clickhouse-operator.svc.cluster.local]
I0731 00:28:01.873235       1 worker-reconciler.go:292] reconcileHostStatefulSet():Reconcile host 0-1. ClickHouse version: 23.3.8.21
I0731 00:28:01.873421       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 0-1
I0731 00:28:01.873454       1 worker-reconciler.go:304] reconcileHostStatefulSet():Reconcile host 0-1. Reconcile StatefulSet
I0731 00:28:01.873587       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-0-1 use custom template: clickhouse
I0731 00:28:01.873741       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-0-1
W0731 00:28:01.873763       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:01.873780       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:01.875332       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:28:01.879590       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-0-1:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-0-1
I0731 00:28:03.656715       1 creator.go:160] CreateServiceHost():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/chi-small-small-cluster-0-1 for Set chi-small-small-cluster-0-1
I0731 00:28:03.862537       1 worker.go:1201] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update Service clickhouse-operator/chi-small-small-cluster-0-1
I0731 00:28:04.884155       1 worker.go:800] migrateTables():No need to add tables on host 1 to shard 0 in cluster small-cluster
I0731 00:28:04.884208       1 cluster.go:84] Run query on: chi-small-small-cluster-0-1.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-0-1.clickhouse-operator.svc.cluster.local]
I0731 00:28:04.893853       1 worker.go:885] includeHost():Include into cluster host 1 shard 0 cluster small-cluster
I0731 00:28:04.893881       1 worker.go:770] RemoteServersGeneratorOptions: exclude hosts: [], attributes: status: , add: true, remove: false, modify: false, found: false, excluded host(s): 
I0731 00:28:05.054595       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-common-configd
I0731 00:28:06.260788       1 worker-reconciler.go:537] reconcileHost():Reconcile Host 0-1 completed. ClickHouse version running: 23.3.8.21
I0731 00:28:07.277021       1 worker-reconciler.go:553] reconcileHost():ProgressHostsCompleted: 2 of 4
I0731 00:28:08.280860       1 creator.go:136] CreateServiceShard():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/shard-small-small-cluster-1
I0731 00:28:08.280934       1 worker-reconciler.go:498] reconcileHost():Reconcile Host 1-0 started
I0731 00:28:09.277733       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-1-0 use custom template: clickhouse
I0731 00:28:09.278134       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-1-0
W0731 00:28:09.278176       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:09.278195       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:09.278223       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:28:09.279835       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-1-0:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-1-0
I0731 00:28:09.280002       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 1-0
I0731 00:28:09.280036       1 worker.go:976] shouldExcludeHost():The same host would not be updated host 0 shard 1 cluster small-cluster
I0731 00:28:09.457052       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-deploy-confd-small-cluster-1-0
I0731 00:28:12.253413       1 cluster.go:84] Run query on: chi-small-small-cluster-1-0.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-1-0.clickhouse-operator.svc.cluster.local]
I0731 00:28:12.268758       1 worker-reconciler.go:292] reconcileHostStatefulSet():Reconcile host 1-0. ClickHouse version: 23.3.8.21
I0731 00:28:12.268866       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 1-0
I0731 00:28:12.268897       1 worker-reconciler.go:304] reconcileHostStatefulSet():Reconcile host 1-0. Reconcile StatefulSet
I0731 00:28:12.269040       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-1-0 use custom template: clickhouse
I0731 00:28:12.269203       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-1-0
W0731 00:28:12.269229       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:12.269244       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:12.269268       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:28:12.270487       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-1-0:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-1-0
I0731 00:28:14.054779       1 creator.go:160] CreateServiceHost():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/chi-small-small-cluster-1-0 for Set chi-small-small-cluster-1-0
I0731 00:28:14.256014       1 worker.go:1201] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update Service clickhouse-operator/chi-small-small-cluster-1-0
I0731 00:28:15.279187       1 worker.go:800] migrateTables():No need to add tables on host 0 to shard 1 in cluster small-cluster
I0731 00:28:15.279241       1 cluster.go:84] Run query on: chi-small-small-cluster-1-0.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-1-0.clickhouse-operator.svc.cluster.local]
I0731 00:28:15.284992       1 worker.go:885] includeHost():Include into cluster host 0 shard 1 cluster small-cluster
I0731 00:28:15.285016       1 worker.go:770] RemoteServersGeneratorOptions: exclude hosts: [], attributes: status: , add: true, remove: false, modify: false, found: false, excluded host(s): 
I0731 00:28:15.452334       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-common-configd
I0731 00:28:16.656793       1 worker-reconciler.go:537] reconcileHost():Reconcile Host 1-0 completed. ClickHouse version running: 23.3.8.21
I0731 00:28:17.683348       1 worker-reconciler.go:553] reconcileHost():ProgressHostsCompleted: 3 of 4
I0731 00:28:18.685551       1 worker-reconciler.go:498] reconcileHost():Reconcile Host 1-1 started
I0731 00:28:19.688692       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-1-1 use custom template: clickhouse
I0731 00:28:19.688844       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-1-1
W0731 00:28:19.688857       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:19.688865       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:19.688881       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:28:19.690047       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-1-1:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-1-1
I0731 00:28:19.690174       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 1-1
I0731 00:28:19.690286       1 worker.go:976] shouldExcludeHost():The same host would not be updated host 1 shard 1 cluster small-cluster
I0731 00:28:19.850760       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-deploy-confd-small-cluster-1-1
I0731 00:28:22.654336       1 cluster.go:84] Run query on: chi-small-small-cluster-1-1.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-1-1.clickhouse-operator.svc.cluster.local]
I0731 00:28:22.666019       1 worker-reconciler.go:292] reconcileHostStatefulSet():Reconcile host 1-1. ClickHouse version: 23.3.8.21
I0731 00:28:22.666131       1 worker.go:134] shouldForceRestartHost():Force restart is not required. Host: 1-1
I0731 00:28:22.666192       1 worker-reconciler.go:304] reconcileHostStatefulSet():Reconcile host 1-1. Reconcile StatefulSet
I0731 00:28:22.666487       1 creator.go:589] getPodTemplate():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:statefulSet chi-small-small-cluster-1-1 use custom template: clickhouse
I0731 00:28:22.666668       1 creator.go:575] setupLogContainer():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add log container for statefulSet chi-small-small-cluster-1-1
W0731 00:28:22.666689       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:22.666711       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse volumeMount.Name:data-volume-clickhouse already used
W0731 00:28:22.666732       1 creator.go:994] containerAppendVolumeMount():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:container.Name:clickhouse-log volumeMount.Name:data-volume-clickhouse already used
I0731 00:28:22.667709       1 worker.go:1277] getStatefulSetStatus():clickhouse-operator/chi-small-small-cluster-1-1:cur and new StatefulSets ARE EQUAL based on labels. No StatefulSet reconcile is required for: clickhouse-operator/chi-small-small-cluster-1-1
I0731 00:28:24.452173       1 creator.go:160] CreateServiceHost():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:clickhouse-operator/chi-small-small-cluster-1-1 for Set chi-small-small-cluster-1-1
I0731 00:28:24.651974       1 worker.go:1201] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update Service clickhouse-operator/chi-small-small-cluster-1-1
I0731 00:28:25.688894       1 worker.go:800] migrateTables():No need to add tables on host 1 to shard 1 in cluster small-cluster
I0731 00:28:25.688951       1 cluster.go:84] Run query on: chi-small-small-cluster-1-1.clickhouse-operator.svc.cluster.local of [chi-small-small-cluster-1-1.clickhouse-operator.svc.cluster.local]
I0731 00:28:25.697995       1 worker.go:885] includeHost():Include into cluster host 1 shard 1 cluster small-cluster
I0731 00:28:25.698015       1 worker.go:770] RemoteServersGeneratorOptions: exclude hosts: [], attributes: status: , add: true, remove: false, modify: false, found: false, excluded host(s): 
I0731 00:28:25.851927       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-common-configd
I0731 00:28:27.051398       1 worker-reconciler.go:537] reconcileHost():Reconcile Host 1-1 completed. ClickHouse version running: 23.3.8.21
I0731 00:28:28.079246       1 worker-reconciler.go:553] reconcileHost():ProgressHostsCompleted: 4 of 4
I0731 00:28:29.252453       1 worker.go:1089] updateConfigMap():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update ConfigMap clickhouse-operator/chi-small-common-configd
I0731 00:28:30.283238       1 worker-reconciler.go:103] reconcileCHI():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:remove items scheduled for deletion
I0731 00:28:31.281336       1 worker-deleter.go:37] clean():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Failed to reconcile objects:
Service: clickhouse-operator/clickhouse-small
I0731 00:28:31.281392       1 worker-deleter.go:38] clean():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Reconciled objects:
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-1-1
ConfigMap: clickhouse-operator/chi-small-common-configd
ConfigMap: clickhouse-operator/chi-small-common-usersd
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-0-0
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-0-1
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-1-0
PDB: clickhouse-operator/small-small-cluster
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-0-0-0
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-0-1-0
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-1-0-0
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-1-1-0
StatefulSet: clickhouse-operator/chi-small-small-cluster-0-0
StatefulSet: clickhouse-operator/chi-small-small-cluster-0-1
StatefulSet: clickhouse-operator/chi-small-small-cluster-1-0
StatefulSet: clickhouse-operator/chi-small-small-cluster-1-1
Service: clickhouse-operator/chi-small-small-cluster-0-0
Service: clickhouse-operator/chi-small-small-cluster-0-1
Service: clickhouse-operator/chi-small-small-cluster-1-0
Service: clickhouse-operator/chi-small-small-cluster-1-1
I0731 00:28:32.060765       1 worker-deleter.go:41] clean():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Existing objects:
StatefulSet: clickhouse-operator/chi-small-small-cluster-1-1
StatefulSet: clickhouse-operator/chi-small-small-cluster-0-0
StatefulSet: clickhouse-operator/chi-small-small-cluster-0-1
StatefulSet: clickhouse-operator/chi-small-small-cluster-1-0
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-0-0
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-0-1
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-1-0
ConfigMap: clickhouse-operator/chi-small-deploy-confd-small-cluster-1-1
ConfigMap: clickhouse-operator/chi-small-common-configd
ConfigMap: clickhouse-operator/chi-small-common-usersd
Service: clickhouse-operator/chi-small-small-cluster-0-0
Service: clickhouse-operator/chi-small-small-cluster-0-1
Service: clickhouse-operator/chi-small-small-cluster-1-0
Service: clickhouse-operator/chi-small-small-cluster-1-1
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-0-0-0
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-0-1-0
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-1-0-0
PVC: clickhouse-operator/data-volume-clickhouse-chi-small-small-cluster-1-1-0
PDB: clickhouse-operator/small-small-cluster
I0731 00:28:32.060884       1 worker-deleter.go:43] clean():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Non-reconciled objects:
I0731 00:28:32.060917       1 worker-deleter.go:53] clean():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:remove items scheduled for deletion
I0731 00:28:33.084190       1 worker-deleter.go:64] worker-deleter.go:64:dropReplicas():start:clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:drop replicas based on AP
I0731 00:28:33.084360       1 worker-deleter.go:81] worker-deleter.go:81:dropReplicas():end:clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:processed replicas: 0
I0731 00:28:33.084388       1 worker.go:573] includeStopped():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:add CHI to monitoring
I0731 00:28:34.080830       1 controller.go:609] OK update watch (clickhouse-operator/small): {"namespace":"clickhouse-operator","name":"small","clusters":[{"name":"small-cluster","hosts":[{"name":"0-0","hostname":"chi-small-small-cluster-0-0.clickhouse-operator.svc.cluster.local","tcpPort":9000,"httpPort":8123},{"name":"0-1","hostname":"chi-small-small-cluster-0-1.clickhouse-operator.svc.cluster.local","tcpPort":9000,"httpPort":8123},{"name":"1-0","hostname":"chi-small-small-cluster-1-0.clickhouse-operator.svc.cluster.local","tcpPort":9000,"httpPort":8123},{"name":"1-1","hostname":"chi-small-small-cluster-1-1.clickhouse-operator.svc.cluster.local","tcpPort":9000,"httpPort":8123}]}]}
I0731 00:28:34.086936       1 worker.go:540] clickhouse-operator/small:all IP addresses are in place
I0731 00:28:34.849395       1 worker.go:611] clickhouse-operator/small/180646f1-c931-4e3b-a3f3-65139d017c63:IPs of the CHI-2 [10.221.139.87 10.221.141.101 10.221.152.36 10.221.134.147]
I0731 00:28:34.858428       1 worker.go:615] clickhouse-operator/small/17213228-21ab-419f-a507-521ae8b500ff:Update users IPS-2
I0731 00:28:35.728260       1 worker.go:636] finalizeReconcileAndMarkCompleted():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:reconcile completed successfully, task id: 04a5fa26-4934-4bf9-a93f-b1b0364db208
I0731 00:28:36.677452       1 worker-reconciler.go:111] worker-reconciler.go:52:reconcileCHI():end:clickhouse-operator/small
I0731 00:28:36.677742       1 worker.go:414] worker.go:380:updateCHI():end:clickhouse-operator/small
Slach commented 1 year ago

Do you mean the service clickhouse-{chi-name}-{cluster-name} wasn't created? I see in the logs:

E0731 00:27:54.258470       1 worker.go:1207] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update Service clickhouse-operator/clickhouse-small 
failed with error Service "clickhouse-small" is invalid: spec.loadBalancerClass: Invalid value: "null": may not change once set

Which clickhouse-operator version do you use? Did you sync the CRDs? Could you share the output of:

kubectl get deploy -l app=clickhouse-operator -o yaml
kubectl get crd clickhouseinstallations.clickhouse.altinity.com -o jsonpath='{.metadata.labels}'

Your password changes are not related to any Kubernetes services. The clickhouse-operator deployment uses the clickhouse_operator user by default to interact with clickhouse-server pods.

Also, regarding default/networks/ip: "::/0": it is better not to expose the passwordless default user to every IP.
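For example, a more restrictive users section could look like this (an illustrative sketch, not taken from this issue; the CIDR and the hash placeholder are assumptions you would replace with your own values):

```yaml
users:
  # keep the passwordless default user reachable only from inside the cluster
  default/networks/ip:
    - "127.0.0.1"
    - "10.0.0.0/8"        # placeholder: replace with your pod/VPC CIDR
  # prefer a hashed password over plain text for other users
  test/password_sha256_hex: "<sha256 hex of the password>"
  test/networks/ip: "::/0"
```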

shenzhu commented 1 year ago

Hey @Slach , thanks for your reply!

Yeah, when we first applied clickhouse-cluster.yaml, the service for the cluster (clickhouse-{chi-name}-{cluster-name}) was created successfully. But after I made the password change and ran kubectl apply -f clickhouse-cluster.yaml -n clickhouse-operator, the service for the cluster disappeared.

And I can find some error logs related to the service, though I'm not sure whether they are related to our EKS setup:

E0731 00:27:54.258470       1 worker.go:1207] updateService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Update Service clickhouse-operator/clickhouse-small failed with error Service "clickhouse-small" is invalid: spec.loadBalancerClass: Invalid value: "null": may not change once set
I0731 00:27:55.660814       1 deleter.go:327] clickhouse-operator/clickhouse-small:OK delete Service clickhouse-operator/clickhouse-small
E0731 00:27:55.858509       1 worker.go:1232] createService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:Create Service clickhouse-operator/clickhouse-small failed with error Service "clickhouse-small" is invalid: spec.clusterIPs: Invalid value: []string{"192.168.25.164"}: failed to allocate IP 192.168.25.164: provided IP is already allocated
E0731 00:27:56.875412       1 worker-reconciler.go:661] reconcileService():clickhouse-operator/small/04a5fa26-4934-4bf9-a93f-b1b0364db208:FAILED to reconcile Service: clickhouse-small CHI: small 

Here are the commands and their output:

  1. kubectl get deploy -l app=clickhouse-operator -o yaml -n clickhouse-operator

    apiVersion: v1
    items:
    - apiVersion: apps/v1
    kind: Deployment
    metadata:
    annotations:
      deployment.kubernetes.io/revision: "1"
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"clickhouse-operator","clickhouse.altinity.com/chop":"0.21.2"},"name":"clickhouse-operator","namespace":"clickhouse-operator"},"spec":{"replicas":1,"selector":{"matchLabels":{"app":"clickhouse-operator"}},"template":{"metadata":{"annotations":{"prometheus.io/port":"8888","prometheus.io/scrape":"true"},"labels":{"app":"clickhouse-operator"}},"spec":{"containers":[{"env":[{"name":"OPERATOR_POD_NODE_NAME","valueFrom":{"fieldRef":{"fieldPath":"spec.nodeName"}}},{"name":"OPERATOR_POD_NAME","valueFrom":{"fieldRef":{"fieldPath":"metadata.name"}}},{"name":"OPERATOR_POD_NAMESPACE","valueFrom":{"fieldRef":{"fieldPath":"metadata.namespace"}}},{"name":"OPERATOR_POD_IP","valueFrom":{"fieldRef":{"fieldPath":"status.podIP"}}},{"name":"OPERATOR_POD_SERVICE_ACCOUNT","valueFrom":{"fieldRef":{"fieldPath":"spec.serviceAccountName"}}},{"name":"OPERATOR_CONTAINER_CPU_REQUEST","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"requests.cpu"}}},{"name":"OPERATOR_CONTAINER_CPU_LIMIT","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"limits.cpu"}}},{"name":"OPERATOR_CONTAINER_MEM_REQUEST","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"requests.memory"}}},{"name":"OPERATOR_CONTAINER_MEM_LIMIT","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"limits.memory"}}}],"image":"altinity/clickhouse-operator:latest","imagePullPolicy":null,"name":"clickhouse-operator","volumeMounts":[{"mountPath":"/etc/clickhouse-operator","name":"etc-clickhouse-operator-folder"},{"mountPath":"/etc/clickhouse-operator/conf.d","name":"etc-clickhouse-operator-confd-folder"},{"mountPath":"/etc/clickhouse-operator/config.d","name":"etc-clickhouse-operator-configd-folder"},{"mountPath":"/etc/clickhouse-operator/templates.d","name":"etc-clickhouse-operator-templatesd-folder"},{"mountPath":"/etc/clic
khouse-operator/users.d","name":"etc-clickhouse-operator-usersd-folder"}]},{"env":[{"name":"OPERATOR_POD_NODE_NAME","valueFrom":{"fieldRef":{"fieldPath":"spec.nodeName"}}},{"name":"OPERATOR_POD_NAME","valueFrom":{"fieldRef":{"fieldPath":"metadata.name"}}},{"name":"OPERATOR_POD_NAMESPACE","valueFrom":{"fieldRef":{"fieldPath":"metadata.namespace"}}},{"name":"OPERATOR_POD_IP","valueFrom":{"fieldRef":{"fieldPath":"status.podIP"}}},{"name":"OPERATOR_POD_SERVICE_ACCOUNT","valueFrom":{"fieldRef":{"fieldPath":"spec.serviceAccountName"}}},{"name":"OPERATOR_CONTAINER_CPU_REQUEST","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"requests.cpu"}}},{"name":"OPERATOR_CONTAINER_CPU_LIMIT","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"limits.cpu"}}},{"name":"OPERATOR_CONTAINER_MEM_REQUEST","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"requests.memory"}}},{"name":"OPERATOR_CONTAINER_MEM_LIMIT","valueFrom":{"resourceFieldRef":{"containerName":"clickhouse-operator","resource":"limits.memory"}}}],"image":"altinity/metrics-exporter:latest","imagePullPolicy":null,"name":"metrics-exporter","ports":[{"containerPort":8888,"name":"metrics"}],"volumeMounts":[{"mountPath":"/etc/clickhouse-operator","name":"etc-clickhouse-operator-folder"},{"mountPath":"/etc/clickhouse-operator/conf.d","name":"etc-clickhouse-operator-confd-folder"},{"mountPath":"/etc/clickhouse-operator/config.d","name":"etc-clickhouse-operator-configd-folder"},{"mountPath":"/etc/clickhouse-operator/templates.d","name":"etc-clickhouse-operator-templatesd-folder"},{"mountPath":"/etc/clickhouse-operator/users.d","name":"etc-clickhouse-operator-usersd-folder"}]}],"serviceAccountName":"clickhouse-operator","volumes":[{"configMap":{"name":"etc-clickhouse-operator-files"},"name":"etc-clickhouse-operator-folder"},{"configMap":{"name":"etc-clickhouse-operator-confd-files"},"name":"etc-clickhouse-operator-confd-folder"},{"configMap
":{"name":"etc-clickhouse-operator-configd-files"},"name":"etc-clickhouse-operator-configd-folder"},{"configMap":{"name":"etc-clickhouse-operator-templatesd-files"},"name":"etc-clickhouse-operator-templatesd-folder"},{"configMap":{"name":"etc-clickhouse-operator-usersd-files"},"name":"etc-clickhouse-operator-usersd-folder"}]}}}}
    creationTimestamp: "2023-07-17T22:00:10Z"
    generation: 1
    labels:
      app: clickhouse-operator
      clickhouse.altinity.com/app: chop
      clickhouse.altinity.com/chop: 0.21.2
      clickhouse.altinity.com/chop-commit: 32ef0fa
      clickhouse.altinity.com/chop-date: 2023-06-29T09.08.10
    name: clickhouse-operator
    namespace: clickhouse-operator
    resourceVersion: "105804355"
    uid: cf3a2bae-4846-4aae-b191-452eec79ca3e
    spec:
    progressDeadlineSeconds: 600
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        app: clickhouse-operator
    strategy:
      rollingUpdate:
        maxSurge: 25%
        maxUnavailable: 25%
      type: RollingUpdate
    template:
      metadata:
        annotations:
          prometheus.io/port: "8888"
          prometheus.io/scrape: "true"
        creationTimestamp: null
        labels:
          app: clickhouse-operator
      spec:
        containers:
        - env:
          - name: OPERATOR_POD_NODE_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: spec.nodeName
          - name: OPERATOR_POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: OPERATOR_POD_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: OPERATOR_POD_IP
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: status.podIP
          - name: OPERATOR_POD_SERVICE_ACCOUNT
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: spec.serviceAccountName
          - name: OPERATOR_CONTAINER_CPU_REQUEST
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: requests.cpu
          - name: OPERATOR_CONTAINER_CPU_LIMIT
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: limits.cpu
          - name: OPERATOR_CONTAINER_MEM_REQUEST
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: requests.memory
          - name: OPERATOR_CONTAINER_MEM_LIMIT
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: limits.memory
          image: altinity/clickhouse-operator:latest
          imagePullPolicy: Always
          name: clickhouse-operator
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /etc/clickhouse-operator
            name: etc-clickhouse-operator-folder
          - mountPath: /etc/clickhouse-operator/conf.d
            name: etc-clickhouse-operator-confd-folder
          - mountPath: /etc/clickhouse-operator/config.d
            name: etc-clickhouse-operator-configd-folder
          - mountPath: /etc/clickhouse-operator/templates.d
            name: etc-clickhouse-operator-templatesd-folder
          - mountPath: /etc/clickhouse-operator/users.d
            name: etc-clickhouse-operator-usersd-folder
        - env:
          - name: OPERATOR_POD_NODE_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: spec.nodeName
          - name: OPERATOR_POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: OPERATOR_POD_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: OPERATOR_POD_IP
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: status.podIP
          - name: OPERATOR_POD_SERVICE_ACCOUNT
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: spec.serviceAccountName
          - name: OPERATOR_CONTAINER_CPU_REQUEST
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: requests.cpu
          - name: OPERATOR_CONTAINER_CPU_LIMIT
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: limits.cpu
          - name: OPERATOR_CONTAINER_MEM_REQUEST
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: requests.memory
          - name: OPERATOR_CONTAINER_MEM_LIMIT
            valueFrom:
              resourceFieldRef:
                containerName: clickhouse-operator
                divisor: "0"
                resource: limits.memory
          image: altinity/metrics-exporter:latest
          imagePullPolicy: Always
          name: metrics-exporter
          ports:
          - containerPort: 8888
            name: metrics
            protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /etc/clickhouse-operator
            name: etc-clickhouse-operator-folder
          - mountPath: /etc/clickhouse-operator/conf.d
            name: etc-clickhouse-operator-confd-folder
          - mountPath: /etc/clickhouse-operator/config.d
            name: etc-clickhouse-operator-configd-folder
          - mountPath: /etc/clickhouse-operator/templates.d
            name: etc-clickhouse-operator-templatesd-folder
          - mountPath: /etc/clickhouse-operator/users.d
            name: etc-clickhouse-operator-usersd-folder
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        serviceAccount: clickhouse-operator
        serviceAccountName: clickhouse-operator
        terminationGracePeriodSeconds: 30
        volumes:
        - configMap:
            defaultMode: 420
            name: etc-clickhouse-operator-files
          name: etc-clickhouse-operator-folder
        - configMap:
            defaultMode: 420
            name: etc-clickhouse-operator-confd-files
          name: etc-clickhouse-operator-confd-folder
        - configMap:
            defaultMode: 420
            name: etc-clickhouse-operator-configd-files
          name: etc-clickhouse-operator-configd-folder
        - configMap:
            defaultMode: 420
            name: etc-clickhouse-operator-templatesd-files
          name: etc-clickhouse-operator-templatesd-folder
        - configMap:
            defaultMode: 420
            name: etc-clickhouse-operator-usersd-files
          name: etc-clickhouse-operator-usersd-folder
    status:
    availableReplicas: 1
    conditions:
    - lastTransitionTime: "2023-07-17T22:00:10Z"
      lastUpdateTime: "2023-07-17T22:00:15Z"
      message: ReplicaSet "clickhouse-operator-7b7fb5dc7b" has successfully progressed.
      reason: NewReplicaSetAvailable
      status: "True"
      type: Progressing
    - lastTransitionTime: "2023-07-18T15:16:55Z"
      lastUpdateTime: "2023-07-18T15:16:55Z"
      message: Deployment has minimum availability.
      reason: MinimumReplicasAvailable
      status: "True"
      type: Available
    observedGeneration: 1
    readyReplicas: 1
    replicas: 1
    updatedReplicas: 1
    kind: List
    metadata:
    resourceVersion: ""
  2. kubectl get crd clickhouseinstallations.clickhouse.altinity.com -o jsonpath='{.metadata.labels}'

    {"clickhouse.altinity.com/chop":"0.21.2"}
hodgesrm commented 1 year ago

Just curious @shenzhu, is the indentation of admin/password correct in the original YAML or is that an artifact of the diff?

hodgesrm commented 1 year ago

@sunsingerus ^ any ideas?

Slach commented 1 year ago

@shenzhu are you sure you have only one instance of clickhouse-operator? Could you share the output of kubectl get deploy --all-namespaces | grep operator?

Slach commented 1 year ago

Do you have any ClickHouseInstallationTemplate resources?

kubectl get chit --all-namespaces

sunsingerus commented 1 year ago

@shenzhu @hodgesrm @Slach Our current view of the situation is as follows. Post-mortem:

  1. EKS assigns some kind of default LoadBalancerClass to a Service when it is created.
  2. The operator does not migrate this LoadBalancerClass at the moment (this behavior is subject to modification) and tries to update the Service with the default 'missing' value.
  3. When the update fails, the operator tries to recreate the Service: it deletes the existing Service and attempts to re-create a copy of the previous one with internal fields specified (including the IP address). This behavior is subject to modification as well.
  4. This operation fails, as we can see in the logs, because the IP address is already allocated.
  5. The operator fails and gives up. That's why the service is deleted :( Good news: this behavior is fixed in version 0.22.0.

@shenzhu please try version 0.22.0, it is already available

shenzhu commented 1 year ago

Hey team, thanks so much for your help! We will try the new version 0.22.0.

On our side, we tried some workarounds for this issue:

Option 1: We are running Kubernetes on AWS EKS with the AWS Load Balancer Controller. The first fix we tried was setting loadBalancerClass to service.k8s.aws/nlb, something like the following:

...
    serviceTemplates:
      - name: service-clickhouse
        spec:
          loadBalancerClass: service.k8s.aws/nlb
          ports:
            - name: http
              port: 8123
            - name: client
              port: 9000
          type: LoadBalancer

Option 2: Instead of relying on a Kubernetes LoadBalancer for external access, we tried adding an Ingress layer in front. In this case the Service doesn't have to be a LoadBalancer, so we changed it to ClusterIP and pointed the Ingress at it. Something like the following:

...
    serviceTemplates:
      - name: service
        spec:
          ports:
            - name: http
              port: 8123
            - name: client
              port: 9000
          type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: clickhouse
spec:
  ingressClassName: nginx-internal
  rules:
    - host: small.clickhouse.cluster
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: clickhouse-small
                port:
                  number: 8123

cc. @hodgesrm @sunsingerus @Slach

alazyer commented 1 year ago

> [quoted shenzhu's workaround comment above]

I wonder how to connect to ClickHouse via the exposed Ingress. I did the same thing and want to expose ClickHouse for external access, but I can't connect to the server via clickhouse-client -h /clickhouse, since the path of my Ingress is /clickhouse. Any suggestions?

shenzhu commented 1 year ago

Hi @alazyer, I think the host specified in the Ingress won't resolve automatically; the underlying infrastructure (e.g. DNS) needs to be updated to point that hostname at the Ingress controller.
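Also note that clickhouse-client speaks ClickHouse's native TCP protocol (port 9000), which a plain HTTP Ingress cannot proxy, and its -h flag takes a hostname, not a URL path. Through an HTTP Ingress you would use the HTTP interface (port 8123) instead, roughly like this (the hostname is illustrative and assumes DNS or /etc/hosts points it at the ingress controller):

```shell
# Run a query over the ClickHouse HTTP interface through the Ingress.
# Hostname must match the Ingress rule's host and resolve to the controller.
echo 'SELECT 1' | curl -sS --data-binary @- 'http://small.clickhouse.cluster/'
```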

alex-zaitsev commented 1 year ago

It has been reproduced and fixed in 0.22