apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0
[Bug] Doris 1.2.4.1 deployed on k8s cannot discover compute nodes #19865

Open anthony-yau opened 1 year ago

anthony-yau commented 1 year ago

Version

1.2.4.1

What's Wrong?

FE YAML configuration:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

apiVersion: v1
kind: Service
metadata:
  name: doris-follower-cluster1
  labels:
    app: doris-follower-cluster1
spec:
  ports:
    - port: 8030
      name: http-port
    - port: 9020
      name: rpc-port
    - port: 9030
      name: query-port
    - port: 9010
      name: edit-log-port #This name should be fixed. Doris will get the port information through this name
  clusterIP: None
  selector:
    app: doris-follower-cluster1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: doris-follower-cluster1
  labels:
    app: doris-follower-cluster1
spec:
  selector:
    matchLabels:
      app: doris-follower-cluster1
  serviceName: doris-follower-cluster1
  replicas: 3
  template:
    metadata:
      name: doris-follower-cluster1
      labels:
        app: doris-follower-cluster1
    spec:
      containers:
        - name: doris-follower-cluster1
          #Replace with the real image information
          image: mirrors.aliyun.com/doris-fe:1.2.3
          imagePullPolicy: IfNotPresent
          env:
            #Specify the startup type as k8s to bypass some restrictions of the official image initialization script
            - name: BUILD_TYPE
              value: "k8s"
            #Initialize the fe of three nodes
            - name: FE_INIT_NUMBER
              value: "3"
            #ServiceName of the backend_cn node (if there is no backend_cn node, do not configure this environment variable)
            - name: CN_SERVICE
              value: "doris-cn-cluster1"
            #StatefulSetName of the backend_cn node (if there is no backend_cn node, do not configure this environment variable)
            - name: CN_STATEFULSET
              value: "doris-cn-cluster1"
            #ServiceName of the backend node (if there is no backend node, do not configure this environment variable)
            - name: BE_SERVICE
              value: "doris-be-cluster1"
            #StatefulSetName of the backend node (if there is no backend node, do not configure this environment variable)
            - name: BE_STATEFULSET
              value: "doris-be-cluster1"
            #ServiceName of the follower node (if there is no follower node, do not configure this environment variable)
            - name: FE_SERVICE
              value: "doris-follower-cluster1"
            #StatefulSetName of the follower node (if there is no follower node, do not configure this environment variable)
            - name: FE_STATEFULSET
              value: "doris-follower-cluster1"
            - name: APP_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - containerPort: 8030
              name: http-port
            - containerPort: 9020
              name: rpc-port
            - containerPort: 9030
              name: query-port
            - containerPort: 9010
              name: edit-log-port
          volumeMounts:
            #Mount the configuration file in the way of configmap
            - name: conf
              mountPath: /opt/apache-doris/fe/conf
              #In order to call the api of k8s
            - name: kube
              mountPath: /root/.kube/config
              readOnly: true
      volumes:
        - name: conf
          configMap:
            name: follower-conf
        - name: kube
          hostPath:
            path: /root/.kube/config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: follower-conf
data:
  fe.conf: |
    # priority_networks = 172.16.0.0/24
    #It can automatically maintain node information by getting the number of replicas of StatefulSet, similar to alter system add/drop back
    enable_deploy_manager = k8s
    #Automatically adjust the IP of the node according to the domain name (for example, after the pod is restarted, the domain name is still doris-be-cluster1-0-doris-be-cluster1.default.svc.cluster.local, but the IP may change from 172.16.0.9 to 172.16.0.10)
    enable_fqdn_mode = true
    LOG_DIR = ${DORIS_HOME}/log
    sys_log_level = INFO
    http_port = 8030
    rpc_port = 9020
    query_port = 9030
    edit_log_port = 9010
    mysql_service_nio_enabled = true
    #Doris needs to generate the log4j configuration file according to the fe.yml configuration information, which is written in the same directory as fe.yml by default, but the config we mount is readonly, so specify this configuration to write the log4j file to another location
    custom_config_dir = /opt/apache-doris/
    #When set to false, the backend will not be dropped and will remain in the DECOMMISSION state
    drop_backend_after_decommission = false

Backend CN YAML:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

apiVersion: v1
kind: Service
metadata:
  name: doris-cn-cluster1
  labels:
    app: doris-cn-cluster1
spec:
  ports:
    - port: 9060
      name: be-port
    - port: 8040
      name: webserver-port
    - port: 9050
      name: heartbeat-port #This name should be fixed. Doris will get the port information through this name
    - port: 8060
      name: brpc-port
  clusterIP: None
  selector:
    app: doris-cn-cluster1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: doris-cn-cluster1
  labels:
    app: doris-cn-cluster1
spec:
  selector:
    matchLabels:
      app: doris-cn-cluster1
  serviceName: doris-cn-cluster1
  replicas: 3
  template:
    metadata:
      name: doris-cn-cluster1
      labels:
        app: doris-cn-cluster1
    spec:
      containers:
        - name: doris-cn-cluster1
          #Replace with the real image information
          image: apache-doris-be:test
          imagePullPolicy: IfNotPresent
          env:
            #Specify the startup type as k8s to bypass some restrictions of the official image initialization script
            - name: BUILD_TYPE
              value: "k8s"
          ports:
            - containerPort: 9060
              name: be-port
            - containerPort: 8040
              name: webserver-port
            - containerPort: 9050
              name: heartbeat-port
            - containerPort: 8060
              name: brpc-port
          volumeMounts:
              #Mount the configuration file in the way of configmap
            - name: conf
              mountPath: /opt/apache-doris/be/conf
              #If not mounted, an error is reported when enable_profile is on and data is queried from a JDBC catalog
              #Error message: error setting certificate verify locations: CAfile:/etc/pki/tls/certs/ca-bundle.crt CApath: none
            - name: sys
              mountPath: /etc/pki
              readOnly: true
      volumes:
        - name: conf
          configMap:
            name: cn-conf
        - name: sys
          hostPath:
            path: /etc/pki
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cn-conf
data:
  be.conf: |
    PPROF_TMPDIR="$DORIS_HOME/log/"
    sys_log_level = INFO

    be_port = 9060
    webserver_port = 8040
    heartbeat_service_port = 9050
    brpc_port = 8060
    #Set the node type to compute node
    be_node_role = computation
    priority_networks = 172.16.0.0/24

After creating the FE and backend CN configurations, I checked the FE master log. The CN nodes were not added:

2023-05-19 07:45:50,491 INFO (tablet checker|28) [TabletChecker.checkTablets():331] finished to check tablets. unhealth/total/added/in_sched/not_ready: 0/0/0/0/0, cost: 0 ms
2023-05-19 07:45:51,439 INFO (deployManager|40) [K8sDeployManager.getGroupHostPorts():155] get host port from group: doris-follower-cluster1: [xxx:9010, xxx:9010, xxx:9010]
2023-05-19 07:45:51,442 WARN (deployManager|40) [K8sDeployManager.getGroupHostPorts():128] get null endpoints of namespace default in service: doris-be-cluster1
2023-05-19 07:45:56,447 INFO (deployManager|40) [K8sDeployManager.getGroupHostPorts():155] get host port from group: doris-follower-cluster1: [xxx:9010, xxx:9010, xxx:9010]
2023-05-19 07:45:56,450 WARN (deployManager|40) [K8sDeployManager.getGroupHostPorts():128] get null endpoints of namespace default in service: doris-be-cluster1
2023-05-19 07:46:01,454 INFO (deployManager|40) [K8sDeployManager.getGroupHostPorts():155] get host port from group: doris-follower-cluster1: [xxx:9010, xxx:9010, xxx:9010]
2023-05-19 07:46:01,458 WARN (deployManager|40) [K8sDeployManager.getGroupHostPorts():128] get null endpoints of namespace default in service: doris-be-cluster1
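To see at a glance which Services the deploy manager could not resolve, the WARN lines in the log above can be summarized. This is only a minimal sketch: the regular expression is written against the message text shown in this issue, and the function name `services_with_null_endpoints` and the `SAMPLE_LOG` excerpt are illustrative, not part of Doris.

```python
import re

# Matches the WARN message printed by K8sDeployManager in the log above.
# The pattern is an assumption based only on the text shown in this issue.
NULL_ENDPOINTS = re.compile(
    r"get null endpoints of namespace (?P<namespace>\S+) in service: (?P<service>\S+)"
)


def services_with_null_endpoints(log_text):
    """Return the sorted, de-duplicated (namespace, service) pairs reported without endpoints."""
    return sorted(
        {(m["namespace"], m["service"]) for m in NULL_ENDPOINTS.finditer(log_text)}
    )


# Excerpt copied from the FE master log in this issue.
SAMPLE_LOG = """\
2023-05-19 07:45:51,439 INFO (deployManager|40) [K8sDeployManager.getGroupHostPorts():155] get host port from group: doris-follower-cluster1: [xxx:9010, xxx:9010, xxx:9010]
2023-05-19 07:45:51,442 WARN (deployManager|40) [K8sDeployManager.getGroupHostPorts():128] get null endpoints of namespace default in service: doris-be-cluster1
2023-05-19 07:45:56,450 WARN (deployManager|40) [K8sDeployManager.getGroupHostPorts():128] get null endpoints of namespace default in service: doris-be-cluster1
"""

if __name__ == "__main__":
    for namespace, service in services_with_null_endpoints(SAMPLE_LOG):
        print(f"{namespace}/{service}")
```

In this excerpt the only Service reported is `doris-be-cluster1` in namespace `default`. No line in the posted log mentions `doris-cn-cluster1`.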

show backends returns no nodes:

mysql> show backends\G;
Empty set (0.15 sec)

ERROR: No query specified
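The FE manifest sets `BE_SERVICE` to `doris-be-cluster1`, but the two manifests posted here only define the Services `doris-follower-cluster1` and `doris-cn-cluster1`, and the manifest's own comments say not to set a variable when the corresponding node type does not exist. The sketch below cross-checks the `*_SERVICE` values against the defined Services. The values are copied by hand from the manifests above, and the helper `missing_services` is hypothetical. Whether this mismatch is what prevents the CN nodes from being added is an assumption that the thread does not confirm.

```python
# *_SERVICE environment variables from the FE StatefulSet posted above.
FE_ENV = {
    "CN_SERVICE": "doris-cn-cluster1",
    "BE_SERVICE": "doris-be-cluster1",
    "FE_SERVICE": "doris-follower-cluster1",
}

# Services actually defined in the two manifests posted in this issue.
DEFINED_SERVICES = {"doris-follower-cluster1", "doris-cn-cluster1"}


def missing_services(env, defined):
    """Return {env_var: service_name} for each *_SERVICE variable with no matching Service."""
    return {
        name: value
        for name, value in env.items()
        if name.endswith("_SERVICE") and value not in defined
    }


if __name__ == "__main__":
    for name, value in missing_services(FE_ENV, DEFINED_SERVICES).items():
        print(f"{name} -> {value}: no Service with this name in the posted manifests")
```

For the posted manifests this flags only `BE_SERVICE`, which is the same Service name that appears in the WARN lines of the FE log.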

What You Expected?

The CN nodes should be added automatically.

How to Reproduce?

No response

Anything Else?

No response

1HanJing commented 1 year ago

io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Endpoints] with name: [doris-follower-cluster1] in namespace: [xxxxxxxxxxxx] failed.

Have you run into a similar problem on k8s? I added the following environment variable to my fe.yml:

  - name: APP_NAMESPACE
    value: test

anthony-yau commented 1 year ago

io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Endpoints] with name: [doris-follower-cluster1] in namespace: [xxxxxxxxxxxx] failed.

Have you run into a similar problem on k8s? I added the following environment variable to my fe.yml:

  - name: APP_NAMESPACE
    value: test

I hit that as well. You can update the BE startup script; for reference, see: https://mp.weixin.qq.com/s?__biz=MzU3Njc2MjAyNg==&mid=2247486261&idx=1&sn=de7ad06fbd02d537550c671663ab463e&chksm=fd0fb2f0ca783be6d1cb0f1a3b4eeae7def33adf68103b0c0cb8d35c6f3456faa06b7ffe87ab&scene=178&cur_album_id=1690280491406376964#rd