druid-io / druid-operator

Druid Kubernetes Operator

Coordinator/Overlord not being initialized with config #25

Closed: EmiPhil closed this issue 4 years ago

EmiPhil commented 4 years ago

I'm having a hard time getting the operator to launch a working cluster. From what I can tell from the logs, the coordinator and overlord do not seem to be getting initialized with the config in the YAML deployment (attached as help.txt). The other parts of the cluster seem to be able to find each other.

help.txt

Logs for coordinator:

 2020-03-09T20:21:01.946365074Z 2020-03-09T20:21:01+0000 startup service coordinator
2020-03-09T20:21:01.971997182Z Setting druid.host=10.44.2.78 in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
2020-03-09T20:21:03.19655891Z ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 internal initialization logging. 

Logs for overlord:

 2020-03-09T21:58:57+0000 startup service overlord
Setting druid.host=10.44.3.7 in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 internal initialization logging. 

Logs for broker:

logs-from-druid-cluster-brokers-in-druid-cluster-brokers-0.txt


At first I was using the prebuilt operator image from Docker Hub and ran into the same errors. I am now using a Docker image built locally from a clone of the repo.

The error also shows up if I use image: "apache/incubator-druid:0.16.0-incubating" in the configs.

I'm running out of ideas for what could be going wrong. Help would be greatly appreciated!

AdheipSingh commented 4 years ago

Are you using the latest code in master? I am running with the 0.16 image and it works great. Your config seems fine apart from autoscaler, which has been renamed to hpAutoscaler. I have a YAML that is running in one of my environments; do have a look at it, it runs perfectly: https://gist.github.com/AdheipSingh/cee71aecc1a0cd0f6ccf4cbcb324fc4d. Since the broker is not able to reach the coordinator, can you confirm that the k8s endpoints for the coordinator are up? Can you also check whether your ZooKeeper is running fine?
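
A rough sketch of the checks I mean (the service and pod names are taken from your config and may differ in your cluster):

kubectl get endpoints | grep coordinator   # should list a pod IP, not <none>
kubectl get pods | grep zookeeper          # all zk pods should be Running and Ready
# ZooKeeper liveness: "imok" means the server is healthy
# (requires nc in the image and the ruok four-letter word enabled)
kubectl exec zk-zookeeper-0 -- sh -c 'echo ruok | nc localhost 2181'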

EmiPhil commented 4 years ago

I can see the service being created and pointing to the correct pod at port 8081 for the coordinator. I think the underlying issue is that it is using the config in /conf/druid/cluster/master/coordinator-overlord/runtime.properties instead of /conf/druid/cluster/master/coordinator/runtime.properties. If I log into the coordinator server and look at the configs:

/opt/apache-druid-0.17.0 $ cat conf/zk/zoo.cfg
#
# Server
#

tickTime=2000
dataDir=var/zk
clientPort=2181
initLimit=5
syncLimit=2

#
# Autopurge
#

autopurge.snapRetainCount=5
autopurge.purgeInterval=1

/opt/apache-druid-0.17.0 $ cat conf/druid/cluster/master/coordinator-overlord/runtime.properties 
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

druid.service=druid/coordinator
druid.plaintextPort=8081

druid.coordinator.startDelay=PT10S
druid.coordinator.period=PT5S

# Run the overlord service in the coordinator process
druid.coordinator.asOverlord.enabled=true
druid.coordinator.asOverlord.overlordService=druid/overlord

druid.indexer.queue.startDelay=PT5S

druid.indexer.runner.type=remote
druid.indexer.storage.type=metadata
/opt/apache-druid-0.17.0 $ cat conf/druid/cluster/master/coordinator/runtime.properties 
druid.port=8081
druid.service=druid/coordinator
druid.coordinator.startDelay=PT30S
druid.coordinator.period=PT30S
druid.coordinator.kill.on=true
druid.coordinator.kill.period=PT2H
druid.coordinator.kill.durationToRetain=PT0s
druid.coordinator.kill.maxSegments=5000

What should I look for to confirm zookeeper health?

AdheipSingh commented 4 years ago

Yes, keep the mount path at /conf/druid/cluster/master/coordinator/runtime.properties; it should work then. Since service discovery goes through ZooKeeper, also check its logs. I faced issues when I was recreating stacks for testing, so it's better to purge ZooKeeper each time you test new configs.
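
For example, to purge Druid's znodes between test runs (this assumes zkCli.sh is available inside the ZooKeeper pod, druid.zk.paths.base=/druid, and ZooKeeper 3.5+; older versions use rmr instead of deleteall):

# Drop Druid's service-discovery state so stale entries don't survive a stack recreate
kubectl exec -it zk-zookeeper-0 -- zkCli.sh -server localhost:2181 deleteall /druid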

EmiPhil commented 4 years ago

I think that druid.sh is choosing the wrong runtime.properties:

/opt/apache-druid-0.17.0 $ cat /druid.sh
#!/bin/sh

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# NOTE: this is a 'run' script for the stock tarball
# It takes 1 required argument (the name of the service,
# e.g. 'broker', 'historical' etc). Any additional arguments
# are passed to that service.
#
# It accepts 'JAVA_OPTS' as an environment variable
#
# Additional env vars:
# - DRUID_LOG4J -- set the entire log4j.xml verbatim
# - DRUID_LOG_LEVEL -- override the default log level in default log4j
# - DRUID_XMX -- set Java Xmx
# - DRUID_XMS -- set Java Xms
# - DRUID_MAXNEWSIZE -- set Java max new size
# - DRUID_NEWSIZE -- set Java new size
# - DRUID_MAXDIRECTMEMORYSIZE -- set Java max direct memory size
#
# - DRUID_CONFIG -- full path to a file for druid 'common' properties
# - DRUID_CONFIG_${service} -- full path to a file for druid 'service' properties

set -e
SERVICE="$1"

echo "$(date -Is) startup service $SERVICE"

# We put all the config in /tmp/conf to allow for a
# read-only root filesystem
mkdir -p /tmp/conf/
cp -r /opt/druid/conf/druid /tmp/conf/druid

getConfPath() {
    cluster_conf_base=/tmp/conf/druid/cluster
    case "$1" in
    _common) echo $cluster_conf_base/_common ;;
    historical) echo $cluster_conf_base/data/historical ;;
    middleManager) echo $cluster_conf_base/data/middleManager ;;
    coordinator | overlord) echo $cluster_conf_base/master/coordinator-overlord ;;
    broker) echo $cluster_conf_base/query/broker ;;
    router) echo $cluster_conf_base/query/router ;;
    esac
}
COMMON_CONF_DIR=$(getConfPath _common)
SERVICE_CONF_DIR=$(getConfPath ${SERVICE})

[...]

I tried to change the env variable DRUID_CONFIG_${service} but it seems to have no effect.

As far as I can tell, it isn't even getting to the point where it would try to talk to zookeeper because it's using the wrong runtime.properties.
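
To illustrate, copying the getConfPath function above into a shell and calling it shows that both service names collapse to the same directory, so a runtime.properties mounted under master/coordinator or master/overlord is never read:

$ getConfPath coordinator
/tmp/conf/druid/cluster/master/coordinator-overlord
$ getConfPath overlord
/tmp/conf/druid/cluster/master/coordinator-overlord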

AdheipSingh commented 4 years ago

Did you try changing the mount path to /conf/druid/cluster/master/coordinator/runtime.properties? The druid.sh is the same one I am using. Did you try the tiny-cluster.yaml in examples?

himanshug commented 4 years ago

@EmiPhil haven't had a chance to look in detail but noticed....

druid.zk.service.host=zk-zookeeper-headless.default.svc.cluster.local: is that the name of a headless service covering all ZooKeeper pods, or do you have just one ZooKeeper pod in the quorum? If you have multiple ZooKeeper pods, you need to explicitly provide the list of all ZooKeeper pods, not the name of the headless service.
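
You can see the difference from inside the cluster with a throwaway busybox pod (the names below assume a 3-node zk-zookeeper StatefulSet):

# Each StatefulSet pod has its own DNS record under the headless service...
kubectl run -it --rm dns-test --image=busybox --restart=Never -- nslookup zk-zookeeper-0.zk-zookeeper-headless.default.svc.cluster.local
# ...while the headless service name itself resolves to all pod IPs at once
kubectl run -it --rm dns-test --image=busybox --restart=Never -- nslookup zk-zookeeper-headless.default.svc.cluster.local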

EmiPhil commented 4 years ago

@AdheipSingh Yeah, the nodeConfigMountPath has always been /conf/druid/cluster/master/coordinator. When I log into the pod I can see the configuration being put into that folder, but there is also another runtime.properties file in /conf/druid/cluster/master/coordinator-overlord, and Druid seems to prefer that configuration.

@himanshug

Yep, I have multiple. I fixed that in this new config, but the issue is still occurring.

Latest configuration:

apiVersion: druid.apache.org/v1alpha1
kind: Druid
metadata:
  name: cluster
spec:
  image: apache/druid:0.17.0
  env:
    - name: GOOGLE_APPLICATION_CREDENTIALS
      value: /secrets/GOOGLE_APPLICATION_CREDENTIALS
  startScript: /druid.sh
  securityContext:
    fsGroup: 1000
    runAsUser: 1000
    runAsGroup: 1000
  services:
    - spec:
        type: ClusterIP
        clusterIP: None
  commonConfigMountPath: "/opt/druid/conf/druid/cluster/_common"
  jvm.options: |-
    -server
    -XX:+PrintFlagsFinal
    -XX:MaxDirectMemorySize=10240g
    -XX:+UnlockExperimentalVMOptions
    -XX:+UseCGroupMemoryLimitForHeap
    -Duser.timezone=UTC
    -Dfile.encoding=UTF-8
    -Dlog4j.debug
    -XX:+ExitOnOutOfMemoryError
    -XX:HeapDumpPath=/druid/data/logs
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:+UseG1GC
    -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
    -XX:+UnlockDiagnosticVMOptions
    -XX:+PrintSafepointStatistics
    -XX:PrintSafepointStatisticsCount=1
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -XX:+PrintGCApplicationStoppedTime
    -XX:+PrintGCApplicationConcurrentTime
    -XX:+UseGCLogFileRotation
    -XX:NumberOfGCLogFiles=50
    -XX:GCLogFileSize=50m
    -Xloggc:/druid/data/logs/gc.log
  common.runtime.properties: |
    #
    # Monitoring
    #
    druid.monitoring.monitors=["org.apache.druid.java.util.metrics.JvmMonitor"]
    #druid.emitter=noop
    druid.emitter.logging.logLevel=debug

    #
    # Extensions
    #
    druid.extensions.loadList=["druid-google-extensions","druid-kafka-indexing-service","druid-datasketches","postgresql-metadata-storage","druid-protobuf-extensions","druid-stats"]

    # Log all runtime properties on startup. Disable to avoid logging properties on startup:
    druid.startup.logging.logProperties=true
    #
    # Service discovery
    #
    druid.selectors.indexing.serviceName=druid/overlord
    druid.selectors.coordinator.serviceName=druid/coordinator
    druid.sql.enable=true
  deepStorage:
    spec:
      properties: |-
        druid.storage.type=google
        druid.google.bucket=bucket
        druid.indexer.logs.directory=data/logs/
    type: default
  metadataStore:
    spec:
      properties: |-
        druid.metadata.storage.type=postgresql
        druid.metadata.storage.connector.connectURI=jdbc:postgresql://host
        druid.metadata.postgres.ssl.useSSL=true
        druid.metadata.postgres.ssl.sslMode="verify-ca"
        druid.metadata.postgres.ssl.sslCert="/secrets/client-cert.pem"
        druid.metadata.postgres.ssl.sslKey="/secrets/client-key.pem"
        druid.metadata.postgres.ssl.sslRootCert="/secrets/server-ca.pem"
        druid.metadata.storage.connector.user=user
        druid.metadata.storage.connector.password=password
        druid.metadata.storage.connector.createTables=true
        druid.metadata.postgres.dbTableSchema=schema
    type: default
  zookeeper:
    spec:
      properties: |-
        druid.zk.service.host=zk-zookeeper-0.zk-zookeeper-headless.default.svc.cluster.local,zk-zookeeper-1.zk-zookeeper-headless.default.svc.cluster.local,zk-zookeeper-2.zk-zookeeper-headless.default.svc.cluster.local
        druid.zk.paths.base=/druid
    type: default
  nodes:
    brokers:
      nodeType: "broker"
      druid.port: 8082
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/query/broker"
      podDisruptionBudgetSpec:
        maxUnavailable: 1
      replicas: 1
      runtime.properties: |
        druid.service=druid/broker
        druid.plaintextPort=8082
        # HTTP server settings
        druid.server.http.numThreads=25
        # HTTP client settings
        druid.broker.http.numConnections=5
        # Processing threads and buffers
        druid.processing.buffer.sizeBytes=1073741824
        druid.processing.numThreads=1
        druid.processing.tmpDir=var/druid/processing
        druid.broker.retryPolicy.numTries=3
      log4j.config: |-
        <Configuration status="WARN">
          <Appenders>
            <Console name="logline" target="SYSTEM_OUT">
              <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
            <Console name="msgonly" target="SYSTEM_OUT">
              <PatternLayout pattern="%m%n"/>
            </Console>
          </Appenders>
          <Loggers>
            <Root level="info">
              <AppenderRef ref="logline"/>
            </Root>
            <Logger name="org.apache.druid.java.util.emitter.core.LoggingEmitter" additivity="false" level="debug">
              <AppenderRef ref="msgonly"/>
            </Logger>
          </Loggers>
        </Configuration>
      extra.jvm.options: |-
        -Xmx2G
        -Xms2G
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            storageClassName: standard
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
        - mountPath: /secrets
          name: secrets
          readOnly: true
      volumes:
        - name: data-volume
          emptyDir: {}
        - name: secrets
          projected:
            sources:
              - secret:
                  name: druid-gcloud-bucket-key
              - secret:
                  name: cloud-sql
      resources:
        requests:
          memory: "6G"
          cpu: "1"
        limits:
          memory: "6G"
          cpu: "1"
      livenessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8082
      readinessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8082
      services:
        - metadata:
            name: broker-%s-service
          spec:
            clusterIP: None
            ports:
              - name: tcp-service-port
                port: 8082
                targetPort: 8082
            type: ClusterIP
      hpAutoscaler:
        maxReplicas: 10
        minReplicas: 1
        scaleTargetRef:
          apiVersion: apps/v1
          kind: StatefulSet
          name: druid-cluster-brokers
        metrics:
          - type: Resource
            resource:
              name: cpu
              targetAverageUtilization: 60
          - type: Resource
            resource:
              name: memory
              targetAverageUtilization: 60

    coordinators:
      nodeType: "coordinator"
      druid.port: 8081
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/master/coordinator"
      replicas: 1
      podDisruptionBudgetSpec:
        maxUnavailable: 1
      runtime.properties: |
        druid.service=druid/coordinator
        druid.coordinator.startDelay=PT30S
        druid.coordinator.period=PT30S
        druid.coordinator.kill.on=true
        druid.coordinator.kill.period=PT2H
        druid.coordinator.kill.durationToRetain=PT0s
        druid.coordinator.kill.maxSegments=5000
      log4j.config: |-
        <Configuration status="WARN">
          <Appenders>
            <Console name="logline" target="SYSTEM_OUT">
              <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
            <Console name="msgonly" target="SYSTEM_OUT">
              <PatternLayout pattern="%m%n"/>
            </Console>
          </Appenders>
          <Loggers>
            <Root level="info">
              <AppenderRef ref="logline"/>
            </Root>
            <Logger name="org.apache.druid.java.util.emitter.core.LoggingEmitter" additivity="false" level="debug">
              <AppenderRef ref="msgonly"/>
            </Logger>
          </Loggers>
        </Configuration>
      services:
        - metadata:
            name: coordinator-%s-service
          spec:
            clusterIP: None
            ports:
              - name: tcp-service-port
                port: 8081
                targetPort: 8081
            type: ClusterIP
      extra.jvm.options: |-
        -Xmx1G
        -Xms1G
      livenessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8081
      readinessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8081
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            storageClassName: standard
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
        - mountPath: /secrets
          name: secrets
          readOnly: true
      volumes:
        - name: data-volume
          emptyDir: {}
        - name: secrets
          projected:
            sources:
              - secret:
                  name: druid-gcloud-bucket-key
              - secret:
                  name: cloud-sql
      resources:
        limits:
          cpu: "1"
          memory: 6G
        requests:
          cpu: "1"
          memory: 6G
      hpAutoscaler:
        maxReplicas: 10
        minReplicas: 1
        scaleTargetRef:
          apiVersion: apps/v1
          kind: StatefulSet
          name: druid-cluster-coordinators
        metrics:
          - type: Resource
            resource:
              name: cpu
              targetAverageUtilization: 60
          - type: Resource
            resource:
              name: memory
              targetAverageUtilization: 60

    historicals:
      nodeType: "historical"
      druid.port: 8083
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/data/historical"
      podDisruptionBudgetSpec:
        maxUnavailable: 1
      replicas: 1
      livenessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8083
      readinessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8083
      runtime.properties: |
        druid.service=druid/historical
        druid.server.http.numThreads=10
        druid.processing.buffer.sizeBytes=1073741824
        druid.processing.numMergeBuffers=1
        druid.processing.numThreads=2
        # Segment storage
        druid.segmentCache.locations=[{\"path\":\"/druid/data/segments\",\"maxSize\":1099511627776}]
        druid.server.maxSize=1099511627776
      log4j.config: |-
        <Configuration status="WARN">
          <Appenders>
            <Console name="logline" target="SYSTEM_OUT">
              <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
            <Console name="msgonly" target="SYSTEM_OUT">
              <PatternLayout pattern="%m%n"/>
            </Console>
          </Appenders>
          <Loggers>
            <Root level="info">
              <AppenderRef ref="logline"/>
            </Root>
            <Logger name="org.apache.druid.java.util.emitter.core.LoggingEmitter" additivity="false" level="debug">
              <AppenderRef ref="msgonly"/>
            </Logger>
          </Loggers>
        </Configuration>
      extra.jvm.options: |-
        -Xmx1G
        -Xms1G
      services:
        - spec:
            clusterIP: None
            ports:
              - name: tcp-service-port
                port: 8083
                targetPort: 8083
            type: ClusterIP
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 200Gi
            storageClassName: ssd
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
        - mountPath: /secrets
          name: secrets
          readOnly: true
      volumes:
        - name: data-volume
          emptyDir: {}
        - name: secrets
          projected:
            sources:
              - secret:
                  name: druid-gcloud-bucket-key
              - secret:
                  name: cloud-sql
      resources:
        limits:
          cpu: "1"
          memory: 8G
        requests:
          cpu: "1"
          memory: 8G

    middlemanagers:
      druid.port: 8091
      extra.jvm.options: |-
        -Xmx4G
        -Xms4G
      nodeType: middleManager
      nodeConfigMountPath: /opt/druid/conf/druid/cluster/data/middlemanager
      podDisruptionBudgetSpec:
        maxUnavailable: 1
      ports:
        - containerPort: 8100
          name: peon-0-pt
        - containerPort: 8101
          name: peon-1-pt
        - containerPort: 8102
          name: peon-2-pt
        - containerPort: 8103
          name: peon-3-pt
        - containerPort: 8104
          name: peon-4-pt
      replicas: 1
      resources:
        limits:
          cpu: "2"
          memory: 5Gi
        requests:
          cpu: "2"
          memory: 5Gi
      livenessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8091
      readinessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8091
      runtime.properties: |-
        druid.service=druid/middleManager
        druid.worker.capacity=4
        druid.indexer.runner.javaOpts=-server -XX:MaxDirectMemorySize=10240g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/druid/data/tmp -Dlog4j.debug -XX:+UnlockDiagnosticVMOptions -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=50 -XX:GCLogFileSize=10m -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -XX:HeapDumpPath=/druid/data/logs/peon.%t.%p.hprof -Xms10G -Xmx10G
        druid.indexer.task.baseTaskDir=/druid/data/baseTaskDir
        druid.server.http.numThreads=10
        druid.indexer.fork.property.druid.processing.buffer.sizeBytes=1
        druid.indexer.fork.property.druid.processing.numMergeBuffers=1
        druid.indexer.fork.property.druid.processing.numThreads=1
        # Processing threads and buffers on Peons
        druid.indexer.fork.property.druid.processing.numMergeBuffers=2
        druid.indexer.fork.property.druid.processing.buffer.sizeBytes=100000000
        druid.indexer.fork.property.druid.processing.numThreads=1
      log4j.config: |-
        <Configuration status="WARN">
          <Appenders>
              <Console name="logline" target="SYSTEM_OUT">
              <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
            <Console name="msgonly" target="SYSTEM_OUT">
              <PatternLayout pattern="%m%n"/>
            </Console>
          </Appenders>
          <Loggers>
            <Root level="info">
              <AppenderRef ref="logline"/>
            </Root>
            <Logger name="org.apache.druid.java.util.emitter.core.LoggingEmitter" additivity="false" level="info">
              <AppenderRef ref="msgonly"/>
            </Logger>
          </Loggers>
        </Configuration>
      services:
        - spec:
            clusterIP: None
            ports:
              - name: tcp-service-port
                port: 8091
                targetPort: 8091
              - name: peon-port-0
                port: 8100
                targetPort: 8100
              - name: peon-port-1
                port: 8101
                targetPort: 8101
              - name: peon-port-2
                port: 8102
                targetPort: 8102
              - name: peon-port-3
                port: 8103
                targetPort: 8103
              - name: peon-port-4
                port: 8104
                targetPort: 8104
            type: ClusterIP
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
            storageClassName: ssd
      volumeMounts:
        - mountPath: /secrets
          name: secrets
          readOnly: true
        - mountPath: /druid/data
          name: data-volume
      volumes:
        - name: secrets
          projected:
            sources:
              - secret:
                  name: druid-gcloud-bucket-key
              - secret:
                  name: cloud-sql
      securityContext:
        fsGroup: 0
        runAsGroup: 0
        runAsUser: 0
      hpAutoscaler:
        maxReplicas: 10
        minReplicas: 1
        scaleTargetRef:
          apiVersion: apps/v1
          kind: StatefulSet
          name: druid-cluster-middlemanagers
        metrics:
          - type: Resource
            resource:
              name: cpu
              targetAverageUtilization: 60
          - type: Resource
            resource:
              name: memory
              targetAverageUtilization: 60

    overlords:
      druid.port: 8090
      extra.jvm.options: |-
        -Xmx4G
        -Xms4G
      nodeType: overlord
      podDisruptionBudgetSpec:
        maxUnavailable: 1
      nodeConfigMountPath: /opt/druid/conf/druid/cluster/master/overlord
      replicas: 1
      resources:
        limits:
          cpu: "2"
          memory: 6Gi
        requests:
          cpu: "2"
          memory: 6Gi
      runtime.properties: |-
        druid.service=druid/overlord
        druid.indexer.queue.startDelay=PT30S
        druid.indexer.runner.type=remote
        druid.indexer.storage.type=metadata
      log4j.config: |-
        <Configuration status="WARN">
          <Appenders>
            <Console name="logline" target="SYSTEM_OUT">
              <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
            <Console name="msgonly" target="SYSTEM_OUT">
              <PatternLayout pattern="%m%n"/>
            </Console>
          </Appenders>
          <Loggers>
            <Root level="info">
              <AppenderRef ref="logline"/>
            </Root>
            <Logger name="org.apache.druid.java.util.emitter.core.LoggingEmitter" additivity="false" level="debug">
              <AppenderRef ref="msgonly"/>
            </Logger>
          </Loggers>
        </Configuration>
      livenessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8081
      readinessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8081
      services:
        - metadata:
            name: overlord-%s-service
          spec:
            clusterIP: None
            ports:
              - name: tcp-service-port
                port: 8090
                targetPort: 8090
            type: ClusterIP
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            storageClassName: standard
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
        - mountPath: /secrets
          name: secrets
          readOnly: true
      volumes:
        - name: secrets
          projected:
            sources:
              - secret:
                  name: druid-gcloud-bucket-key
              - secret:
                  name: cloud-sql
      securityContext:
        fsGroup: 1000
        runAsGroup: 1000
        runAsUser: 1000
      hpAutoscaler:
        maxReplicas: 10
        minReplicas: 1
        scaleTargetRef:
          apiVersion: apps/v1
          kind: StatefulSet
          name: druid-cluster-overlords
        metrics:
          - type: Resource
            resource:
              name: cpu
              targetAverageUtilization: 60
          - type: Resource
            resource:
              name: memory
              targetAverageUtilization: 60

    routers:
      livenessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8888
      readinessProbe:
        initialDelaySeconds: 30
        httpGet:
          path: /status/health
          port: 8888
      druid.port: 8888
      extra.jvm.options: |-
        -Xmx512m
        -Xms512m
      nodeType: router
      podDisruptionBudgetSpec:
        maxUnavailable: 1
      nodeConfigMountPath: /opt/druid/conf/druid/cluster/query/router
      replicas: 1
      runtime.properties: |
        druid.service=druid/router
        druid.plaintextPort=8888
        # HTTP proxy
        druid.router.http.numConnections=50
        druid.router.http.readTimeout=PT5M
        druid.router.http.numMaxThreads=100
        druid.server.http.numThreads=100
        # Service discovery
        druid.router.defaultBrokerServiceName=druid/broker
        druid.router.coordinatorServiceName=druid/coordinator
        # Management proxy to coordinator / overlord: required for unified web console.
        druid.router.managementProxy.enabled=true
      log4j.config: |-
        <Configuration status="WARN">
          <Appenders>
            <Console name="logline" target="SYSTEM_OUT">
              <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
            <Console name="msgonly" target="SYSTEM_OUT">
              <PatternLayout pattern="%m%n"/>
            </Console>
          </Appenders>
          <Loggers>
            <Root level="info">
              <AppenderRef ref="logline"/>
            </Root>
            <Logger name="org.apache.druid.java.util.emitter.core.LoggingEmitter" additivity="false" level="debug">
              <AppenderRef ref="msgonly"/>
            </Logger>
          </Loggers>
        </Configuration>
      services:
        - metadata:
            name: router-%s-service
          spec:
            clusterIP: None
            ports:
              - name: tcp-service-port
                port: 8888
                targetPort: 8888
            type: ClusterIP
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: ssd
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
        - mountPath: /secrets
          name: secrets
          readOnly: true
      volumes:
        - name: secrets
          projected:
            sources:
              - secret:
                  name: druid-gcloud-bucket-key
              - secret:
                  name: cloud-sql
      securityContext:
        fsGroup: 1000
        runAsGroup: 1000
        runAsUser: 1000

EmiPhil commented 4 years ago

@AdheipSingh

From what I can see in the logs, the following config (slightly modified from tiny-cluster) does work.

apiVersion: "druid.apache.org/v1alpha1"
kind: "Druid"
metadata:
  name: tiny-cluster
spec:
  image: apache/druid:0.17.0
  startScript: /druid.sh
  securityContext:
    fsGroup: 1000
    runAsUser: 1000
    runAsGroup: 1000
  services:
    - spec:
        type: ClusterIP
        clusterIP: None
  commonConfigMountPath: "/opt/druid/conf/druid/cluster/_common"
  jvm.options: |-
    -server
    -XX:MaxDirectMemorySize=10240g
    -Duser.timezone=UTC
    -Dfile.encoding=UTF-8
    -Dlog4j.debug
    -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  log4j.config: |-
    <?xml version="1.0" encoding="UTF-8" ?>
    <Configuration status="WARN">
        <Appenders>
            <Console name="Console" target="SYSTEM_OUT">
                <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
        </Appenders>
        <Loggers>
            <Root level="info">
                <AppenderRef ref="Console"/>
            </Root>
        </Loggers>
    </Configuration>
  common.runtime.properties: |
    # Zookeeper
    druid.zk.service.host=zk-zookeeper-0.zk-zookeeper-headless.default.svc.cluster.local,zk-zookeeper-1.zk-zookeeper-headless.default.svc.cluster.local,zk-zookeeper-2.zk-zookeeper-headless.default.svc.cluster.local
    druid.zk.paths.base=/druid-tiny
    druid.zk.service.compress=false
    # Metadata Store
    druid.metadata.storage.type=derby
    druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/var/druid/metadata.db;create=true
    druid.metadata.storage.connector.host=localhost
    druid.metadata.storage.connector.port=1527
    druid.metadata.storage.connector.createTables=true
    # Deep Storage
    druid.storage.type=local
    druid.storage.storageDirectory=/druid/data/deepstorage
    #
    # Extensions
    #
    druid.extensions.loadList=["druid-s3-extensions"]
    #
    # Service discovery
    #
    druid.selectors.indexing.serviceName=druid/overlord
    druid.selectors.coordinator.serviceName=druid/coordinator
  nodes:
    brokers:
      nodeType: "broker"
      druid.port: 8088
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/query/broker"
      replicas: 1
      runtime.properties: |
        druid.service=druid/broker
        # HTTP server threads
        druid.broker.http.numConnections=5
        druid.server.http.numThreads=10
        # Processing threads and buffers
        druid.processing.buffer.sizeBytes=1
        druid.processing.numMergeBuffers=1
        druid.processing.numThreads=1
        druid.sql.enable=false
      extra.jvm.options: |-
        -Xmx1G
        -Xms1G
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
      volumes:
        - name: data-volume
          emptyDir: {}
      resources:
        requests:
          memory: "2G"
          cpu: "2"
        limits:
          memory: "2G"
          cpu: "2"

    coordinators:
      nodeType: "coordinator"
      druid.port: 8088
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/master/coordinator-overlord"
      replicas: 1
      runtime.properties: |
        druid.service=druid/coordinator
        # HTTP server threads
        druid.coordinator.startDelay=PT30S
        druid.coordinator.period=PT30S
        # Configure this coordinator to also run as Overlord
        druid.coordinator.asOverlord.enabled=true
        druid.coordinator.asOverlord.overlordService=druid/overlord
        druid.indexer.queue.startDelay=PT30S
        druid.indexer.runner.type=local
      extra.jvm.options: |-
        -Xmx1G
        -Xms1G
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
      volumes:
        - name: data-volume
          emptyDir: {}
      resources:
        requests:
          memory: "2G"
          cpu: "2"
        limits:
          memory: "2G"
          cpu: "2"

    historicals:
      nodeType: "historical"
      druid.port: 8088
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/data/historical"
      replicas: 1
      runtime.properties: |
        druid.service=druid/historical
        druid.server.http.numThreads=5
        druid.processing.buffer.sizeBytes=1
        druid.processing.numMergeBuffers=1
        druid.processing.numThreads=1
        # Segment storage
        druid.segmentCache.locations=[{\"path\":\"/druid/data/segments\",\"maxSize\":10737418240}]
        druid.server.maxSize=10737418240
      extra.jvm.options: |-
        -Xmx1G
        -Xms1G
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
      volumes:
        - name: data-volume
          emptyDir: {}
      resources:
        requests:
          memory: "2G"
          cpu: "2"
        limits:
          memory: "2G"
          cpu: "2"

EmiPhil commented 4 years ago

I got it to work by setting nodeConfigMountPath: /opt/druid/conf/druid/cluster/master/coordinator-overlord on both the coordinator and the overlords.

I think this works because for whatever reason druid is always choosing to use /opt/druid/conf/druid/cluster/master/coordinator-overlord/runtime.properties, even in the presence of /opt/druid/conf/druid/cluster/master/coordinator/runtime.properties.

As far as I can tell, there doesn't seem to be a problem with having the configurations in that folder. The unified web console correctly shows the overlord and coordinator on separate pod hosts, so all good?
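
For anyone else hitting this, one way to double-check which file the service actually loaded (the pod name is illustrative; note the start script copies configs under /tmp/conf):

# Inspect the copy druid.sh actually reads, not the mount under /opt/druid/conf
kubectl exec druid-cluster-coordinators-0 -- cat /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
# With druid.startup.logging.logProperties=true, the effective properties are also echoed at startup
kubectl logs druid-cluster-coordinators-0 | grep druid.service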

himanshug commented 4 years ago

@EmiPhil thanks for documenting the solution. Yes, the start script in Druid's Docker image does have the behavior you described: it looks for "coordinator-overlord" on both the coordinator and overlord pods.