apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Misconfigured coordinator service in helm chart #11992

Closed: alborotogarcia closed this issue 2 years ago

alborotogarcia commented 2 years ago

Unable to connect router to coordinator service.

After templating the official Helm chart a bit (mainly to get rid of Bitnami's amd64-only ZooKeeper and PostgreSQL) and deploying it, the router apparently can't connect to any other service.

[image]

However, the coordinator seems to be connected to the other services. [image] [image]

The Druid doctor shows the following: [image]

All my pods nevertheless seem to be running smoothly: [image]

druid-zookeeper-headless       ClusterIP   None            <none>        2181/TCP,3888/TCP,2888/TCP   3h4m
druid-zookeeper                ClusterIP   10.43.127.52    <none>        2181/TCP                     3h4m
druid-broker                   ClusterIP   10.43.35.55     <none>        8082/TCP                     3h4m
druid-coordinator              ClusterIP   10.43.38.218    <none>        8081/TCP                     3h4m
druid-historical               ClusterIP   10.43.173.83    <none>        8083/TCP                     3h4m
druid-middle-manager           ClusterIP   10.43.146.5     <none>        8091/TCP                     3h4m
druid-router                   ClusterIP   10.43.135.124   <none>        8888/TCP                     3h4m

Affected Version

arm64 build 0.22.0 @98957be0443b669cc7464886ef9ee21d3d21f762

Description


Coordinator trace

druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:42,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.EmitClusterStatsAndMetrics - Load Queues:
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:42,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.EmitClusterStatsAndMetrics - Server[:8083, historical, _default_tier] has 0 left to load, 0 left to drop, 0 bytes queued, 0 bytes served.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:47,842 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.LogUsedSegments - Found [0] used segments.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:47,843 ERROR [Coordinator-Exec--0] org.apache.druid.server.coordinator.DruidCoordinator - Caught exception, ignoring so that schedule keeps going.: {class=org.apache.druid.server.coordinator.DruidCoordinator, exceptionType=class java.lang.RuntimeException, exceptionMessage=org.apache.druid.java.util.common.IOE: No known server}
druid-coordinator-f4cd776f5-xj6g9 druid java.lang.RuntimeException: org.apache.druid.java.util.common.IOE: No known server
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.client.indexing.HttpIndexingServiceClient.getTasks(HttpIndexingServiceClient.java:266) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.client.indexing.HttpIndexingServiceClient.getActiveTasks(HttpIndexingServiceClient.java:231) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.server.coordinator.KillStalePendingSegments.run(KillStalePendingSegments.java:55) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.server.coordinator.DruidCoordinator$DutiesRunnable.run(DruidCoordinator.java:910) [druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:720) [druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:713) [druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$4.run(ScheduledExecutors.java:163) [druid-core-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
druid-coordinator-f4cd776f5-xj6g9 druid at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_312]
druid-coordinator-f4cd776f5-xj6g9 druid at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_312]
druid-coordinator-f4cd776f5-xj6g9 druid at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_312]
druid-coordinator-f4cd776f5-xj6g9 druid at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
druid-coordinator-f4cd776f5-xj6g9 druid at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
druid-coordinator-f4cd776f5-xj6g9 druid at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]
druid-coordinator-f4cd776f5-xj6g9 druid Caused by: org.apache.druid.java.util.common.IOE: No known server
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.discovery.DruidLeaderClient.getCurrentKnownLeader(DruidLeaderClient.java:267) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.discovery.DruidLeaderClient.makeRequest(DruidLeaderClient.java:122) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid at org.apache.druid.client.indexing.HttpIndexingServiceClient.getTasks(HttpIndexingServiceClient.java:251) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-coordinator-f4cd776f5-xj6g9 druid ... 13 more
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:47,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.LogUsedSegments - Found [0] used segments.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:47,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.ReplicationThrottler - [_default_tier]: Replicant create queue is empty.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:47,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.BalanceSegments - Metadata segments are not available. Cannot balance.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:47,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.EmitClusterStatsAndMetrics - Load Queues:
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:47,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.EmitClusterStatsAndMetrics - Server[:8083, historical, _default_tier] has 0 left to load, 0 left to drop, 0 bytes queued, 0 bytes served.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:52,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.LogUsedSegments - Found [0] used segments.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:52,962 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.ReplicationThrottler - [_default_tier]: Replicant create queue is empty.
druid-coordinator-f4cd776f5-xj6g9 druid 2021-11-26T04:24:52,963 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.duty.BalanceSegments - Metadata segments are not available. Cannot balance.

Broker trace

druid-broker-579ffb6b97-jtc5b druid 2021-11-26T01:24:36,788 INFO [ServerInventoryView-0] org.apache.druid.client.BatchServerInventoryView - Server Disappeared[DruidServerMetadata{name=':8083', hostAndPort=':8083', hostAndTlsPort='null', maxSize=300000000000, tier='_default_tier', type=historical, priority=0}]
druid-broker-579ffb6b97-jtc5b druid 2021-11-26T01:24:36,788 INFO [ServerInventoryView-0] org.apache.druid.client.BatchServerInventoryView - Server Disappeared[DruidServerMetadata{name=':8083', hostAndPort=':8083', hostAndTlsPort='null', maxSize=300000000000, tier='_default_tier', type=historical, priority=0}]
druid-broker-579ffb6b97-jtc5b druid 2021-11-26T01:24:36,832 ERROR [NodeRoleWatcher[COORDINATOR]] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher - Unknown error in node watcher of role[coordinator].
druid-broker-579ffb6b97-jtc5b druid java.lang.RuntimeException: java.net.URISyntaxException: Expected hostname at index 7: http://:8081
druid-broker-579ffb6b97-jtc5b druid at org.apache.druid.server.DruidNode.getUriToUse(DruidNode.java:292) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-broker-579ffb6b97-jtc5b druid at org.apache.druid.discovery.BaseNodeRoleWatcher.childAdded(BaseNodeRoleWatcher.java:131) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-broker-579ffb6b97-jtc5b druid at org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher.childAdded(CuratorDruidNodeDiscoveryProvider.java:271) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-broker-579ffb6b97-jtc5b druid at org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher.handleChildEvent(CuratorDruidNodeDiscoveryProvider.java:237) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-broker-579ffb6b97-jtc5b druid at org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher.lambda$new$0(CuratorDruidNodeDiscoveryProvider.java:205) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:538) [curator-recipes-4.3.0.jar:4.3.0]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:532) [curator-recipes-4.3.0.jar:4.3.0]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [curator-framework-4.3.0.jar:4.3.0]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [curator-client-4.3.0.jar:?]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [curator-framework-4.3.0.jar:4.3.0]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:530) [curator-recipes-4.3.0.jar:4.3.0]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-4.3.0.jar:4.3.0]
druid-broker-579ffb6b97-jtc5b druid at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:808) [curator-recipes-4.3.0.jar:4.3.0]
druid-broker-579ffb6b97-jtc5b druid at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid Caused by: java.net.URISyntaxException: Expected hostname at index 7: http://:8081
druid-broker-579ffb6b97-jtc5b druid at java.net.URI$Parser.fail(URI.java:2847) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.net.URI$Parser.failExpecting(URI.java:2853) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.net.URI$Parser.parseHostname(URI.java:3389) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.net.URI$Parser.parseServer(URI.java:3235) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.net.URI$Parser.parseAuthority(URI.java:3154) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.net.URI$Parser.parseHierarchical(URI.java:3096) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.net.URI$Parser.parse(URI.java:3052) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at java.net.URI.<init>(URI.java:673) ~[?:1.8.0_312]
druid-broker-579ffb6b97-jtc5b druid at org.apache.druid.server.DruidNode.getUriToUse(DruidNode.java:289) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-broker-579ffb6b97-jtc5b druid ... 19 more
druid-broker-579ffb6b97-jtc5b druid 2021-11-26T01:24:36,838 INFO [ServerInventoryView-0] org.apache.druid.client.BatchServerInventoryView - New Server[DruidServerMetadata{name=':8083', hostAndPort=':8083', hostAndTlsPort='null', maxSize=300000000000, tier='_default_tier', type=historical, priority=0}]
druid-broker-579ffb6b97-jtc5b druid 2021-11-26T01:24:36,855 INFO [ServerInventoryView-0] org.apache.druid.client.BatchServerInventoryView - New Server[DruidServerMetadata{name=':8083', hostAndPort=':8083', hostAndTlsPort='null', maxSize=300000000000, tier='_default_tier', type=historical, priority=0}]

Historical trace

druid-historical-0 druid 2021-11-26T01:24:36,660 INFO [Announcer-0] org.apache.druid.curator.announcement.Announcer - Node[/druid/announcements/:8083] dropped, reinstating.
druid-historical-0 druid 2021-11-26T01:24:36,799 ERROR [NodeRoleWatcher[COORDINATOR]] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher - Unknown error in node watcher of role[coordinator].
druid-historical-0 druid java.lang.RuntimeException: java.net.URISyntaxException: Expected hostname at index 7: http://:8081
druid-historical-0 druid at org.apache.druid.server.DruidNode.getUriToUse(DruidNode.java:292) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-historical-0 druid at org.apache.druid.discovery.BaseNodeRoleWatcher.childAdded(BaseNodeRoleWatcher.java:131) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-historical-0 druid at org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher.childAdded(CuratorDruidNodeDiscoveryProvider.java:271) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-historical-0 druid at org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher.handleChildEvent(CuratorDruidNodeDiscoveryProvider.java:237) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-historical-0 druid at org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher.lambda$new$0(CuratorDruidNodeDiscoveryProvider.java:205) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-historical-0 druid at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:538) [curator-recipes-4.3.0.jar:4.3.0]
druid-historical-0 druid at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:532) [curator-recipes-4.3.0.jar:4.3.0]
druid-historical-0 druid at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [curator-framework-4.3.0.jar:4.3.0]
druid-historical-0 druid at org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) [curator-client-4.3.0.jar:?]
druid-historical-0 druid at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) [curator-framework-4.3.0.jar:4.3.0]
druid-historical-0 druid at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:530) [curator-recipes-4.3.0.jar:4.3.0]
druid-historical-0 druid at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-4.3.0.jar:4.3.0]
druid-historical-0 druid at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:808) [curator-recipes-4.3.0.jar:4.3.0]
druid-historical-0 druid at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
druid-historical-0 druid at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_312]
druid-historical-0 druid at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
druid-historical-0 druid at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_312]
druid-historical-0 druid at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
druid-historical-0 druid at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
druid-historical-0 druid at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]
druid-historical-0 druid Caused by: java.net.URISyntaxException: Expected hostname at index 7: http://:8081
druid-historical-0 druid at java.net.URI$Parser.fail(URI.java:2847) ~[?:1.8.0_312]
druid-historical-0 druid at java.net.URI$Parser.failExpecting(URI.java:2853) ~[?:1.8.0_312]
druid-historical-0 druid at java.net.URI$Parser.parseHostname(URI.java:3389) ~[?:1.8.0_312]
druid-historical-0 druid at java.net.URI$Parser.parseServer(URI.java:3235) ~[?:1.8.0_312]
druid-historical-0 druid at java.net.URI$Parser.parseAuthority(URI.java:3154) ~[?:1.8.0_312]
druid-historical-0 druid at java.net.URI$Parser.parseHierarchical(URI.java:3096) ~[?:1.8.0_312]
druid-historical-0 druid at java.net.URI$Parser.parse(URI.java:3052) ~[?:1.8.0_312]
druid-historical-0 druid at java.net.URI.<init>(URI.java:673) ~[?:1.8.0_312]
druid-historical-0 druid at org.apache.druid.server.DruidNode.getUriToUse(DruidNode.java:289) ~[druid-server-0.23.0-SNAPSHOT.jar:0.23.0-SNAPSHOT]
druid-historical-0 druid ... 19 more
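
Both the broker and historical traces fail on the same string, `http://:8081`, which is a URI with an empty host component. The server entries logged as `:8083` have the same shape. The following is a minimal sketch of that parser failure using only the JDK's `java.net.URI`. It is not Druid's code: the class `EmptyHostUri`, the method `describe`, and the IP address are hypothetical, and the choice of the 7-argument `URI` constructor is an assumption made for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Druid: builds an http URI from a host and
// port and reports whether the JDK parser accepts it.
class EmptyHostUri {
    static String describe(String host, int port) {
        try {
            // This constructor requires a server-based authority, so an empty
            // host is rejected instead of being silently accepted.
            URI uri = new URI("http", null, host, port, null, null, null);
            return "ok: " + uri;
        } catch (URISyntaxException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical pod IP, standing in for a correctly announced node.
        System.out.println(describe("10.42.0.15", 8081));
        // Empty host, as announced by the nodes in the traces.
        System.out.println(describe("", 8081));
    }
}
```

The second call should report the same `Expected hostname at index 7: http://:8081` message seen in the traces, which suggests the nodes are announcing themselves with an empty host rather than the services being unreachable.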

asdf2014 commented 2 years ago

@alborotogarcia Please provide the full values.yaml file if you can, thanks.

alborotogarcia commented 2 years ago

Thanks for your kind reply, @asdf2014!

As I said, I use an external PostgreSQL cluster and try to avoid Bitnami images, as they currently only run on amd64.

Please don't hesitate to ask for any additional information I can provide.

---
# Source: druid/charts/zookeeper/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: druid-zookeeper
  labels:
    app: zookeeper
    chart: zookeeper-2.1.4
    release: druid
    heritage: Helm
    component: server
spec:
  selector:
    matchLabels:
      app: zookeeper
      release: druid
      component: server
  maxUnavailable: 1
---
# Source: druid/charts/zookeeper/templates/config-script.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: druid-zookeeper
  labels:
    app: zookeeper
    chart: zookeeper-2.1.4
    release: druid
    heritage: Helm
    component: server
data:
    ok: |
      #!/bin/sh
      zkServer.sh status

    ready: |
      #!/bin/sh
      echo ruok | nc 127.0.0.1 ${1:-2181}

    run: |
      #!/bin/bash

      set -a
      ROOT=$(echo /apache-zookeeper-*)

      ZK_USER=${ZK_USER:-"zookeeper"}
      ZK_LOG_LEVEL=${ZK_LOG_LEVEL:-"INFO"}
      ZK_DATA_DIR=${ZK_DATA_DIR:-"/data"}
      ZK_DATA_LOG_DIR=${ZK_DATA_LOG_DIR:-"/data/log"}
      ZK_CONF_DIR=${ZK_CONF_DIR:-"/conf"}
      ZK_CLIENT_PORT=${ZK_CLIENT_PORT:-2181}
      ZK_SERVER_PORT=${ZK_SERVER_PORT:-2888}
      ZK_ELECTION_PORT=${ZK_ELECTION_PORT:-3888}
      ZK_TICK_TIME=${ZK_TICK_TIME:-2000}
      ZK_INIT_LIMIT=${ZK_INIT_LIMIT:-10}
      ZK_SYNC_LIMIT=${ZK_SYNC_LIMIT:-5}
      ZK_HEAP_SIZE=${ZK_HEAP_SIZE:-2G}
      ZK_MAX_CLIENT_CNXNS=${ZK_MAX_CLIENT_CNXNS:-60}
      ZK_MIN_SESSION_TIMEOUT=${ZK_MIN_SESSION_TIMEOUT:- $((ZK_TICK_TIME*2))}
      ZK_MAX_SESSION_TIMEOUT=${ZK_MAX_SESSION_TIMEOUT:- $((ZK_TICK_TIME*20))}
      ZK_SNAP_RETAIN_COUNT=${ZK_SNAP_RETAIN_COUNT:-3}
      ZK_PURGE_INTERVAL=${ZK_PURGE_INTERVAL:-0}
      ID_FILE="$ZK_DATA_DIR/myid"
      ZK_CONFIG_FILE="$ZK_CONF_DIR/zoo.cfg"
      LOG4J_PROPERTIES="$ZK_CONF_DIR/log4j.properties"
      HOST=$(hostname)
      DOMAIN=`hostname -d`
      JVMFLAGS="-Xmx$ZK_HEAP_SIZE -Xms$ZK_HEAP_SIZE"

      APPJAR=$(echo $ROOT/*jar)
      CLASSPATH="${ROOT}/lib/*:${APPJAR}:${ZK_CONF_DIR}:"

      if [[ $HOST =~ (.*)-([0-9]+)$ ]]; then
          NAME=${BASH_REMATCH[1]}
          ORD=${BASH_REMATCH[2]}
          MY_ID=$((ORD+1))
      else
          echo "Failed to extract ordinal from hostname $HOST"
          exit 1
      fi

      mkdir -p $ZK_DATA_DIR
      mkdir -p $ZK_DATA_LOG_DIR
      echo $MY_ID >> $ID_FILE

      echo "clientPort=$ZK_CLIENT_PORT" >> $ZK_CONFIG_FILE
      echo "dataDir=$ZK_DATA_DIR" >> $ZK_CONFIG_FILE
      echo "dataLogDir=$ZK_DATA_LOG_DIR" >> $ZK_CONFIG_FILE
      echo "tickTime=$ZK_TICK_TIME" >> $ZK_CONFIG_FILE
      echo "initLimit=$ZK_INIT_LIMIT" >> $ZK_CONFIG_FILE
      echo "syncLimit=$ZK_SYNC_LIMIT" >> $ZK_CONFIG_FILE
      echo "maxClientCnxns=$ZK_MAX_CLIENT_CNXNS" >> $ZK_CONFIG_FILE
      echo "minSessionTimeout=$ZK_MIN_SESSION_TIMEOUT" >> $ZK_CONFIG_FILE
      echo "maxSessionTimeout=$ZK_MAX_SESSION_TIMEOUT" >> $ZK_CONFIG_FILE
      echo "autopurge.snapRetainCount=$ZK_SNAP_RETAIN_COUNT" >> $ZK_CONFIG_FILE
      echo "autopurge.purgeInterval=$ZK_PURGE_INTERVAL" >> $ZK_CONFIG_FILE
      echo "4lw.commands.whitelist=*" >> $ZK_CONFIG_FILE

      for (( i=1; i<=$ZK_REPLICAS; i++ ))
      do
          echo "server.$i=$NAME-$((i-1)).$DOMAIN:$ZK_SERVER_PORT:$ZK_ELECTION_PORT" >> $ZK_CONFIG_FILE
      done

      rm -f $LOG4J_PROPERTIES

      echo "zookeeper.root.logger=$ZK_LOG_LEVEL, CONSOLE" >> $LOG4J_PROPERTIES
      echo "zookeeper.console.threshold=$ZK_LOG_LEVEL" >> $LOG4J_PROPERTIES
      echo "zookeeper.log.threshold=$ZK_LOG_LEVEL" >> $LOG4J_PROPERTIES
      echo "zookeeper.log.dir=$ZK_DATA_LOG_DIR" >> $LOG4J_PROPERTIES
      echo "zookeeper.log.file=zookeeper.log" >> $LOG4J_PROPERTIES
      echo "zookeeper.log.maxfilesize=256MB" >> $LOG4J_PROPERTIES
      echo "zookeeper.log.maxbackupindex=10" >> $LOG4J_PROPERTIES
      echo "zookeeper.tracelog.dir=$ZK_DATA_LOG_DIR" >> $LOG4J_PROPERTIES
      echo "zookeeper.tracelog.file=zookeeper_trace.log" >> $LOG4J_PROPERTIES
      echo "log4j.rootLogger=\${zookeeper.root.logger}" >> $LOG4J_PROPERTIES
      echo "log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender" >> $LOG4J_PROPERTIES
      echo "log4j.appender.CONSOLE.Threshold=\${zookeeper.console.threshold}" >> $LOG4J_PROPERTIES
      echo "log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout" >> $LOG4J_PROPERTIES
      echo "log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n" >> $LOG4J_PROPERTIES

      if [ -n "$JMXDISABLE" ]
      then
          MAIN=org.apache.zookeeper.server.quorum.QuorumPeerMain
      else
          MAIN="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=$JMXPORT -Dcom.sun.management.jmxremote.authenticate=$JMXAUTH -Dcom.sun.management.jmxremote.ssl=$JMXSSL -Dzookeeper.jmx.log4j.disable=$JMXLOG4J org.apache.zookeeper.server.quorum.QuorumPeerMain"
      fi

      set -x
      exec java -cp "$CLASSPATH" $JVMFLAGS $MAIN $ZK_CONFIG_FILE
---
# Source: druid/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: druid
  labels:
    app: druid
    chart: druid-0.3.0
    release: druid
    heritage: Helm
data:
  DRUID_USE_CONTAINER_IP: "true"
  druid_emitter: noop
  druid_emitter_http_recipientBaseUrl: http://druid_exporter_url:druid_exporter_port/druid
  druid_emitter_logging_logLevel: debug
  druid_extensions_loadList: '["druid-histogram", "druid-datasketches", "druid-lookups-cached-global",
    "postgresql-metadata-storage"]'
  druid_indexer_logs_directory: /opt/data/indexing-logs
  druid_indexer_logs_type: file
  druid_metadata_storage_connector_connectURI: jdbc:postgresql://acid-minimal-cluster.storage:5432/druid
  druid_metadata_storage_connector_password: druid
  druid_metadata_storage_connector_user: druid
  druid_metadata_storage_type: postgresql
  druid_storage_type: local
  druid_zk_service_host: druid-zookeeper-headless:2181
  druid_metadata_storage_type: postgresql
  druid.selectors.coordinator.serviceName: druid-coordinator
  druid.selectors.indexing.serviceName: druid-overlord

---
# Source: druid/charts/zookeeper/templates/service-headless.yaml
apiVersion: v1
kind: Service
metadata:
  name: druid-zookeeper-headless
  labels:
    app: zookeeper
    chart: zookeeper-2.1.4
    release: druid
    heritage: Helm
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
    - name: client
      port: 2181
      targetPort: client
      protocol: TCP
    - name: election
      port: 3888
      targetPort: election
      protocol: TCP
    - name: server
      port: 2888
      targetPort: server
      protocol: TCP
  selector:
    app: zookeeper
    release: druid
---
# Source: druid/charts/zookeeper/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: druid-zookeeper
  labels:
    app: zookeeper
    chart: zookeeper-2.1.4
    release: druid
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - name: client
      port: 2181
      protocol: TCP
      targetPort: client
  selector:
    app: zookeeper
    release: druid
---
# Source: druid/templates/broker/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: druid-broker
  labels:
    app: druid
    chart: druid-0.3.0
    component: broker
    release: druid
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8082
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: druid
    release: druid
    component: broker
---
# Source: druid/templates/coordinator/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: druid-coordinator
  labels:
    app: druid
    chart: druid-0.3.0
    component: coordinator
    release: druid
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8081
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: druid
    release: druid
    component: coordinator
---
# Source: druid/templates/historical/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: druid-historical
  labels:
    app: druid
    chart: druid-0.3.0
    component: historical
    release: druid
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8083
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: druid
    release: druid
    component: historical
---
# Source: druid/templates/middleManager/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: druid-middle-manager
  labels:
    app: druid
    chart: druid-0.3.0
    component: middle-manager
    release: druid
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8091
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: druid
    release: druid
    component: middle-manager
---
# Source: druid/templates/router/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: druid-router
  labels:
    app: druid
    chart: druid-0.3.0
    component: router
    release: druid
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8888
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: druid
    release: druid
    component: router
---
# Source: druid/templates/broker/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: druid-broker
  labels:
    app: druid
    chart: druid-0.3.0
    component: broker
    release: druid
    heritage: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: druid
      release: druid
      component: broker
  template:
    metadata:
      labels:
        app: druid
        release: druid
        component: broker
    spec:
      containers:
        - name: druid
          image: "alvarogg777/druid:0.22.0"
          imagePullPolicy: IfNotPresent
          args: [ "broker" ]
          env:
          - name: DRUID_MAXDIRECTMEMORYSIZE
            value: "400m"
          - name: DRUID_XMS
            value: "512m"
          - name: DRUID_XMX
            value: "512m"
          - name: druid_processing_buffer_sizeBytes
            value: "50000000"
          - name: druid_processing_numMergeBuffers
            value: "2"
          - name: druid_processing_numThreads
            value: "1"
          envFrom:
            - configMapRef:
                name: druid
          ports:
            - name: http
              containerPort: 8082
              protocol: TCP
          livenessProbe:
            initialDelaySeconds: 60
            httpGet:
              path: /status/health
              port: 8082
          readinessProbe:
            initialDelaySeconds: 60
            httpGet:
              path: /status/health
              port: 8082
          resources:
            {}
---
# Source: druid/templates/coordinator/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: druid-coordinator
  labels:
    app: druid
    chart: druid-0.3.0
    component: coordinator
    release: druid
    heritage: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: druid
      release: druid
      component: coordinator
  template:
    metadata:
      labels:
        app: druid
        release: druid
        component: coordinator
    spec:
      containers:
        - name: druid
          image: "alvarogg777/druid:0.22.0"
          imagePullPolicy: IfNotPresent
          args: [ "coordinator" ]
          env:
          - name: DRUID_XMS
            value: "256m"
          - name: DRUID_XMX
            value: "256m"
          envFrom:
            - configMapRef:
                name: druid
          ports:
            - name: http
              containerPort: 8081
              protocol: TCP
          livenessProbe:
            initialDelaySeconds: 60
            httpGet:
              path: /status/health
              port: 8081
          readinessProbe:
            initialDelaySeconds: 60
            httpGet:
              path: /status/health
              port: 8081
          resources:
            {}
          volumeMounts:
      volumes:
---
# Source: druid/templates/router/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: druid-router
  labels:
    app: druid
    chart: druid-0.3.0
    component: router
    release: druid
    heritage: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: druid
      release: druid
      component: router
  template:
    metadata:
      labels:
        app: druid
        release: druid
        component: router
    spec:
      containers:
        - name: druid
          image: "alvarogg777/druid:0.22.0"
          imagePullPolicy: IfNotPresent
          args: [ "router" ]
          env:
          - name: DRUID_MAXDIRECTMEMORYSIZE
            value: "128m"
          - name: DRUID_XMS
            value: "128m"
          - name: DRUID_XMX
            value: "128m"
          envFrom:
            - configMapRef:
                name: druid 
          ports:
            - name: http
              containerPort: 8888
              protocol: TCP
          livenessProbe:
            initialDelaySeconds: 60
            httpGet:
              path: /status/health
              port: 8888
          readinessProbe:
            initialDelaySeconds: 60
            httpGet:
              path: /status/health
              port: 8888
          resources:
            {}
---
# Source: druid/charts/zookeeper/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: druid-zookeeper
  labels:
    app: zookeeper
    chart: zookeeper-2.1.4
    release: druid
    heritage: Helm
    component: server
spec:
  serviceName: druid-zookeeper-headless
  replicas: 3
  selector:
    matchLabels:
      app: zookeeper
      release: druid
      component: server
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: zookeeper
        release: druid
        component: server
    spec:
      terminationGracePeriodSeconds: 1800
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      containers:

        - name: zookeeper
          image: "zookeeper:3.5.9"
          imagePullPolicy: IfNotPresent
          command: 
             - "/bin/bash"
             - "-xec"
             - "/config-scripts/run"
          ports:
            - name: client
              containerPort: 2181
              protocol: TCP
            - name: election
              containerPort: 3888
              protocol: TCP
            - name: server
              containerPort: 2888
              protocol: TCP
          livenessProbe:
            exec:
              command:
                - sh
                - /config-scripts/ok
            initialDelaySeconds: 20
            periodSeconds: 30
            timeoutSeconds: 5
            failureThreshold: 2
            successThreshold: 1
          readinessProbe:
            exec:
              command:
                - sh
                - /config-scripts/ready
            initialDelaySeconds: 20
            periodSeconds: 30
            timeoutSeconds: 5
            failureThreshold: 2
            successThreshold: 1
          env:
            - name: ZK_REPLICAS
              value: "3"
            - name: JMXAUTH
              value: "false"
            - name: JMXDISABLE
              value: "false"
            - name: JMXPORT
              value: "1099"
            - name: JMXSSL
              value: "false"
            - name: ZK_HEAP_SIZE
              value: "512M"
            - name: ZK_SYNC_LIMIT
              value: "10"
            - name: ZK_TICK_TIME
              value: "2000"
            - name: ZOO_AUTOPURGE_PURGEINTERVAL
              value: "0"
            - name: ZOO_AUTOPURGE_SNAPRETAINCOUNT
              value: "3"
            - name: ZOO_INIT_LIMIT
              value: "5"
            - name: ZOO_MAX_CLIENT_CNXNS
              value: "60"
            - name: ZOO_PORT
              value: "2181"
            - name: ZOO_STANDALONE_ENABLED
              value: "false"
            - name: ZOO_TICK_TIME
              value: "2000"
          resources:
            {}
          volumeMounts:
            - name: data
              mountPath: /data
            - name: config
              mountPath: /config-scripts
      volumes:
        - name: config
          configMap:
            name: druid-zookeeper
            defaultMode: 0555
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "5Gi"
        storageClassName: "longhorn"
---
# Source: druid/templates/historical/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: druid
    chart: druid-0.3.0
    component: historical
    heritage: Helm
    release: druid
  name: druid-historical
spec:
  serviceName: druid-historical
  replicas: 1
  selector:
    matchLabels:
      app: druid
      release: druid
      component: historical
  template:
    metadata:
      labels:
        app: druid
        component: historical
        release: druid
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: "druid"
                  release: "druid"
                  component: "historical"
      securityContext:
        fsGroup: 1000
      containers:
      - name: druid
        args: [ "historical" ]
        env:
        - name: DRUID_MAXDIRECTMEMORYSIZE
          value: "400m"
        - name: DRUID_XMS
          value: "512m"
        - name: DRUID_XMX
          value: "512m"
        - name: druid_processing_buffer_sizeBytes
          value: "50000000"
        - name: druid_processing_numMergeBuffers
          value: "2"
        - name: druid_processing_numThreads
          value: "1"
        envFrom:
          - configMapRef:
              name: druid
        resources:
            {}
        livenessProbe:
          initialDelaySeconds: 60
          httpGet:
            path: /status/health
            port: 8083
        readinessProbe:
          initialDelaySeconds: 60
          httpGet:
            path: /status/health
            port: 8083
        image: "alvarogg777/druid:0.22.0"
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 8083
          name: http
        volumeMounts:
        - mountPath: /opt/druid/var/druid/
          name: data
      volumes:
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - "ReadWriteOnce"
      storageClassName: "longhorn"
      resources:
        requests:
          storage: "4Gi"
---
# Source: druid/templates/middleManager/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: druid
    chart: druid-0.3.0
    component: middle-manager
    heritage: Helm
    release: druid
  name: druid-middle-manager
spec:
  serviceName: druid-middle-manager
  replicas: 1
  selector:
    matchLabels:
      app: druid
      release: druid
      component: middle-manager
  template:
    metadata:
      labels:
        app: druid
        component: middle-manager
        release: druid
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: "druid"
                  release: "druid"
                  component: "middle-manager"
      securityContext:
        fsGroup: 1000
      containers:
      - name: druid
        args: [ "middleManager" ]
        env:
        - name: DRUID_XMS
          value: "64m"
        - name: DRUID_XMX
          value: "64m"
        - name: druid_indexer_fork_property_druid_processing_buffer_sizeBytes
          value: "25000000"
        - name: druid_indexer_runner_javaOptsArray
          value: "[\"-server\", \"-Xms256m\", \"-Xmx256m\", \"-XX:MaxDirectMemorySize=300m\", \"-Duser.timezone=UTC\", \"-Dfile.encoding=UTF-8\", \"-XX:+ExitOnOutOfMemoryError\", \"-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager\"]"
        envFrom:
          - configMapRef:
              name: druid 
        resources:
            {}
        livenessProbe:
          initialDelaySeconds: 60
          httpGet:
            path: /status/health
            port: 8091
        readinessProbe:
          initialDelaySeconds: 60
          httpGet:
            path: /status/health
            port: 8091
        image: "alvarogg777/druid:0.22.0"
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 8091
          name: http
        volumeMounts:
        - mountPath: /opt/druid/var/druid/
          name: data
      volumes:
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
        - "ReadWriteOnce"
      storageClassName: "longhorn"
      resources:
        requests:
          storage: "4Gi"

asdf2014 commented 2 years ago

@alborotogarcia You are welcome. May I ask whether you have modified the PostgreSQL address in the ConfigMap in your Kubernetes cluster?

alborotogarcia commented 2 years ago

@asdf2014 Yes, I have a Postgres instance running at jdbc:postgresql://acid-minimal-cluster.storage:5432/druid, and all the Druid tables were created, as I posted above.

henriquekops commented 2 years ago

I got into this same problem when creating a custom image for Druid v0.20.0.

You're using a custom Druid image as well, right?

...
containers:
        - name: druid
          image: "alvarogg777/druid:0.22.0"
...

Is your image Ubuntu-based? If it is, you should check that the dependencies of druid.sh are satisfied.

In my case, I had to apt install iproute2 to get the ip command; otherwise I would get this exact same error:

...
...    Caused by: java.net.URISyntaxException: Expected hostname at index 7: http://:8081
...
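
A quick way to check this inside a custom image is sketched below. It is only a rough check, assuming (as described above) that druid.sh relies on the ip command and that an empty host such as http://:8081 is the symptom when it is missing. The helper names have_cmd and check_ip are made up for illustration and are not part of Druid.

```shell
#!/bin/sh
# Hypothetical helper (not part of Druid): succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report whether the `ip` tool is available in this image.
check_ip() {
  if have_cmd ip; then
    echo "ip found at: $(command -v ip)"
  else
    # Package name taken from the comment above; other base images may differ.
    echo "ip not found; on Debian/Ubuntu-based images try: apt-get install -y iproute2"
  fi
}

check_ip
```

Running this in a shell inside the container (for example via kubectl exec or docker run with a shell entrypoint) shows whether the dependency is present before the Druid services start.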

alborotogarcia commented 2 years ago

You were absolutely right, @henriquekops, though I used an Oracle JDK container instead! I just worked it out by running the container, and it complained about the missing ip command.

Thanks @asdf2014 @henriquekops !