apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Deep storage on kubernetes #12087

Closed alborotogarcia closed 2 years ago

alborotogarcia commented 2 years ago

Coordinator always restarts when I set minio/hdfs for deep storage

Affected Version

v0.22.1

Description

I'm new to Druid; I see that deep storage is needed in order to persist segments.

As the docs say, you need to enable the "druid-s3-extensions" or "druid-hdfs-storage" extension in the load list, which gets set from a ConfigMap.

With HDFS as deep storage, core-site.xml and hdfs-site.xml are also needed, but the coordinator pod always gets restarted with no trace.
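
For context, this is roughly what the relevant deep storage settings end up looking like in runtime.properties once the chart writes them (abridged here; the full load list and values are in the coordinator trace further down):

# deep storage settings as written into runtime.properties (abridged)
druid.extensions.loadList=[..., "druid-s3-extensions", "druid-hdfs-storage"]
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://hadoop-hdfs-nn.hdfs:8020/druid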


I set core-site.xml and hdfs-site.xml in a ConfigMap, the same as for my Hadoop deployment:

apiVersion: v1
kind: ConfigMap
metadata:
  name: hadoop
data:
  core-site.xml: |
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop-hdfs-nn.hdfs:8020/</value>
            <description>NameNode URI</description>
        </property>
    </configuration>

  hdfs-site.xml: |
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>dfs.datanode.use.datanode.hostname</name>
        <value>true</value>
      </property>

      <property>
        <name>dfs.client.use.datanode.hostname</name>
        <value>true</value>
      </property>

      <property>
        <name>dfs.replication</name>
          <value>3</value>
      </property>

      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///root/hdfs/datanode</value>
        <description>DataNode directory</description>
      </property>

      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///root/hdfs/namenode</value>
        <description>NameNode directory for namespace and transaction logs storage.</description>
      </property>

      <property>
        <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
        <value>false</value>
      </property>

      <!-- Bind to all interfaces -->
      <property>
        <name>dfs.namenode.rpc-bind-host</name>
        <value>0.0.0.0</value>
      </property>
      <property>
        <name>dfs.namenode.servicerpc-bind-host</name>
        <value>0.0.0.0</value>
      </property>
      <!-- /Bind to all interfaces -->

    </configuration>

So they get mounted into the _common directory via subPath:

          volumeMounts:
            - name: hadoop-config
              mountPath: /opt/druid/conf/druid/cluster/_common/core-site.xml
              subPath: core-site.xml
            - name: hadoop-config
              mountPath: /opt/druid/conf/druid/cluster/_common/hdfs-site.xml
              subPath: hdfs-site.xml
      volumes:
      - name: hadoop-config
        configMap:
          name: hadoop
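
A quick way to check whether both files actually land in the _common directory (the pod name here is just the current coordinator replica, as in the trace further down):

kubectl exec druid-coordinator-6c8b48f5cd-nngjc -- ls -l /opt/druid/conf/druid/cluster/_common/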

I tried creating the /druid root folder on HDFS just in case, though it has made no difference so far.

~     k get svc -nhdfs
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
hadoop-hdfs-dn   ClusterIP   None            <none>        9000/TCP,9864/TCP,8020/TCP   59m
hadoop-hdfs-nn   ClusterIP   None            <none>        9000/TCP,9870/TCP,8020/TCP   59m
hadoop-yarn-nm   ClusterIP   None            <none>        8088/TCP,8082/TCP,8042/TCP   59m
hadoop-yarn-rm   ClusterIP   None            <none>        8088/TCP                     59m
hadoop-yarn-ui   ClusterIP   10.43.132.233   <none>        8088/TCP                     59m

 root@hadoop-hdfs-nn-0:/#  hdfs dfs -ls /
Found 1 items
drwxrwxrwx   - root supergroup          0 2021-12-21 13:21 /druid
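
For reference, I created it from the namenode pod roughly like this (the permissions match the listing above):

root@hadoop-hdfs-nn-0:/#  hdfs dfs -mkdir /druid
root@hadoop-hdfs-nn-0:/#  hdfs dfs -chmod 777 /druid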

Here is the coordinator trace..

+ druid druid-coordinator-6c8b48f5cd-nngjc › druid
druid druid-coordinator-6c8b48f5cd-nngjc druid 2021-12-21T14:47:19+0100 startup service coordinator
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.host=10.42.23.164 in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.storage.type=hdfs in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.metadata.storage.connector.connectURI=jdbc:postgresql://acid-minimal-cluster.storage:5432/druid in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.extensions.loadList=["druid-histogram", "druid-datasketches", "druid-lookups-cached-global","postgresql-metadata-storage","druid-kafka-indexing-service","druid-kafka-extraction-namespace","druid-avro-extensions","druid-basic-security","druid-s3-extensions","druid-hdfs-storage"] in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.indexer.logs.type=file in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.indexer.logs.directory=/opt/data/indexing-logs in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.zk.service.host=druid-zookeeper-headless:2181 in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.metadata.storage.type=postgresql in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.metadata.storage.connector.user=xxxxxxxx in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.metadata.storage.connector.password=xxxxxxxxxxxxxxx in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid druid-coordinator-6c8b48f5cd-nngjc druid Setting druid.storage.storageDirectory=hdfs://hadoop-hdfs-nn.hdfs:8020/druid in /tmp/conf/druid/cluster/master/coordinator-overlord/runtime.properties
- druid druid-coordinator-6c8b48f5cd-nngjc › druid 

After a while it gets restarted.
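
In case it helps anyone trying to reproduce, the previous container's logs and the restart reason should be visible with the usual commands (pod and container names as in the trace above); in my case they show nothing beyond the startup lines:

kubectl logs --previous druid-coordinator-6c8b48f5cd-nngjc -c druid
kubectl describe pod druid-coordinator-6c8b48f5cd-nngjc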

Please let me know if there's more info I can provide. Sorry for the long issue!

fhennig commented 2 years ago

Hey, there is also local deep storage; maybe you can use that instead: https://druid.apache.org/docs/latest/dependencies/deep-storage.html#local-mount
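
Something along these lines in the common runtime properties (the directory is just an example; per the linked doc it needs to be a mount that all Druid processes can reach):

druid.storage.type=local
druid.storage.storageDirectory=/mnt/druid/deepstorage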

From what you wrote it seemed like you were unaware of that; maybe that helps you.

alborotogarcia commented 2 years ago

Hey @fhennig, thanks for the reply. Yes, I am aware of that; however, local deep storage seems to cause problems between the Historical and MiddleManager, see this issue.