bitnami / charts


[bitnami][clickhouse] User's custom init/start scripts are not correctly initialized when ClickHouse starts #29545

Open Trungtin1011 opened 2 hours ago

Trungtin1011 commented 2 hours ago

Name and Version

bitnami/clickhouse 24.8.4-debian-12-r0

What architecture are you using?

amd64

What steps will reproduce the bug?

I am trying to create a ClickHouse cluster with 1 shard and 3 replicas. My deployment requires a bootstrap script to run on the first start (to create a database in advance).

This is the content of my 00_default_overrides.xml file

<clickhouse>
  <!-- Macros -->
  <macros>
    <shard from_env="CLICKHOUSE_SHARD_ID"></shard>
    <replica from_env="CLICKHOUSE_REPLICA_ID"></replica>
    <layer>chi</layer>
  </macros>
  <!-- Log Level -->
  <logger>
    <level>information</level>
  </logger>
  <!-- Cluster configuration - Any update of the shards and replicas requires helm upgrade -->
  <remote_servers>
    <default>
      <shard>
          <replica>
              <host>chi-shard0-0.chi-headless.chi.svc.cluster.local</host>
              <port>9000</port>
              <user from_env="CLICKHOUSE_ADMIN_USER"></user>
              <password from_env="CLICKHOUSE_ADMIN_PASSWORD"></password>
          </replica>
          <replica>
              <host>chi-shard0-1.chi-headless.chi.svc.cluster.local</host>
              <port>9000</port>
              <user from_env="CLICKHOUSE_ADMIN_USER"></user>
              <password from_env="CLICKHOUSE_ADMIN_PASSWORD"></password>
          </replica>
          <replica>
              <host>chi-shard0-2.chi-headless.chi.svc.cluster.local</host>
              <port>9000</port>
              <user from_env="CLICKHOUSE_ADMIN_USER"></user>
              <password from_env="CLICKHOUSE_ADMIN_PASSWORD"></password>
          </replica>
      </shard>
    </default>
  </remote_servers>
  <!-- keeper configuration -->
  <keeper_server>

    <tcp_port>2181</tcp_port>
    <server_id from_env="KEEPER_SERVER_ID"></server_id>
    <log_storage_path>/bitnami/clickhouse/keeper/coordination/log</log_storage_path>
    <snapshot_storage_path>/bitnami/clickhouse/keeper/coordination/snapshots</snapshot_storage_path>

    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>

    <raft_configuration>
    <server>
      <id>0</id>
      <hostname from_env="KEEPER_NODE_0"></hostname>
      <port>9444</port>
    </server>
    <server>
      <id>1</id>
      <hostname from_env="KEEPER_NODE_1"></hostname>
      <port>9444</port>
    </server>
    <server>
      <id>2</id>
      <hostname from_env="KEEPER_NODE_2"></hostname>
      <port>9444</port>
    </server>
    </raft_configuration>
  </keeper_server>
  <!-- Zookeeper configuration -->
  <zookeeper>
    <node>
      <host from_env="KEEPER_NODE_0"></host>
      <port>2181</port>
    </node>
    <node>
      <host from_env="KEEPER_NODE_1"></host>
      <port>2181</port>
    </node>
    <node>
      <host from_env="KEEPER_NODE_2"></host>
      <port>2181</port>
    </node>
  </zookeeper>
</clickhouse>

I've created an init script to create a new database and mounted it to /docker-entrypoint-initdb.d

#!/bin/bash
set -e

clickhouse client --password "$CLICKHOUSE_ADMIN_PASSWORD" -n <<-EOSQL
  CREATE DATABASE IF NOT EXISTS test ON CLUSTER default;
EOSQL

When I apply the config and start the ClickHouse server with the /script/setup.sh command, it throws errors and the container crashes after several restarts.

Are you using any custom parameters or values?

Here is my values.yaml config:

fullnameOverride: "chi"
shards: 1
replicaCount: 3
auth:
  username: default
  password: ......
logLevel: information
persistence:
  size: 40Gi
initdbScriptsSecret: extra-init-scripts
extraEnvVars:
  - name: DISABLE_WELCOME_MESSAGE
    value: "true"
keeper:
  enabled: true
containerSecurityContext:
  enabled: true
  allowPrivilegeEscalation: true
  privileged: true
  runAsGroup: 0
  readOnlyRootFilesystem: false

pdb:
  create: false
zookeeper:
  enabled: false
networkPolicy:
  enabled: false

What is the expected behavior?

  1. ClickHouse is running
  2. The logs are OK
  3. A new database named test is created

What do you see instead?

  1. The ClickHouse container crashes.
  2. The logs contain error messages:
    clickhouse 04:18:43.18 INFO  ==> ** Starting ClickHouse setup **
    clickhouse 04:18:43.21 INFO  ==> Copying mounted configuration from /bitnami/clickhouse/etc
    cp: -r not specified; omitting directory '/bitnami/clickhouse/etc/conf.d/default/..data'
    clickhouse 04:18:43.22 INFO  ==> Starting ClickHouse in background
    clickhouse 04:19:43.26 INFO  ==> ClickHouse started successfully
    clickhouse 04:19:43.26 INFO  ==> Loading user's custom files from /docker-entrypoint-initdb.d
    clickhouse 04:19:43.27 WARN  ==> Sourcing /docker-entrypoint-initdb.d/platform_init_script.sh as it is not executable by the current user, any error may cause initialization to fail
    Received exception from server (version 24.8.4):
    Code: 999. DB::Exception: Received from localhost:9000. Coordination::Exception. Coordination::Exception: All connection tries failed while connecting to ZooKeeper. nodes: 10.10.1.130:2181, 10.10.2.45:2181, 10.10.2.133:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.1.130:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.2.45:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.2.133:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.1.130:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.2.45:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.2.133:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.1.130:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.2.45:2181
    Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 24.8.4.13 (official build)), 10.10.2.133:2181
    . (KEEPER_EXCEPTION)
    (query: CREATE DATABASE IF NOT EXISTS test ON CLUSTER default;)
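
The "Connection refused" lines above can be checked independently of ClickHouse by probing the Keeper port from inside the pod. Below is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available in the container; the helper names port_open and check_keeper_nodes are hypothetical and not part of the Bitnami image.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if a TCP connection to host:port can be
# opened within 2 seconds, using bash's built-in /dev/tcp redirection.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report the state of every endpoint passed as a host:port argument.
# Returns non-zero if at least one endpoint could not be reached.
check_keeper_nodes() {
  local endpoint failed=0
  for endpoint in "$@"; do
    if port_open "${endpoint%:*}" "${endpoint##*:}"; then
      echo "${endpoint} reachable"
    else
      echo "${endpoint} refused or timed out"
      failed=1
    fi
  done
  return "$failed"
}
```

For this deployment it could be called as check_keeper_nodes "$KEEPER_NODE_0:2181" "$KEEPER_NODE_1:2181" "$KEEPER_NODE_2:2181", which uses the same environment variables and port as the zookeeper section of the configuration.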

Additional information

The container's logs have 2 different error messages:

  1. cp: -r not specified; omitting directory '/bitnami/clickhouse/etc/conf.d/default/..data' <- This is because the bitnami/clickhouse image does not pass the -r option when executing the cp command.
  2. Code: 999. DB::Exception: Received from localhost:9000 <- This is because the bitnami/clickhouse image only listens on localhost while it runs in the background to execute custom scripts, so connections to port 2181 on the other replicas are refused (as the log shows) and the ON CLUSTER query fails.
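
The first message can be reproduced outside the container. The sketch below uses a throwaway directory layout (every path and function name in it is hypothetical, not taken from the image's scripts) to show that cp without -r copies the regular files but skips subdirectories, while cp -r copies both.

```shell
#!/bin/bash
# Copy without -r: regular files are copied, directories are skipped
# with a warning and cp exits with a non-zero status.
copy_flat() {
  cp "$1"/* "$2"/ 2>&1
}

# Copy with -r: directories are copied along with the files.
copy_recursive() {
  cp -r "$1"/* "$2"/ 2>&1
}

# Hypothetical layout: one config file plus one subdirectory.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/src/conf.d" "$demo_root/flat" "$demo_root/recursive"
echo "<clickhouse/>" > "$demo_root/src/config.xml"
echo "<clickhouse/>" > "$demo_root/src/conf.d/override.xml"

# "|| true" keeps the demo running despite cp's non-zero exit status.
flat_output="$(copy_flat "$demo_root/src" "$demo_root/flat")" || true
recursive_output="$(copy_recursive "$demo_root/src" "$demo_root/recursive")"

echo "without -r: ${flat_output}"
ls -R "$demo_root/recursive"
```

The exact warning text depends on the cp implementation, so the sketch prints it rather than matching on it.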

I'm going to raise a PR for fixing these errors. Please review.

I've tested the code changes, and they work as expected:

clickhouse 02:17:37.92 INFO  ==> ** Starting ClickHouse setup **
clickhouse 02:17:37.95 INFO  ==> Copying mounted configuration from /bitnami/clickhouse/etc
clickhouse 02:17:37.97 INFO  ==> Starting ClickHouse in background
clickhouse 02:17:42.98 INFO  ==> ClickHouse started successfully
clickhouse 02:17:42.98 INFO  ==> Loading user's custom files from /docker-entrypoint-initdb.d
clickhouse 02:17:42.98 WARN  ==> Sourcing /docker-entrypoint-initdb.d/platform_init_script.sh as it is not executable by the current user, any error may cause initialization to fail
chi-shard0-0.chi-headless.chi.svc.cluster.local 9000    0               2       0
chi-shard0-2.chi-headless.chi.svc.cluster.local 9000    0               1       0
chi-shard0-1.chi-headless.chi.svc.cluster.local 9000    0               0       0

clickhouse 02:17:59.74 INFO  ==> ** ClickHouse setup finished! **
clickhouse 02:17:59.75 INFO  ==> ** Starting ClickHouse **
Processing configuration file '/opt/bitnami/clickhouse/etc/config.xml'.
Merging configuration file '/opt/bitnami/clickhouse/etc/conf.d/00_default_overrides.xml'.
Merging configuration file '/opt/bitnami/clickhouse/etc/conf.d/platform_extra_overrides.xml'.

Trungtin1011 commented 2 hours ago

Related issue since Jun 19, 2023: Issue #38150

Trungtin1011 commented 2 hours ago

My open PR: https://github.com/bitnami/containers/pull/72646