dhiaayachi / temporal

Temporal service
https://docs.temporal.io
MIT License

helmchart create-database error "unable to connect to DB, tried default DB names: postgres,defaultdb" #27

Open dhiaayachi opened 4 weeks ago

dhiaayachi commented 4 weeks ago

Expected Behavior

I expected it to use the database name that's configured in the helm chart.

Here is the init container that's created from the helm chart

- command:
    - temporal-sql-tool
    - create-database
  env:
    - name: SQL_PLUGIN
      value: postgres12
    - name: SQL_HOST
      value: 10.63.7.94
    - name: SQL_PORT
      value: "5432"
    - name: SQL_DATABASE
      value: citus
    - name: SQL_USER
      value: citus
    - name: SQL_PASSWORD
      valueFrom:
        secretKeyRef:
          key: password
          name: temporal-default-store
  image: temporalio/admin-tools:1.24.2-tctl-1.18.1-cli-0.13.0
  imagePullPolicy: IfNotPresent
  name: create-default-store
  resources: {}
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: kube-api-access-lxgvw
    readOnly: true

Here is the yaml used for the helm chart:

values:
    cassandra:
      enabled: false
    prometheus:
      enabled: false
    elasticsearch:
      enabled: false
    grafana:
      enabled: false
    server:
      config:
        persistence:
          default:
            driver: "sql"
            sql:
              driver: "postgres12"
              host: 0.0.0.0 # omitted
              port: 5432
              database: citus
              user: citus
              password: blah # omitted
              maxConns: 20
              maxConnLifetime: "1h"

          visibility:
            driver: "sql"

            sql:
              driver: "postgres12"
              host: 0.0.0.0 # omitted
              port: 5432
              database: citus
              user: citus
              password: blah # omitted
              maxConns: 20
              maxConnLifetime: "1h"

Actual Behavior

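The create-database init container did not connect to the referenced citus database. It tried the default database names postgres and defaultdb instead, and both attempts were rejected: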

> kubectl logs temporal-schema-ttjv2 -c create-default-store
2024-08-23T10:04:50.384Z    ERROR   Unable to create SQL database.  {"error": "unable to connect to DB, tried default DB names: postgres,defaultdb, errors: [pq: no pg_hba.conf entry for host \"fd40:6eea:20:81c1:8220:100:a45:1008\", user \"citus\", database \"postgres\", no encryption pq: no pg_hba.conf entry for host \"fd40:6eea:20:81c1:8220:100:a45:1008\", user \"citus\", database \"defaultdb\", no encryption]", "logging-call-at": "handler.go:94"}

dhiaayachi commented 2 weeks ago

Thank you for reporting this issue.

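The error message indicates that temporal-sql-tool reached the PostgreSQL server, but the server rejected the connection: pg_hba.conf has no entry matching the client host (fd40:6eea:20:81c1:8220:100:a45:1008, an IPv6 address), the user citus, and the databases postgres and defaultdb for an unencrypted connection. The pg_hba.conf file controls which hosts, users, and databases the PostgreSQL server accepts connections from, and whether encryption is required.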

Could you please confirm the following:

  1. pg_hba.conf contents: Can you share the contents of your pg_hba.conf file? This will help us understand what connection rules are in place.
  2. Firewall rules: Are there any firewall rules in place that might be blocking the connection from the init container to the PostgreSQL server?

Once we have this information, we can try to identify the specific issue and provide a solution.
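While gathering that information, it may help to see exactly which combinations the server refused. The following is a minimal sketch, not part of temporal-sql-tool: pg_hba_rejections is a hypothetical helper name, and it assumes the log format shown in the report above, where quotes inside the JSON error field are backslash-escaped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the distinct (host, user, database) triples that
# PostgreSQL rejected, taken from "no pg_hba.conf entry" errors on stdin.
pg_hba_rejections() {
  # Remove the backslashes that the JSON log field puts before quotes,
  # extract every rejection message, and print its three quoted fields.
  tr -d '\\' \
    | grep -oE 'no pg_hba\.conf entry for host "[^"]+", user "[^"]+", database "[^"]+"' \
    | sed -E 's/.*host "([^"]+)", user "([^"]+)", database "([^"]+)"/\1 \2 \3/' \
    | sort -u
}
```

It could be fed from the same command used in the report, for example kubectl logs temporal-schema-ttjv2 -c create-default-store | pg_hba_rejections. Each output line names a host, user, and database that pg_hba.conf would need to allow.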

dhiaayachi commented 2 weeks ago

Thanks for reporting this issue. The temporal-sql-tool command reads the target database name from the SQL_DATABASE environment variable, and the init container does set it to citus. However, the log shows the tool connecting to the postgres and defaultdb databases. This appears to be how create-database works: the target database may not exist yet, so the tool first connects to a default database in order to create the target. SQL_DATABASE therefore names the database to create, not the database used for that initial connection.

To rule out a configuration problem, confirm that the rendered init container sets the SQL_DATABASE environment variable, as in the spec below:

- command:
    - temporal-sql-tool
    - create-database
  env:
    - name: SQL_PLUGIN
      value: postgres12
    - name: SQL_HOST
      value: 10.63.7.94
    - name: SQL_PORT
      value: "5432"
    - name: SQL_DATABASE
      value: citus
    - name: SQL_USER
      value: citus
    - name: SQL_PASSWORD
      valueFrom:
        secretKeyRef:
          key: password
          name: temporal-default-store
  image: temporalio/admin-tools:1.24.2-tctl-1.18.1-cli-0.13.0
  imagePullPolicy: IfNotPresent
  name: create-default-store
  resources: {}
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: kube-api-access-lxgvw
    readOnly: true

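With SQL_DATABASE set, temporal-sql-tool targets the citus database as expected. Note that the user citus still needs pg_hba.conf access to one of the default databases, because the log shows the tool connecting to postgres and defaultdb.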

For more information on how to use temporal-sql-tool, please refer to the documentation: https://docs.temporal.io/references/commands#temporal-sql-tool
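Since the error log shows the tool attempting postgres and defaultdb, one way to narrow this down is to try the same connections by hand from a pod in the cluster. This is only a sketch under stated assumptions: conninfo and probe_databases are hypothetical helper names, psql is assumed to be installed where this runs, the password is assumed to be supplied through the standard PGPASSWORD variable, and the SQL_* variables are assumed to hold the same values as in the init container.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a libpq connection string for one database name
# from the SQL_* variables used by the init container.
conninfo() {
  printf 'host=%s port=%s user=%s dbname=%s' \
    "${SQL_HOST}" "${SQL_PORT:-5432}" "${SQL_USER}" "$1"
}

# Hypothetical helper: try the two default databases named in the error
# message plus the configured target, and report which connections succeed.
probe_databases() {
  local db
  for db in postgres defaultdb "${SQL_DATABASE}"; do
    # --no-password makes psql fail instead of prompting when PGPASSWORD is unset.
    if psql --no-password "$(conninfo "$db")" -c 'SELECT 1' >/dev/null 2>&1; then
      echo "ok     $db"
    else
      echo "FAILED $db"
    fi
  done
}
```

If probe_databases reports that citus succeeds while postgres and defaultdb fail, that would be consistent with a pg_hba.conf that only admits the citus user to the citus database.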

dhiaayachi commented 2 weeks ago

Thanks for reporting this issue!

The logs show that the Temporal SQL tool is unable to connect to the database and is trying to use the default names, postgres and defaultdb.

The init container shown in the report does receive SQL_DATABASE with the value citus, so the database name is being passed. The tool nevertheless connects through a default database first: it tries postgres, then defaultdb, and the server rejects both attempts because pg_hba.conf has no matching entry.

As a first check, confirm that your Helm chart provides the SQL_DATABASE value to the init container as follows:

- command:
    - temporal-sql-tool
    - create-database
  env:
    - name: SQL_PLUGIN
      value: postgres12
    - name: SQL_HOST
      value: 10.63.7.94
    - name: SQL_PORT
      value: "5432"
    - name: SQL_DATABASE
      value: citus  # This is the value you need to provide.
    - name: SQL_USER
      value: citus
    - name: SQL_PASSWORD
      valueFrom:
        secretKeyRef:
          key: password
          name: temporal-default-store
  image: temporalio/admin-tools:1.24.2-tctl-1.18.1-cli-0.13.0
  imagePullPolicy: IfNotPresent
  name: create-default-store
  resources: {}
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: kube-api-access-lxgvw
    readOnly: true

With SQL_DATABASE provided, the init container targets the citus database as expected.

dhiaayachi commented 2 weeks ago

Thanks for reporting the issue. The SQL host in the posted helm values is shown as 0.0.0.0, which the comments mark as omitted, while the init container uses 10.63.7.94. If 0.0.0.0 is the actual value in your values file rather than a redaction, replace it with the correct IP address of your Postgres host (which is 10.63.7.94).

Here's how you can update the helm values file:

  1. Open the helm values file, that is, the file passed to helm install or helm upgrade with -f, in an editor.
  2. Change the host parameter in the persistence and visibility configuration to 10.63.7.94:
values:
    cassandra:
      enabled: false
    prometheus:
      enabled: false
    elasticsearch:
      enabled: false
    grafana:
      enabled: false
    server:
      config:
        persistence:
          default:
            driver: "sql"
            sql:
              driver: "postgres12"
              host: 10.63.7.94 # replaced 0.0.0.0 with the correct IP address
              port: 5432
              database: citus
              user: citus
              password: blah # omitted
              maxConns: 20
              maxConnLifetime: "1h"

          visibility:
            driver: "sql"

            sql:
              driver: "postgres12"
              host: 10.63.7.94 # replaced 0.0.0.0 with the correct IP address
              port: 5432
              database: citus
              user: citus
              password: blah # omitted
              maxConns: 20
              maxConnLifetime: "1h"
  3. Save the changes to the file.

After these changes are applied, your Temporal server should be able to connect to the Postgres database.
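To check whether a values file still contains the 0.0.0.0 placeholder before applying it, a quick scan along these lines may help. This is a sketch only: find_placeholder_hosts is a hypothetical helper name, and it assumes each host setting sits on its own line in the form host: 0.0.0.0, as in the values shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the line numbers of host settings in a values
# file that are still set to the 0.0.0.0 placeholder.
# Exits non-zero (grep's behavior) when no placeholder is found.
find_placeholder_hosts() {
  grep -nE '^[[:space:]]*host:[[:space:]]*0\.0\.0\.0([[:space:]]|$)' "$1"
}
```

For example, find_placeholder_hosts values.yaml would list any remaining placeholder lines for the default and visibility stores; values.yaml is an assumed file name.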