ccakes / nomad-pgsql-patroni

Simple container for running Postgres HA on Nomad
The Unlicense

Cannot connect to Consul #7

Closed (const-tmp closed this issue 1 year ago)

const-tmp commented 1 year ago

Patroni job spec

job "patroni" {
  type        = "service"
  datacenters = ["dc1"]

  group "group" {
    count = 3

    spread {
      attribute = "${node.unique.id}"
    }

    network {
      port api { to = 8080 }
      port pg { to = 5432 }
    }

    task "db" {
      driver = "docker"

      template {
        data        = <<EOL
scope: postgres
name: pg-{{env "node.unique.name"}}
namespace: /nomad

restapi:
  listen: 0.0.0.0:{{env "NOMAD_PORT_api"}}
  connect_address: {{env "NOMAD_ADDR_api"}}

consul:
  host: localhost
#  register_service: true

# bootstrap config
EOL
        destination = "/secrets/patroni.yml"
      }

      config {
        image = "ghcr.io/ccakes/nomad-pgsql-patroni:15.1-2.tsdb_gis"
        ports = ["api", "pg"]
      }
    }
  }
}

Error:

2023-04-20 12:09:46,763 INFO: waiting on consul
2023-04-20 12:10:00,812 WARNING: Retry got exception: HTTPConnectionPool(host='127.0.0.1', port=8500): Max retries exceeded with url: /v1/session/create (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6b23b10040>: Failed to establish a new connection: [Errno 111] Connection refused'))
2023-04-20 12:10:00,812 ERROR: refresh_session
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.9/dist-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/usr/local/lib/python3.9/dist-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

Consul is working; we can prove it with a test job:

job "test" {
  type        = "service"

  group "test" {
    task "test" {
      driver = "exec"

      template {
        data        = <<EOL
while true
do
  curl localhost:8500/v1/catalog/nodes
  echo
  sleep 5
done
EOL
        destination = "local/test.sh"
      }

      config {
        command = "bash"
        args    = ["local/test.sh"]
      }
    }
  }
}

Test job logs:

[{"ID":"a6f3d88a-1881-cd3f-facf-d1a62be96b94","Node":...
...

What am I doing wrong?

ccakes commented 1 year ago

The difference between your test job and the patroni one is the task driver (exec vs. docker).

I guess you have Consul running on the host. For the exec job, localhost points to the host's loopback interface, but for the docker job, localhost points to the container's own loopback, where nothing is listening on port 8500. That is why the patroni task gets "Connection refused". If you're using Consul for DNS and Docker is configured to use it, change the consul section of your patroni template to look like this:

consul:
  host: consul.service.consul
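
If Consul DNS is not available inside the container, one possible alternative is to point Patroni at the Nomad client's own IP address instead. This is only a sketch under assumptions that are not confirmed in this thread: that the Consul agent on each host accepts HTTP connections on that address (for example, its client_addr is not limited to 127.0.0.1) and that it uses the default HTTP port 8500.

```yaml
consul:
  # Assumption: the host's Consul agent listens on the node IP, not only on 127.0.0.1.
  # attr.unique.network.ip-address is the Nomad node attribute holding the client's IP;
  # 8500 is Consul's default HTTP port and may differ in your setup.
  host: {{ env "attr.unique.network.ip-address" }}:8500
```

The rest of the template stays unchanged; only the consul host differs from the DNS-based variant above.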