ccakes / nomad-pgsql-patroni

Simple container for running Postgres HA on Nomad
The Unlicense

docker-compose setup vs. nomad setup #5

Open neuroserve opened 2 years ago

neuroserve commented 2 years ago

Hello,

I'm trying to deploy to Nomad, but I get this error:

root@nom-8debe744-d8dc:~# docker logs -f 98a788b6ca61
2022-02-21 09:18:44,863 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-02-21 09:18:44,872 INFO: Deregister service postgres/pg-nom-8debe744-d8dc
2022-02-21 09:18:44,885 INFO: Lock owner: None; I am pg-nom-8debe744-d8dc
2022-02-21 09:18:44,919 INFO: Deregister service postgres/pg-nom-8debe744-d8dc
2022-02-21 09:18:44,937 INFO: trying to bootstrap a new cluster
2022-02-21 09:18:44,937 INFO: Running custom bootstrap script: /usr/local/bin/docker-initdb.sh
2022-02-21 09:18:44,947 INFO: Lock owner: None; I am pg-nom-8debe744-d8dc
2022-02-21 09:18:44,947 INFO: not healthy enough for leader race
2022-02-21 09:18:44,950 WARNING: Could not register service: unknown role type uninitialized
2022-02-21 09:18:44,988 INFO: bootstrap in progress
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/postgresql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /var/lib/postgresql/data -l logfile start

waiting for server to start....2022-02-21 09:18:47.569 UTC [40] LOG:  starting PostgreSQL 14.1 (Debian 14.1-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2022-02-21 09:18:47.616 UTC [40] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-02-21 09:18:47.974 UTC [41] LOG:  database system was shut down at 2022-02-21 09:18:46 UTC
2022-02-21 09:18:47.986 UTC [40] LOG:  database system is ready to accept connections
 done
server started

/usr/local/bin/docker-initdb.sh: running /docker-entrypoint-initdb.d/000_shared_libs.sh

/usr/local/bin/docker-initdb.sh: running /docker-entrypoint-initdb.d/001_initdb_postgis.sh
CREATE DATABASE
UPDATE 1
Loading PostGIS extensions into template_postgis
CREATE EXTENSION
CREATE EXTENSION
CREATE EXTENSION
2022-02-21 09:18:54,942 INFO: Lock owner: None; I am pg-nom-8debe744-d8dc
2022-02-21 09:18:54,942 INFO: not healthy enough for leader race
2022-02-21 09:18:54,946 INFO: bootstrap in progress
CREATE EXTENSION
Loading PostGIS extensions into 
2022-02-21 09:19:04,945 INFO: Lock owner: None; I am pg-nom-8debe744-d8dc
2022-02-21 09:19:04,945 INFO: not healthy enough for leader race
2022-02-21 09:19:04,950 INFO: bootstrap in progress
CREATE EXTENSION
CREATE EXTENSION
CREATE EXTENSION
CREATE EXTENSION

2022-02-21 09:19:11.767 UTC [40] LOG:  received fast shutdown request
waiting for server to shut down....2022-02-21 09:19:11.895 UTC [40] LOG:  aborting any active transactions
2022-02-21 09:19:11.900 UTC [40] LOG:  background worker "logical replication launcher" (PID 47) exited with exit code 1
2022-02-21 09:19:11.907 UTC [42] LOG:  shutting down
..2022-02-21 09:19:14.051 UTC [40] LOG:  database system is shut down
 done
server stopped

PostgreSQL init process complete; ready for start up.

2022-02-21 09:19:14,080 ERROR: unable to restore configuration files from backup
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/patroni/postgresql/config.py", line 383, in restore_configuration_files
    open(config_file, 'w').close()
FileNotFoundError: [Errno 2] No such file or directory: '/alloc/data/pg_ident.conf'
2022-02-21 09:19:14,085 ERROR: Exception during execution of long running task bootstrap
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/patroni/async_executor.py", line 97, in run
    wakeup = func(*args) if args else func()
  File "/usr/local/lib/python3.9/dist-packages/patroni/postgresql/bootstrap.py", line 294, in bootstrap
    return do_initialize(config.get(method)) and self._postgresql.config.append_pg_hba(pg_hba) \
  File "/usr/local/lib/python3.9/dist-packages/patroni/postgresql/config.py", line 424, in append_pg_hba
    with open(self._pg_hba_conf, 'a') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/alloc/data/pg_hba.conf'
2022-02-21 09:19:14,092 INFO: removing initialize key after failed attempt to bootstrap the cluster
2022-02-21 09:19:14,105 INFO: Deregister service postgres/pg-nom-8debe744-d8dc
2022-02-21 09:19:14,408 INFO: Deregister service postgres/pg-nom-8debe744-d8dc
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.9/dist-packages/patroni/__init__.py", line 140, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.9/dist-packages/patroni/daemon.py", line 100, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.9/dist-packages/patroni/__init__.py", line 110, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.9/dist-packages/patroni/daemon.py", line 59, in run
    self._run_cycle()
  File "/usr/local/lib/python3.9/dist-packages/patroni/__init__.py", line 113, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.9/dist-packages/patroni/ha.py", line 1502, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.9/dist-packages/patroni/ha.py", line 1376, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.9/dist-packages/patroni/ha.py", line 1268, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.9/dist-packages/patroni/ha.py", line 1261, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'

I used the same bootstrap configuration as in the docker-compose example. With docker-compose the deployment works (although I don't yet know how to build an HA cluster with it, whether with docker-compose or with Nomad).

Is restoring the configuration files from backup something that is controlled by the bootstrap config? Are there even configuration files to restore when setting up a new cluster?

Thanks for any hints. Töns

Izerrion commented 1 year ago

Don't forget to set PGDATA in the task's env block:

  env {
    PGDATA="/alloc/data"
  }

It is missing from the Nomad job example.
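
The log in the original report seems consistent with this: initdb says it initialized /var/lib/postgresql/data, while Patroni then tries to open pg_ident.conf and pg_hba.conf under /alloc/data. Below is a minimal Python sketch of that failure mode, not Patroni's actual code. The helper name `append_line` and the temporary directories standing in for the two paths are assumptions made for illustration. It only shows that `open(path, "a")` raises FileNotFoundError when the parent directory does not exist, which is the same exception as in the traceback.

```python
import os
import tempfile


def append_line(path: str, line: str) -> str:
    """Append a line with open(path, 'a') and report what happened."""
    try:
        with open(path, "a") as f:
            f.write(line + "\n")
        return "ok"
    except FileNotFoundError:
        # Raised when a directory component of `path` does not exist.
        return "FileNotFoundError"


with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-in for the directory initdb actually populated.
    initialized_dir = os.path.join(root, "var_lib_postgresql_data")
    # Hypothetical stand-in for the directory the config points at; never created.
    expected_dir = os.path.join(root, "alloc_data")
    os.makedirs(initialized_dir)

    hba_line = "local all all trust"
    result_initialized = append_line(os.path.join(initialized_dir, "pg_hba.conf"), hba_line)
    result_expected = append_line(os.path.join(expected_dir, "pg_hba.conf"), hba_line)

    print(result_initialized, result_expected)  # prints: ok FileNotFoundError
```

If that reading is right, pointing PGDATA at the same directory Patroni uses should make the init script and Patroni agree on where the cluster lives.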

Izerrion commented 1 year ago

@neuroserve

There are still a lot of issues with the original job; I had to hardcode many other values to make it work. Have you managed to bootstrap a working Patroni cluster inside Nomad?

neuroserve commented 1 year ago

Haven't checked it yet. Am setting up a new environment right now...