CrunchyData / crunchy-containers

Containers for Managing PostgreSQL on Kubernetes by Crunchy Data
https://www.crunchydata.com/
Apache License 2.0

How to run a postgres container using the raw pgdata? #1400

Open clayrisser opened 2 years ago

clayrisser commented 2 years ago

I was running crunchydb postgres on Kubernetes, and I was backing up the underlying pgdata volume using a tool called velero.

I am trying to use the raw volume data to spin up a postgres locally on my machine so I can run pg_dump; however, I am finding it very difficult to start postgres using the underlying pgdata.

I am trying to use the docker image documented at the following link, because the HA container registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.4-4.7.3 seems overly complex to run as a single docker container and does not seem able to run outside of a kubernetes environment: https://access.crunchydata.com/documentation/crunchy-postgres-containers/4.3.1/container-specifications/crunchy-postgres

I am very confused because the crunchy-postgres image creates a folder under /pgdata named after the id of the running container, which is impossible to know before starting the container. I want to be able to simply mount the pgdata and have the database running in a single image locally so I can run pg_dump.

I tried using the official postgres image, but that fails miserably because the crunchydb postgres database uses many postgres modules that the official image does not support.

I was thinking maybe the /recover path is where I am supposed to mount the pgdata, because the documentation describes it as the "Volume used for Point In Time Recovery (PITR) during startup of the PostgreSQL database." However, there is no documentation on how to use it at all.

In summary, how can I run a local single container postgres on my machine using the raw pgdata that I backed up in my kubernetes environment?

Your help is much appreciated.

clayrisser commented 2 years ago

@jkatz do you have any suggestions?

jkatz commented 2 years ago

I've been using kind lately to run PGO locally along with my databases. It's been pretty easy and efficient.

If you have the contents of the pgdata directory, you could probably get away with following the cluster-creation steps from this talk, but replacing the docker volume with the path to where your data directory resides.

clayrisser commented 2 years ago

What about the container id that is in the pgdata folder? /pgdata/<CONTAINER_ID>? Is there an environment variable that lets me specify the exact location of the pgdata?

clayrisser commented 2 years ago

@jkatz it doesn’t work because it’s expecting the pgdata to be in /pgdata/<CONTAINER_ID>

clayrisser commented 2 years ago

Ok I found the answer to this HERE. Hopefully it works.

https://github.com/CrunchyData/crunchy-containers/blob/master/bin/postgres_common/postgres/setenv.sh#L28
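
For anyone else landing here, this is roughly how I understand the logic at that link. It is a paraphrased sketch, not a copy of setenv.sh: the function name `resolve_pgdata` is made up for illustration, and I am assuming the fallback is the container hostname (which docker sets to the container id by default, matching the /pgdata/<CONTAINER_ID> behaviour described above).

```shell
#!/bin/sh
# Paraphrased sketch of how the image appears to choose its data directory.
# resolve_pgdata is a hypothetical helper name, not part of the image.
resolve_pgdata() {
    if [ -n "${PGDATA_PATH_OVERRIDE:-}" ]; then
        # An explicit override names the subdirectory under /pgdata.
        printf '/pgdata/%s\n' "$PGDATA_PATH_OVERRIDE"
    else
        # Assumed fallback: the container hostname (the container id by default).
        printf '/pgdata/%s\n' "${HOSTNAME:-}"
    fi
}

# Example: with the override set to "backup", the data directory
# resolves to /pgdata/backup, so that is where the volume should be mounted.
PGDATA_PATH_OVERRIDE=backup
resolve_pgdata
```

So if this reading is right, setting PGDATA_PATH_OVERRIDE to a fixed name and mounting the raw pgdata at /pgdata/<that name> should avoid having to know the container id in advance.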

clayrisser commented 2 years ago

I’m unable to get this to work because Postgres shuts down with no errors. Any suggestions?

sumanth2893 commented 2 years ago

I am trying to use the docker image https://hub.docker.com/r/crunchydata/crunchy-postgres with the docker-compose file below, mounting the backup data into the container.

version: "3.9"
services:
  postgres:
    image: crunchydata/crunchy-postgres:centos8-13.4-4.7.2
    environment: 
      PG_DATABASE: postgres
      PG_PRIMARY_PORT: 5432
      PG_MODE: primary
      MODE: postgres
      PG_USER: postgres
      PG_PASSWORD: postgres
      PG_PRIMARY_USER: postgres
      PG_PRIMARY_PASSWORD: postgres
      PG_ROOT_PASSWORD: postgres
      PGDATA_PATH_OVERRIDE: backup
      CRUNCHY_DEBUG: "TRUE"
    ports:
     - "5432:5432"
    volumes:
     - "./postgres:/pgdata/backup"

With this docker-compose file, the postgres container does not stay running. Here is the log:

Thu Nov 11 15:48:53 UTC 2021 INFO: Image mode found: postgres
Thu Nov 11 15:48:53 UTC 2021 INFO: Starting in 'postgres' mode
Thu Nov 11 15:48:53 UTC 2021 INFO: Setting PGROOT to /usr/pgsql-13.
Thu Nov 11 15:48:53 UTC 2021 INFO: PG_CTL_START_TIMEOUT set at: 60
Thu Nov 11 15:48:53 UTC 2021 INFO: PG_CTL_STOP_TIMEOUT set at: 60
Thu Nov 11 15:48:53 UTC 2021 INFO: PG_CTL_PROMOTE_TIMEOUT set at: 60
Thu Nov 11 15:48:53 UTC 2021 INFO: Cleaning up the old postmaster.pid file..
Thu Nov 11 15:48:53 UTC 2021 INFO: User ID is set to uid=26(postgres) gid=26(postgres) groups=26(postgres).
Thu Nov 11 15:48:53 UTC 2021 INFO: Working on primary..
Thu Nov 11 15:48:53 UTC 2021 INFO: Initializing the primary database..
Thu Nov 11 15:48:53 UTC 2021 INFO: PGDATA already contains a database.
Thu Nov 11 15:48:53 UTC 2021 INFO: Setting ARCHIVE_MODE to off.
Thu Nov 11 15:48:53 UTC 2021 INFO: Setting ARCHIVE_TIMEOUT to 0.
Thu Nov 11 15:48:53 UTC 2021 INFO: Starting PostgreSQL..
2021-11-11 15:48:53.985 UTC [69] LOG:  pgaudit extension initialized
2021-11-11 15:48:53.986 UTC [69] WARNING:  pgnodemx: Kubernetes Downward API path /etc/podinfo does not exist: No such file or directory
2021-11-11 15:48:53.986 UTC [69] DETAIL:  disabling Kubernetes Downward API file system access
2021-11-11 15:48:53.996 UTC [69] LOG:  redirecting log output to logging collector process
2021-11-11 15:48:53.996 UTC [69] HINT:  Future log output will appear in directory "pg_log".
Thu Nov 11 15:48:54 UTC 2021 INFO: PostgreSQL is shutting down. Exiting..

What am I missing here?
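
One thing the HINT line in the log points to: once the logging collector starts, further output goes to the "pg_log" directory, so the actual reason for the shutdown is probably recorded there and not on the container's stdout. A relative log directory is resolved against the data directory, so with the mount above it should appear on the host under ./postgres/pg_log. A minimal sketch for reading the newest file there, assuming that host path; `latest_pg_log` and `DATA_DIR` are names made up for this example:

```shell
#!/bin/sh
# Print the name of the most recently modified file in a log directory.
# latest_pg_log is a hypothetical helper, not something the image provides.
latest_pg_log() {
    log_dir="$1"
    # ls -t sorts by modification time, newest first.
    ls -t "$log_dir" 2>/dev/null | head -n 1
}

# Assumption: the data directory is the host folder mounted at /pgdata/backup.
DATA_DIR="${DATA_DIR:-./postgres}"
newest="$(latest_pg_log "$DATA_DIR/pg_log")"

if [ -n "$newest" ]; then
    # Show the end of the newest log, where the shutdown reason should be.
    tail -n 50 "$DATA_DIR/pg_log/$newest"
else
    echo "no log files found in $DATA_DIR/pg_log"
fi
```

If the directory is empty or missing, it may also be worth checking that the mounted files are readable by the uid shown in the log (uid=26).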