timescale / timescaledb-docker-ha

Create Docker images containing TimescaleDB, Patroni to be used by developers and Kubernetes.
Apache License 2.0

Image not fully compatible with standard postgres images #272

Closed ryanobjc closed 8 months ago

ryanobjc commented 2 years ago

Hey folks, the primary documentation suggests that this image is derived from the official postgres images, but that does not appear to be true. It's not technically true (although it looks like you're attempting to replicate much of the same functionality), and it's not functionally true: the timescaledb-ha image does not work with kubegres.io.

It would be nice if the image really were fully compatible with, one, the postgres image and, two, kubegres.io specifically. It looks like the main issue is that you are changing the run user in Docker to 'postgres' rather than leaving it as 'root', which makes it difficult for secondary tooling to layer on top if it requires root permissions to do things (such as creating directories in mounted filesystems that didn't exist at image build time).

ryanobjc commented 2 years ago

Additionally, it looks like this Docker image doesn't declare a volume to mount AND puts the pg data in a completely different directory than the official postgres image, e.g.: PGDATA=/home/postgres/pgdata/data

Whereas the official postgres image puts volumes here: VOLUME /var/lib/postgresql/data

I was hoping to use this image to run timescale, but it looks like I will have to build my own image from scratch instead to fix these issues, because kubegres is expecting things in the correct directories.

hiendaovinh commented 2 years ago

Agreed. Although we can customize the PGDATA env variable, the permissions are incorrect, resulting in a failure to mount PGDATA to the host.

hiendaovinh commented 2 years ago

@ryanobjc Do you have your image public somewhere?

nagaem commented 2 years ago

I ran into this when trying to switch out the standard timescaledb image for this one in my docker-compose file. Trying to do so generates a permissions error:

chmod: changing permissions of '/var/lib/postgresql/data': Operation not permitted
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

It might just be my inexperience, but it was non-obvious to me that the way to use this image was to change my volume mount path from the standard /var/lib/postgresql/data to /home/postgres/pgdata/data and drop my PGDATA env variable declaration from the docker compose file. If making the image compatible with standard postgres images is not an option, maybe this should be documented somewhere?
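
For readers hitting the same error, the change described above might look roughly like the following docker-compose fragment. This is a minimal sketch, not an officially documented setup: the service name `db` and the host path `./pgdata` are placeholders, and the image tag is the one mentioned further down in this thread.

```yaml
services:
  db:  # hypothetical service name
    image: timescale/timescaledb-ha:pg14-latest  # tag taken from a later comment in this thread
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      # No PGDATA entry here: the image default (/home/postgres/pgdata/data) applies.
    volumes:
      # Mount at the image's default data directory
      # instead of the standard /var/lib/postgresql/data.
      - ./pgdata:/home/postgres/pgdata/data  # ./pgdata is a placeholder host path
```

The host directory still has to be writable by the user the container runs as, which is the permissions problem discussed elsewhere in this issue.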

VMM-Mtech commented 1 year ago

Before I saw this issue, I added a comment to #290. Both are related to the fact that a different uid is used, and it cannot be set through an env variable because it has been hardcoded into several commands in the Dockerfile.

danclimasevschi commented 1 year ago

> I ran into this when trying to switch out the standard timescaledb image for this one in my docker-compose file. Trying to do so generates a permissions error:
>
> chmod: changing permissions of '/var/lib/postgresql/data': Operation not permitted
> The files belonging to this database system will be owned by user "postgres".
> This user must also own the server process.
>
> It might just be my inexperience, but it was non-obvious to me that the way to use this image was to change my volume mount path from the standard /var/lib/postgresql/data to /home/postgres/pgdata/data and drop my PGDATA env variable declaration from the docker compose file. If making the image compatible with standard postgres images is not an option, maybe this should be documented somewhere?

Hi, this is quite a common use case: the community using the standard timescaledb image is quite large (compare the Docker Hub pulls for both images). They are all most likely going to migrate to timescaledb-ha at some point, so documenting the whole process would be a huge relief (and so would fixing the permissions along the way).

Thanks, Dan

graveland commented 1 year ago

Let me know if the latest image works for you now!

My testing with --user 1234:1111 -v `pwd`/db:/home/pgdata -ePGDATA=/home/pgdata <image>, stopping, and restarting with different --user values created a database, and then successfully restarted. Hopefully this will help?

The postgresql user is still always postgres, but nss-wrapper tricks postgresql into thinking ownership is correct. You should have an easier time using this image and just modifying PGDATA and --user. Specifically try to avoid using locations in the default postgres user home directory /home/postgres when you're using either a different user, or a different PGDATA value.
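
For docker-compose users, the flags from the test above could be expressed roughly as follows. This is only a sketch under assumptions: the service name `db` is a placeholder, and the image tag is assumed, since the command above only says `<image>`. Other variables such as POSTGRES_PASSWORD are left out because the quoted command does not include them.

```yaml
services:
  db:  # hypothetical service name
    image: timescale/timescaledb-ha:pg14-latest  # assumed tag; the original command only says <image>
    user: "1234:1111"        # equivalent of --user 1234:1111
    environment:
      PGDATA: /home/pgdata   # equivalent of -ePGDATA=/home/pgdata
    volumes:
      - ./db:/home/pgdata    # equivalent of -v `pwd`/db:/home/pgdata
```

Note that the mount point and PGDATA point at the same path, and that the path is outside /home/postgres, in line with the advice above.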

hiendaovinh commented 1 year ago

Does it work for pg14? @graveland I'm using timescale/timescaledb:latest-pg14 with the env below. Could you please document the process of moving from timescaledb:latest-pg14 to timescaledb-ha:pg14-latest?

    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_MULTIPLE_DATABASES: ${POSTGRES_DB}
      POSTGRES_PORT: ${POSTGRES_PORT}
      PGDATA: /var/lib/postgresql/data
    volumes:
      - ./pgdata/production/data:/var/lib/postgresql/data
      - ./config/postgres/initdb.d:/docker-entrypoint-initdb.d

hexxone commented 9 months ago

Yes, I am also trying to set up HA locally for a customer evaluation, but I am very confused about this image/repo. Some more documentation about the recommended use cases and the differences from the "default" image would be highly appreciated. For me it's also unclear why there are multiple entrypoints. Are we supposed to use the timescale one for actual Patroni usage?

Thanks in advance.

graveland commented 8 months ago

You should specifically avoid using /var/lib/postgresql/data as the data mount point. If you pick a different one and also set PGDATA to the new location, it should hopefully resolve any permissions issues. This repo doesn't refer to POSTGRES_MULTIPLE_DATABASES at all, but does have POSTGRES_DB instead.
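
Applied to the compose fragment posted earlier in this thread, that advice might translate into something like the sketch below. The service name `db` and the container path `/home/pgdata` are placeholders chosen for illustration (any location other than /var/lib/postgresql/data should fit the advice). POSTGRES_PORT and the initdb.d mount are left out because nothing in this thread confirms how this image treats them.

```yaml
services:
  db:  # hypothetical service name
    image: timescale/timescaledb-ha:pg14-latest
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_DB: ${POSTGRES_DB}   # used instead of POSTGRES_MULTIPLE_DATABASES
      PGDATA: /home/pgdata          # placeholder path; must match the mount point below
    volumes:
      # Data mount moved away from /var/lib/postgresql/data.
      - ./pgdata/production/data:/home/pgdata
```

Whether a data directory created by the standard timescaledb image can be reused as-is is not covered here, so treat this as a starting point for a fresh volume.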

This image contains all sorts of extra things that a person might want for their postgresql install: pgbackrest, patroni, postgis, etc. There are several ways to run it and the different entrypoints accommodate them. Generally just docker run should work but won't use patroni. The timescaledb_entrypoint.sh runs patroni, which then runs postgresql. The pgbackrest_entrypoint.sh is used when running pgbackrest from the same container image. If you're running this in kubernetes, one container in the pod would be running the patroni version, with another container in the pod running pgbackrest for example.
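
As a rough illustration of the Kubernetes layout described above, a pod could run the same image twice with different entrypoints. This is only a sketch: the pod name and image tag are placeholders, the script locations at the image root are an assumption (the comment above gives only the file names), and the Patroni and pgbackrest configuration that a working deployment needs is omitted entirely.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: timescaledb-example  # hypothetical name
spec:
  containers:
    - name: patroni
      image: timescale/timescaledb-ha:pg14-latest
      # Runs patroni, which then runs postgresql.
      command: ["/timescaledb_entrypoint.sh"]  # assumed path; check the image
    - name: pgbackrest
      image: timescale/timescaledb-ha:pg14-latest
      # Runs pgbackrest from the same container image.
      command: ["/pgbackrest_entrypoint.sh"]   # assumed path; check the image
```

Both containers would also need a shared data volume and their respective settings before this does anything useful.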

I'm not sure how much time anybody will have to document things further, unfortunately, but please re-open this issue or create a new one if you have more questions. We'll try to get to it quicker next time.