zalando / spilo

Highly available elephant herd: HA PostgreSQL cluster using Docker
Apache License 2.0

feat(wale-clone): Added data dir permission change (0700) during cloning #920

Open thedatabaseme opened 10 months ago

thedatabaseme commented 10 months ago

The launch.sh script already sets the permissions of the PGROOT and PGDATA directories. However, when you clone an instance from a source whose PGDATA directory has wrong permissions (e.g. 0775), the clone fails during recovery, because the restored PGDATA directory ends up with those wrong permissions.

The error will be something like this:

data directory "/home/postgres/pgdata/pgroot/data" has invalid permissions
DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
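
To check whether a directory would trigger this error, the permission bits can be compared against the two modes named in the message. The following is only a simplified sketch, not PostgreSQL's actual check; the function name `data_dir_mode_ok` is hypothetical.

```python
import os

# The two modes named in the PostgreSQL error message above.
ALLOWED_MODES = (0o700, 0o750)


def data_dir_mode_ok(path):
    """Return True if the rwx permission bits of `path` are 0700 or 0750.

    Only the lower nine permission bits are compared; setuid, setgid and
    sticky bits are ignored in this simplified check.
    """
    mode = os.stat(path).st_mode & 0o777
    return mode in ALLOWED_MODES
```

For a directory with 0775 or 0770 permissions this returns False, which corresponds to the situation described here.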

One reason the permissions can change is Kubernetes itself: after a kubelet restart on a worker node, the permissions are changed to allow full access to whatever fsGroup is specified. This is done recursively from the root folder, so the result is 0770 permissions. See here for more details.

This PR sets the permissions of the PGDATA directory to 0700 after the backup-fetch has finished, so the actual instance recovery no longer fails because of wrong permissions.
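
The idea can be sketched roughly as follows. This is an illustration under assumptions, not the actual diff of this PR: the function name `restrict_data_dir` is hypothetical, and the call is assumed to run once the backup-fetch has completed.

```python
import os


def restrict_data_dir(path):
    """Set `path` to 0700 and return the resulting rwx permission bits.

    Only the directory itself is changed (no recursion), matching the
    description of changing the PGDATA directory.
    """
    os.chmod(path, 0o700)
    return os.stat(path).st_mode & 0o777
```

Calling `restrict_data_dir(data_dir)` after the fetch, where `data_dir` stands for the PGDATA path, leaves the directory with owner-only access before recovery starts.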

I hope you find this helpful.

Kind regards, Philip