Open FawenYo opened 1 year ago
Hi @FawenYo,
Did you try to resolve this warning?
WARNING: password file "/opt/bitnami/repmgr/secrets/.pgpass" has group or world access; permissions should be u=rw (0600) or less
If you have resolved the warning, is the error still the same?
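The warning quoted above is about permission bits: the password file is rejected when anything beyond the owner's read/write bits is set. As a rough illustration only, the sketch below checks a file for group or world access. The helper name `has_group_or_world_access` is hypothetical, and the check runs against a throwaway temporary file, not the real passfile; it assumes a POSIX file system.

```python
import os
import stat
import tempfile


def has_group_or_world_access(path: str) -> bool:
    """Return True if any group or other permission bit is set on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group (0o070) and other (0o007) permission bits.
    return bool(mode & 0o077)


# Demonstrate on a temporary file (hypothetical content, POSIX assumed).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, ".pgpass")
    with open(demo, "w") as fh:
        fh.write("*:*:*:repmgr:example\n")

    os.chmod(demo, 0o644)  # group and world can read
    loose = has_group_or_world_access(demo)

    os.chmod(demo, 0o600)  # owner read/write only
    strict = has_group_or_world_access(demo)

print(loose, strict)
```

Inside the pod, the same kind of check could be pointed at the mounted passfile path to confirm that the mode really ended up as 0600.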
Hi @Mauraza, I tested fixing the file permission warning, but the postgres pod still cannot connect, and the restart error is still the same.
Besides, it seems that defaultMode in volume.secret causes errors when combined with securityContext.fsGroup, so I had to remove fsGroup first and then set the volume's defaultMode.
Hi @FawenYo,
Could you share the values you are using to deploy postgresql-ha? I don't understand why you are using an extra volume when this exists: https://github.com/bitnami/charts/blob/66cc6d9e9f2daf8cb3c8cd7dbd420ed0ffe2f907/bitnami/postgresql-ha/templates/postgresql/statefulset.yaml#L334-L338
Hi @Mauraza, you can reproduce the error with:
kubectl create secret generic repgmr-password --from-literal=.pgpass="*:*:*:repmgr:{PASSWORD}"
postgresql:
  password: {PASSWORD}
  extraVolumes:
    - name: repmgr-passfile
      secret:
        secretName: "repgmr-password"
        items:
          - key: ".pgpass"
            path: ".pgpass"
        defaultMode: 384
  extraVolumeMounts:
    - name: "repmgr-passfile"
      mountPath: "/opt/bitnami/repmgr/secrets"
  repmgrUsePassfile: true
  repmgrPassfilePath: "/opt/bitnami/repmgr/secrets/.pgpass"
pgpool:
  adminPassword: {PASSWORD}
The reason for using extraVolumeMounts is that I first need to expose the Kubernetes secret as a volume via extraVolumes, then mount it with extraVolumeMounts, and then set repmgrPassfilePath. If you have successfully set the repmgr password from a file in your environment, please feel free to share your values, thanks.
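A note on the defaultMode: 384 value used in these values: it is the decimal spelling of the octal mode 0600 that the permission warning asks for. The sketch below only illustrates that conversion; the helper names are hypothetical and are not part of the chart or of Kubernetes.

```python
def octal_mode_to_decimal(octal_text: str) -> int:
    """Convert an octal permission string such as '0600' to its decimal value."""
    return int(octal_text, 8)


def decimal_to_octal_mode(value: int) -> str:
    """Render a decimal mode as a four-digit octal string."""
    return format(value, "04o")


# 0600 (owner read/write only) is 6 * 64 = 384 in decimal.
pgpass_mode = octal_mode_to_decimal("0600")
print(pgpass_mode)

# Going the other way, 384 prints as 0600 and 420 as 0644.
print(decimal_to_octal_mode(384), decimal_to_octal_mode(420))
```

Writing the mode in decimal avoids any ambiguity about how a leading zero is parsed when the values pass through different tools.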
Hi @FawenYo,
Could you change the location of the file? I think you are overwriting the REPMGR_PASSWORD_FILE.
Hi @Mauraza
I tried the following values:
postgresql:
  extraVolumeMounts:
    - name: "repmgr-passfile"
      mountPath: "/opt/bitnami/repmgr/conf"
  repmgrPassfilePath: "/opt/bitnami/repmgr/conf/.pgpass"
(Here I only show the values modified from my previously provided values.yaml.)
But the following error still shows up in the pod log:
postgresql-repmgr 01:23:14.12 INFO ==> Preparing repmgr configuration...
/opt/bitnami/scripts/librepmgr.sh: line 489: /opt/bitnami/repmgr/conf/repmgr.conf.tmp: Read-only file system
And if I add subPath: ".pgpass" to extraVolumeMounts, the pod cannot even start, failing with the error message:
Error: failed to start container "postgresql": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/var/lib/kubelet/pods/930abc01-d14e-4d4b-9e06-74f93be0b0e2/volume-subpaths/repmgr-passfile/postgresql/3" to rootfs at "/opt/bitnami/repmgr/conf": mount /var/lib/kubelet/pods/930abc01-d14e-4d4b-9e06-74f93be0b0e2/volume-subpaths/repmgr-passfile/postgresql/3:/opt/bitnami/repmgr/conf (via /proc/self/fd/6), flags: 0x5001: not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
I think you can try to reproduce the error in your environment with the steps I provided previously (https://github.com/bitnami/charts/issues/16900#issuecomment-1569357205); that would help us address the bug more efficiently, thanks.
Hi @FawenYo
I created a task to try to find a solution. We will update the thread when we have more information.
Hi @Mauraza, any updates? I also tried the values file below:
postgresql:
  password: {PASSWORD}
  podSecurityContext:
    enabled: false
  extraVolumes:
    - name: repmgr-passfile
      secret:
        secretName: "repgmr-password"
        items:
          - key: ".pgpass"
            path: ".pgpass"
        defaultMode: 384
  extraVolumeMounts:
    - name: "repmgr-passfile"
      mountPath: "/etc/secrets"
  repmgrPassword: {PASSWORD}
  repmgrUsePassfile: true
  repmgrPassfilePath: "/etc/secrets/.pgpass"
  repmgrLogLevel: DEBUG
  pgHbaTrustAll: true
pgpool:
  adminPassword: {PASSWORD}
and I got the following error message
postgresql-repmgr 01:37:37.71 INFO ==> ** Starting repmgrd **
[2023-06-16 01:37:37] [NOTICE] repmgrd (repmgrd 5.3.3) starting up
[2023-06-16 01:37:37] [INFO] connecting to database "user=repmgr passfile=/etc/secrets/.pgpass host=postgres-postgresql-ha-postgresql-0.postgres-postgresql-ha-postgresql-headless.test.svc.cluster.local dbname=repmgr port=5432 connect_timeout=5"
[2023-06-16 01:37:37] [DEBUG] connecting to: "user=repmgr passfile=/etc/secrets/.pgpass connect_timeout=5 dbname=repmgr host=postgres-postgresql-ha-postgresql-0.postgres-postgresql-ha-postgresql-headless.test.svc.cluster.local port=5432 fallback_application_name=repmgr options=-csearch_path="
[2023-06-16 01:37:37] [ERROR] repmgr extension not found on this node
[2023-06-16 01:37:37] [DETAIL] repmgr extension is available but not installed in database "repmgr"
[2023-06-16 01:37:37] [HINT] check that this node is part of a repmgr cluster
Although there are still some errors, I think the connection error is now solved, right?
Hi @FawenYo
No error seems to appear. Is it working as expected? There is a task to investigate this error; when we have more information, we will update the thread.
Hi @Mauraza, the log shows
[2023-06-16 01:37:37] [ERROR] repmgr extension not found on this node
so there are still some errors.
Hi @FawenYo ,
I was not able to reproduce the issue using your latest values in my cluster:
$ k logs postgresql-ha-postgresql-0
postgresql-repmgr 07:43:17.20 INFO ==>
postgresql-repmgr 07:43:17.20 INFO ==> Welcome to the Bitnami postgresql-repmgr container
postgresql-repmgr 07:43:17.20 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
postgresql-repmgr 07:43:17.21 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
postgresql-repmgr 07:43:17.21 INFO ==>
postgresql-repmgr 07:43:17.22 INFO ==> ** Starting PostgreSQL with Replication Manager setup **
postgresql-repmgr 07:43:17.24 INFO ==> Validating settings in REPMGR_* env vars...
postgresql-repmgr 07:43:17.24 INFO ==> Validating settings in POSTGRESQL_* env vars..
postgresql-repmgr 07:43:17.25 INFO ==> Querying all partner nodes for common upstream node...
postgresql-repmgr 07:43:17.28 INFO ==> There are no nodes with primary role. Assuming the primary role...
postgresql-repmgr 07:43:17.29 INFO ==> Preparing PostgreSQL configuration...
postgresql-repmgr 07:43:17.29 INFO ==> postgresql.conf file not detected. Generating it...
postgresql-repmgr 07:43:17.39 INFO ==> Preparing repmgr configuration...
postgresql-repmgr 07:43:17.40 INFO ==> Initializing Repmgr...
postgresql-repmgr 07:43:17.41 INFO ==> Initializing PostgreSQL database...
postgresql-repmgr 07:43:17.41 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/postgresql.conf detected
postgresql-repmgr 07:43:17.41 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/pg_hba.conf detected
postgresql-repmgr 07:43:18.04 INFO ==> Starting PostgreSQL in background...
postgresql-repmgr 07:43:18.16 INFO ==> Changing password of postgres
postgresql-repmgr 07:43:18.18 INFO ==> Creating replication user repmgr
postgresql-repmgr 07:43:18.19 INFO ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped
postgresql-repmgr 07:43:18.32 INFO ==> Configuring replication parameters
postgresql-repmgr 07:43:18.34 INFO ==> Configuring fsync
postgresql-repmgr 07:43:18.35 INFO ==> Starting PostgreSQL in background...
postgresql-repmgr 07:43:18.47 INFO ==> Creating repmgr user: repmgr
postgresql-repmgr 07:43:18.50 INFO ==> Creating repmgr database: repmgr
postgresql-repmgr 07:43:18.54 INFO ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped
postgresql-repmgr 07:43:18.64 INFO ==> Starting PostgreSQL in background...
postgresql-repmgr 07:43:18.77 INFO ==> Registering Primary...
postgresql-repmgr 07:43:18.82 INFO ==> Loading custom scripts...
postgresql-repmgr 07:43:18.83 INFO ==> Configuring synchronous_replication
postgresql-repmgr 07:43:18.84 INFO ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped
postgresql-repmgr 07:43:18.95 INFO ==> ** PostgreSQL with Replication Manager setup finished! **
postgresql-repmgr 07:43:18.96 INFO ==> Starting PostgreSQL in background...
waiting for server to start....2023-11-20 07:43:18.985 GMT [290] LOG: pgaudit extension initialized
2023-11-20 07:43:18.991 GMT [290] LOG: redirecting log output to logging collector process
2023-11-20 07:43:18.991 GMT [290] HINT: Future log output will appear in directory "/opt/bitnami/postgresql/logs".
2023-11-20 07:43:18.991 GMT [290] LOG: starting PostgreSQL 16.1 on aarch64-unknown-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2023-11-20 07:43:18.991 GMT [290] LOG: listening on IPv4 address "0.0.0.0", port 5432
2023-11-20 07:43:18.991 GMT [290] LOG: listening on IPv6 address "::", port 5432
2023-11-20 07:43:18.992 GMT [290] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2023-11-20 07:43:18.996 GMT [294] LOG: database system was shut down at 2023-11-20 07:43:18 GMT
2023-11-20 07:43:18.999 GMT [290] LOG: database system is ready to accept connections
done
server started
postgresql-repmgr 07:43:19.09 INFO ==> ** Starting repmgrd **
[2023-11-20 07:43:19] [NOTICE] repmgrd (repmgrd 5.3.3) starting up
[2023-11-20 07:43:19] [INFO] connecting to database "user=repmgr passfile=/etc/secrets/.pgpass host=postgresql-ha-postgresql-0.postgresql-ha-postgresql-headless.default.svc.cluster.local dbname=repmgr port=5432 connect_timeout=5"
[2023-11-20 07:43:19] [DEBUG] connecting to: "user=repmgr passfile=/etc/secrets/.pgpass connect_timeout=5 dbname=repmgr host=postgresql-ha-postgresql-0.postgresql-ha-postgresql-headless.default.svc.cluster.local port=5432 fallback_application_name=repmgr options=-csearch_path="
[2023-11-20 07:43:19] [DEBUG] node id is 1000, upstream node id is -1
INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid
[2023-11-20 07:43:19] [NOTICE] starting monitoring of node "postgresql-ha-postgresql-0" (ID: 1000)
[2023-11-20 07:43:19] [INFO] "connection_check_type" set to "ping"
[2023-11-20 07:43:19] [INFO] executing notification command for event "repmgrd_start"
[2023-11-20 07:43:19] [DETAIL] command is:
/opt/bitnami/repmgr/events/router.sh 1000 repmgrd_start 1 "2023-11-20 07:43:19.107857+00" "monitoring cluster primary \"postgresql-ha-postgresql-0\" (ID: 1000)"
[2023-11-20 07:43:19] [NOTICE] monitoring cluster primary "postgresql-ha-postgresql-0" (ID: 1000)
2023-11-20 07:43:49.533 GMT [292] LOG: checkpoint starting: immediate force wait
2023-11-20 07:43:49.568 GMT [292] LOG: checkpoint complete: wrote 52 buffers (0.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.002 s, total=0.035 s; sync files=17, longest=0.002 s, average=0.001 s; distance=16384 kB, estimate=16384 kB; lsn=0/5000060, redo lsn=0/5000028
2023-11-20 07:43:49.965 GMT [292] LOG: checkpoint starting: immediate force wait
2023-11-20 07:43:50.003 GMT [292] LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.001 s, total=0.038 s; sync files=2, longest=0.001 s, average=0.001 s; distance=32768 kB, estimate=32768 kB; lsn=0/7000060, redo lsn=0/7000028
[2023-11-20 07:43:55] [DEBUG] child node: 1002; attached: yes
[2023-11-20 07:43:55] [DEBUG] child node: 1001; attached: yes
[2023-11-20 07:43:55] [NOTICE] new standby "postgresql-ha-postgresql-2" (ID: 1002) has connected
[2023-11-20 07:43:55] [INFO] executing notification command for event "child_node_new_connect"
[2023-11-20 07:43:55] [DETAIL] command is:
/opt/bitnami/repmgr/events/router.sh 1000 child_node_new_connect 1 "2023-11-20 07:43:55.379453+00" "new standby \"postgresql-ha-postgresql-2\" (ID: 1002) has connected"
[2023-11-20 07:43:55] [NOTICE] new standby "postgresql-ha-postgresql-1" (ID: 1001) has connected
[2023-11-20 07:43:55] [INFO] executing notification command for event "child_node_new_connect"
[2023-11-20 07:43:55] [DETAIL] command is:
/opt/bitnami/repmgr/events/router.sh 1000 child_node_new_connect 1 "2023-11-20 07:43:55.406327+00" "new standby \"postgresql-ha-postgresql-1\" (ID: 1001) has connected"
[2023-11-20 07:44:01] [DEBUG] child node: 1002; attached: yes
[2023-11-20 07:44:01] [DEBUG] child node: 1001; attached: yes
Also, the previous values with pgHbaTrustAll: true worked for me as well:
postgresql:
  password: {PASSWORD}
  extraVolumes:
    - name: repmgr-passfile
      secret:
        secretName: "repgmr-password"
        items:
          - key: ".pgpass"
            path: ".pgpass"
        defaultMode: 384
  extraVolumeMounts:
    - name: "repmgr-passfile"
      mountPath: "/opt/bitnami/repmgr/secrets"
  repmgrUsePassfile: true
  repmgrPassfilePath: "/opt/bitnami/repmgr/secrets/.pgpass"
  pgHbaTrustAll: true
pgpool:
  adminPassword: {PASSWORD}
The error may be related to this previous case.
I hope it helps
Name and Version
bitnami/postgresql-ha 9.0.13
What architecture are you using?
amd64
What steps will reproduce the bug?
*:*:*:repmgr:{REPMGR_PASSWORD} in Kubernetes secret
Are you using any custom parameters or values?
What do you see instead?
Additional information
Although the pod log shows "no password supplied", when I entered the container and ran cat on the /opt/bitnami/repmgr/secrets/.pgpass file, I did see the content of the file and everything seemed okay. Also, NOTES.txt in the Helm template requires setting the postgresql.repmgrPassword value. But I think that if we set the connection info via a passfile, we don't really need this.
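For reference, the .pgpass entry used in this report follows libpq's hostname:port:database:username:password layout, where `*` in any of the first four fields matches anything, and a literal `:` or `\` inside a field has to be escaped with a backslash. The sketch below builds such a line; the function names are hypothetical and the passwords are made-up examples, so treat it as an illustration, not as what the chart does internally.

```python
def escape_pgpass_field(value: str) -> str:
    """Escape backslashes and colons so they are read literally in .pgpass."""
    return value.replace("\\", "\\\\").replace(":", "\\:")


def pgpass_line(username: str, password: str,
                host: str = "*", port: str = "*", database: str = "*") -> str:
    """Build one hostname:port:database:username:password entry."""
    fields = [host, port, database, username, password]
    return ":".join(escape_pgpass_field(field) for field in fields)


# A wildcard entry like the one in this report (example password).
simple = pgpass_line("repmgr", "s3cret")
print(simple)

# A password containing a colon must be escaped in the file.
tricky = pgpass_line("repmgr", "pa:ss")
print(tricky)
```

If the real password contains `:` or `\`, an unescaped entry would be split at the wrong place, which is one thing worth ruling out when a passfile appears to be readable but authentication still fails.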