apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG]pg upgrade hang #7669

Closed ahjing99 closed 3 months ago

ahjing99 commented 3 months ago

➜ ~ kbcli version
Kubernetes: v1.29.4-gke.1043002
KubeBlocks: 0.9.0-beta.41
kbcli: 0.9.0-beta.1

1. Create a cluster and run some ops

   `kbcli cluster create postgres-sdlscz --termination-policy=Delete --cluster-definition=postgresql --enable-all-logs=false --cluster-version=postgresql-12.14.1 --set cpu=100m,memory=0.5Gi,replicas=2,storage=3Gi --namespace default`

Cluster postgres-sdlscz created

   `kbcli cluster expose postgres-sdlscz --auto-approve --force=true --type internet --enable true --components postgresql --namespace default`

OpsRequest postgres-sdlscz-expose-hpl6x created successfully, you can view the progress: kbcli cluster describe-ops postgres-sdlscz-expose-hpl6x -n default

   `kbcli cluster vscale postgres-sdlscz --auto-approve --force=true --components postgresql --cpu 200m --memory 0.6Gi --namespace default`

OpsRequest postgres-sdlscz-verticalscaling-xhxk6 created successfully, you can view the progress: kbcli cluster describe-ops postgres-sdlscz-verticalscaling-xhxk6 -n default

2. Upgrade the cluster; the OpsRequest stays in Running status without making any progress

   `kbcli cluster upgrade postgres-sdlscz --auto-approve --force=true --cluster-version=postgresql-12.15.0 --namespace default`

OpsRequest postgres-sdlscz-upgrade-r4sfb created successfully, you can view the progress: kbcli cluster describe-ops postgres-sdlscz-upgrade-r4sfb -n default

➜ ~ kbcli cluster describe-ops postgres-sdlscz-upgrade-r4sfb -n default
Spec:
  Name: postgres-sdlscz-upgrade-r4sfb  NameSpace: default  Cluster: postgres-sdlscz  Type: Upgrade

Command:
  kbcli cluster upgrade postgres-sdlscz --cluster-version=0x14001289350 --namespace=default

Last Configuration:
  Cluster Version: postgresql-12.14.1

Status:
  Start Time:  Jun 28,2024 14:58 UTC+0800
  Duration:    9m8s
  Status:      Running
  Progress:    0/2
               OBJECT-KEY                         STATUS    DURATION   MESSAGE
               Pod/postgres-sdlscz-postgresql-1   Pending
               Pod/postgres-sdlscz-postgresql-0   Pending

Conditions:
LAST-TRANSITION-TIME         TYPE                 REASON                     STATUS   MESSAGE
Jun 28,2024 14:58 UTC+0800   WaitForProgressing   WaitForProgressing         True     wait for the controller to process the OpsRequest: postgres-sdlscz-upgrade-r4sfb in Cluster: postgres-sdlscz
Jun 28,2024 14:58 UTC+0800   Validated            ValidateOpsRequestPassed   True     OpsRequest: postgres-sdlscz-upgrade-r4sfb is validated
Jun 28,2024 14:58 UTC+0800   HorizontalScaling    HorizontalScalingStarted   True     Start to horizontal scale replicas in Cluster: postgres-sdlscz

Warning Events:
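The stalled state in the `describe-ops` output can also be checked from saved output rather than by eye. The following is a minimal sketch and not part of kbcli: the function names `ops_progress` and `ops_is_complete` are hypothetical, and the only assumption is that the text contains a `Progress: <done>/<total>` field, as it does in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the first "done/total" value that follows
# "Progress:" in `kbcli cluster describe-ops` output read from stdin.
ops_progress() {
    sed -n 's|.*Progress:[[:space:]]*\([0-9][0-9]*/[0-9][0-9]*\).*|\1|p' | head -n 1
}

# Hypothetical helper: succeed only when a Progress value is present
# and its "done" count equals its "total" count.
ops_is_complete() {
    progress=$(ops_progress)
    [ -n "$progress" ] && [ "${progress%/*}" = "${progress#*/}" ]
}
```

Piping the `describe-ops` text from this report into `ops_progress` should print `0/2`, and `ops_is_complete` would return a non-zero exit status for it.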

➜ ~ k get pod | grep postgres-sdlscz
postgres-sdlscz-postgresql-0   4/4   Running   1 (10m ago)   13m
postgres-sdlscz-postgresql-1   4/4   Running   1 (10m ago)   11m

➜ ~ k logs postgres-sdlscz-postgresql-0
Defaulted container "postgresql" out of: postgresql, pgbouncer, lorry, config-manager, pg-init-container (init)
2024-06-28 06:59:32,567 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-06-28 06:59:32,578 - bootstrapping - INFO - Looks like you are running google
2024-06-28 06:59:32,682 - bootstrapping - INFO - kubeblocks generate local configuration:
bootstrap:
  dcs:
    check_timeline: true
    loop_wait: 10
    max_timelines_history: 0
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        autovacuum_analyze_scale_factor: '0.1'
        autovacuum_max_workers: '3'
        autovacuum_vacuum_scale_factor: '0.05'
        checkpoint_completion_target: '0.9'
        log_autovacuum_min_duration: '10000'
        log_checkpoints: 'True'
        log_connections: 'False'
        log_disconnections: 'False'
        log_min_duration_statement: '1000'
        log_statement: ddl
        log_temp_files: 128kB
        max_connections: '56'
        max_locks_per_transaction: '64'
        max_prepared_transactions: '100'
        max_replication_slots: '16'
        max_wal_senders: '64'
        max_worker_processes: '8'
        tcp_keepalives_idle: 45s
        tcp_keepalives_interval: 10s
        track_commit_timestamp: 'False'
        track_functions: pl
        wal_compression: 'True'
        wal_keep_segments: '0'
        wal_level: replica
        wal_log_hints: 'False'
    retry_timeout: 10
    ttl: 30
  initdb:

2024-06-28 06:59:32,775 - bootstrapping - INFO - Configuring pgqd
2024-06-28 06:59:32,776 - bootstrapping - INFO - Configuring wal-e
2024-06-28 06:59:32,776 - bootstrapping - INFO - Configuring pam-oauth2
2024-06-28 06:59:32,776 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2024-06-28 06:59:32,776 - bootstrapping - INFO - Configuring bootstrap
2024-06-28 06:59:32,776 - bootstrapping - INFO - Configuring pgbouncer
2024-06-28 06:59:32,776 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2024-06-28 06:59:32,776 - bootstrapping - INFO - Configuring standby-cluster
2024-06-28 06:59:32,776 - bootstrapping - INFO - Configuring patroni
2024-06-28 06:59:32,868 - bootstrapping - INFO - Writing to file /run/postgres.yml
2024-06-28 06:59:32,868 - bootstrapping - INFO - Configuring crontab
2024-06-28 06:59:32,869 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2024-06-28 06:59:32,869 - bootstrapping - INFO - Configuring log
2024-06-28 06:59:32,869 - bootstrapping - INFO - Configuring certificate
2024-06-28 06:59:32,869 - bootstrapping - INFO - Generating ssl self-signed certificate
2024-06-28 06:59:36,180 INFO: Selected new K8s API server endpoint https://10.128.0.17:443
2024-06-28 06:59:36,287 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-06-28 06:59:36,365 WARNING: Postgresql is not running.
2024-06-28 06:59:36,366 INFO: Lock owner: None; I am postgres-sdlscz-postgresql-0 2024-06-28 06:59:36,368 INFO: pg_controldata: pg_control version number: 1201 Catalog version number: 201909212 Database system identifier: 7385443084509761647 Database cluster state: shut down pg_control last modified: Fri Jun 28 06:58:23 2024 Latest checkpoint location: 0/5000028 Latest checkpoint's REDO location: 0/5000028 Latest checkpoint's REDO WAL file: 000000020000000000000005 Latest checkpoint's TimeLineID: 2 Latest checkpoint's PrevTimeLineID: 2 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0:944 Latest checkpoint's NextOID: 16825 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 480 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 0 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Latest checkpoint's oldestCommitTsXid: 0 Latest checkpoint's newestCommitTsXid: 0 Time of latest checkpoint: Fri Jun 28 06:58:23 2024 Fake LSN counter for unlogged rels: 0/3E8 Minimum recovery ending location: 0/0 Min recovery ending loc's timeline: 0 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no wal_level setting: replica wal_log_hints setting: on max_connections setting: 56 max_worker_processes setting: 8 max_wal_senders setting: 64 max_prepared_xacts setting: 100 max_locks_per_xact setting: 64 track_commit_timestamp setting: off Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value Data page checksum version: 0 Mock authentication nonce: 
5bfef3f847b97d7772d89f269a11f9174ca8d4acd586b7d3cbf310f021db65c0

2024-06-28 06:59:36,376 INFO: Lock owner: None; I am postgres-sdlscz-postgresql-0
2024-06-28 06:59:36,502 INFO: starting as a secondary
2024-06-28 06:59:37 GMT [113]: [1-1] 667e5f59.71 0 LOG: Auto detecting pg_stat_kcache.linux_hz parameter...
2024-06-28 06:59:37 GMT [113]: [2-1] 667e5f59.71 0 LOG: pg_stat_kcache.linux_hz is set to 500000
2024-06-28 06:59:37 GMT [113]: [3-1] 667e5f59.71 0 LOG: pgaudit extension initialized
2024-06-28 06:59:37 GMT [113]: [4-1] 667e5f59.71 0 LOG: starting PostgreSQL 12.18 (Ubuntu 12.18-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-06-28 06:59:37 GMT [113]: [5-1] 667e5f59.71 0 LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-06-28 06:59:37 GMT [113]: [6-1] 667e5f59.71 0 LOG: listening on IPv6 address "::", port 5432
2024-06-28 06:59:37 GMT [113]: [7-1] 667e5f59.71 0 LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-06-28 06:59:38,167 INFO: postmaster pid=113
2024-06-28 06:59:38 GMT [113]: [8-1] 667e5f59.71 0 LOG: redirecting log output to logging collector process
2024-06-28 06:59:38 GMT [113]: [9-1] 667e5f59.71 0 HINT: Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - accepting connections
2024-06-28 06:59:38,573 INFO: establishing a new patroni connection to the postgres cluster
2024-06-28 06:59:38,770 WARNING: Request failed to postgres-sdlscz-postgresql-1: GET http://10.28.2.245:8008/patroni (HTTPConnectionPool(host='10.28.2.245', port=8008): Max retries exceeded with url: /patroni (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7805604a9d20>: Failed to establish a new connection: [Errno 111] Connection refused')))
2024-06-28 06:59:38,866 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2024-06-28 06:59:39,271 INFO: promoted self to leader by acquiring session lock
server promoting
2024-06-28 06:59:39,366 INFO: cleared rewind state after becoming the leader
2024-06-28 06:59:39,273 INFO: Lock owner: postgres-sdlscz-postgresql-0; I am postgres-sdlscz-postgresql-0
2024-06-28 06:59:40,468 INFO: updated leader lock during promote
2024-06-28 06:59:42,070 INFO: no action.
I am (postgres-sdlscz-postgresql-0), the leader with the lock DO NOTICE: role "admin" is already a member of role "cron_admin" GRANT ROLE DO DO NOTICE: extension "pg_auth_mon" already exists, skipping CREATE EXTENSION NOTICE: version "1.1" of extension "pg_auth_mon" is already installed ALTER EXTENSION GRANT NOTICE: extension "pg_cron" already exists, skipping CREATE EXTENSION DO NOTICE: version "1.5" of extension "pg_cron" is already installed ALTER EXTENSION ALTER POLICY REVOKE GRANT REVOKE GRANT ALTER POLICY REVOKE GRANT CREATE FUNCTION REVOKE GRANT REVOKE GRANT NOTICE: extension "file_fdw" already exists, skipping CREATE EXTENSION DO NOTICE: relation "postgres_log" already exists, skipping CREATE TABLE GRANT NOTICE: relation "postgres_log_0" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_1" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_2" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_3" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_4" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_5" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_6" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_7" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT RESET SET NOTICE: drop cascades to 5 other objects DETAIL: drop cascades to type zmon_utils.system_information drop cascades to function zmon_utils.get_database_cluster_information() drop cascades to function zmon_utils.get_database_cluster_system_information() drop cascades to function zmon_utils.get_last_status_active_cronjobs() drop cascades to view 
zmon_utils.last_status_active_cronjobs DROP SCHEMA NOTICE: extension "plpython3u" already exists, skipping DO NOTICE: language "plpythonu" does not exist, skipping DROP LANGUAGE NOTICE: function plpython_call_handler() does not exist, skipping DROP FUNCTION NOTICE: function plpython_inline_handler(internal) does not exist, skipping DROP FUNCTION NOTICE: function plpython_validator(oid) does not exist, skipping DROP FUNCTION CREATE SCHEMA GRANT SET CREATE TYPE CREATE FUNCTION CREATE FUNCTION CREATE FUNCTION REVOKE CREATE VIEW REVOKE GRANT GRANT You are now connected to database "postgres" as user "postgres". NOTICE: schema "user_management" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT GRANT RESET NOTICE: extension "pg_stat_statements" already exists, skipping CREATE EXTENSION NOTICE: extension "pg_stat_kcache" already exists, skipping CREATE EXTENSION NOTICE: extension "set_user" already exists, skipping CREATE EXTENSION NOTICE: version "3.0" of extension "set_user" is already installed ALTER EXTENSION GRANT GRANT GRANT NOTICE: schema "metric_helpers" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW REVOKE GRANT REVOKE GRANT RESET You are now connected to database "template1" as user "postgres". 
NOTICE: schema "user_management" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT GRANT RESET NOTICE: extension "pg_stat_statements" already exists, skipping CREATE EXTENSION NOTICE: extension "pg_stat_kcache" already exists, skipping CREATE EXTENSION NOTICE: extension "set_user" already exists, skipping CREATE EXTENSION NOTICE: version "3.0" of extension "set_user" is already installed ALTER EXTENSION GRANT GRANT GRANT NOTICE: schema "metric_helpers" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW REVOKE GRANT REVOKE GRANT RESET 2024-06-28 06:59:51,214 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:00:01,217 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:00:11,207 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:00:21,087 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:00:31,222 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:00:34.994 UTC [82] LOG Starting pgqd 3.5 2024-06-28 07:00:34.995 UTC [82] LOG auto-detecting dbs ... 2024-06-28 07:00:41,092 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:00:51,089 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:01:01,088 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:01:05.003 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:01:11,139 INFO: no action. 
I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:01:21,085 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:01:31,091 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:01:35.033 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:01:41,111 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:01:51,085 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:02:01,089 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:02:05.051 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:02:11,090 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:02:21,091 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:02:31,095 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:02:35.055 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:02:41,098 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:02:51,133 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:03:01,089 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:03:05.082 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:03:11,187 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:03:21,785 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:03:31,082 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:03:35.109 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:03:41,400 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:03:51,089 INFO: no action. 
I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:04:01,089 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:04:05.121 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:04:11,314 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:04:21,107 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:04:31,088 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:04:35.139 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:04:41,123 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:04:51,088 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:05:01,098 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:05:05.165 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:05:11,143 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:05:21,084 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:05:31,088 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:05:35.174 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:05:41,092 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:05:51,194 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:06:01,086 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:06:05.201 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:06:11,087 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:06:21,091 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:06:31,086 INFO: no action. 
I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:06:35.203 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:06:41,096 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:06:51,090 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:07:01,086 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:07:05.213 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:07:11,091 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:07:21,089 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:07:31,088 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:07:35.227 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:07:41,096 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:07:51,089 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:08:01,100 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:08:05.250 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:08:11,087 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:08:21,089 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:08:31,092 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:08:35.262 UTC [82] LOG {ticks: 0, maint: 0, retry: 0} 2024-06-28 07:08:41,202 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:08:51,084 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock 2024-06-28 07:09:01,110 INFO: no action. 
I am (postgres-sdlscz-postgresql-0), the leader with the lock

➜ ~ k logs postgres-sdlscz-postgresql-0 --previous
Defaulted container "postgresql" out of: postgresql, pgbouncer, lorry, config-manager, pg-init-container (init)
2024-06-28 06:56:25,070 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-06-28 06:56:25,079 - bootstrapping - INFO - Looks like you are running google
2024-06-28 06:56:25,182 - bootstrapping - INFO - kubeblocks generate local configuration:
bootstrap:
  dcs:
    check_timeline: true
    loop_wait: 10
    max_timelines_history: 0
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        archive_command: /bin/true
        archive_mode: 'on'
        autovacuum_analyze_scale_factor: '0.1'
        autovacuum_max_workers: '3'
        autovacuum_vacuum_scale_factor: '0.05'
        checkpoint_completion_target: '0.9'
        log_autovacuum_min_duration: '10000'
        log_checkpoints: 'True'
        log_connections: 'False'
        log_disconnections: 'False'
        log_min_duration_statement: '1000'
        log_statement: ddl
        log_temp_files: 128kB
        max_connections: '56'
        max_locks_per_transaction: '64'
        max_prepared_transactions: '100'
        max_replication_slots: '16'
        max_wal_senders: '64'
        max_worker_processes: '8'
        tcp_keepalives_idle: 45s
        tcp_keepalives_interval: 10s
        track_commit_timestamp: 'False'
        track_functions: pl
        wal_compression: 'True'
        wal_keep_segments: '0'
        wal_level: replica
        wal_log_hints: 'False'
    retry_timeout: 10
    ttl: 30
  initdb:

2024-06-28 06:56:25,275 - bootstrapping - INFO - Configuring pgqd
2024-06-28 06:56:25,276 - bootstrapping - INFO - Configuring crontab
2024-06-28 06:56:25,276 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2024-06-28 06:56:25,276 - bootstrapping - INFO - Configuring bootstrap
2024-06-28 06:56:25,277 - bootstrapping - INFO - Configuring pam-oauth2
2024-06-28 06:56:25,277 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2024-06-28 06:56:25,277 - bootstrapping - INFO - Configuring standby-cluster
2024-06-28 06:56:25,277 - bootstrapping - INFO - Configuring certificate
2024-06-28 06:56:25,277 - bootstrapping - INFO - Generating ssl self-signed certificate
2024-06-28 06:56:26,178 - bootstrapping - INFO - Configuring pgbouncer
2024-06-28 06:56:26,178 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2024-06-28 06:56:26,178 - bootstrapping - INFO - Configuring patroni
2024-06-28 06:56:26,268 - bootstrapping - INFO - Writing to file /run/postgres.yml
2024-06-28 06:56:26,269 - bootstrapping - INFO - Configuring log
2024-06-28 06:56:26,269 - bootstrapping - INFO - Configuring wal-e
2024-06-28 06:56:28,690 INFO: Selected new K8s API server endpoint https://10.128.0.17:443
2024-06-28 06:56:28,872 INFO: No PostgreSQL configuration items changed, nothing to reload.
2024-06-28 06:56:28,879 WARNING: Postgresql is not running.
2024-06-28 06:56:28,880 INFO: Lock owner: postgres-sdlscz-postgresql-1; I am postgres-sdlscz-postgresql-0 2024-06-28 06:56:28,881 INFO: pg_controldata: pg_control version number: 1201 Catalog version number: 201909212 Database system identifier: 7385443084509761647 Database cluster state: shut down in recovery pg_control last modified: Fri Jun 28 06:55:16 2024 Latest checkpoint location: 0/20053C8 Latest checkpoint's REDO location: 0/2000028 Latest checkpoint's REDO WAL file: 000000010000000000000002 Latest checkpoint's TimeLineID: 1 Latest checkpoint's PrevTimeLineID: 1 Latest checkpoint's full_page_writes: on Latest checkpoint's NextXID: 0:675 Latest checkpoint's NextOID: 24576 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 480 Latest checkpoint's oldestXID's DB: 1 Latest checkpoint's oldestActiveXID: 675 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 1 Latest checkpoint's oldestCommitTsXid: 0 Latest checkpoint's newestCommitTsXid: 0 Time of latest checkpoint: Fri Jun 28 06:51:36 2024 Fake LSN counter for unlogged rels: 0/3E8 Minimum recovery ending location: 0/304A398 Min recovery ending loc's timeline: 1 Backup start location: 0/0 Backup end location: 0/0 End-of-backup record required: no wal_level setting: replica wal_log_hints setting: on max_connections setting: 56 max_worker_processes setting: 8 max_wal_senders setting: 64 max_prepared_xacts setting: 100 max_locks_per_xact setting: 64 track_commit_timestamp setting: off Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value Data page checksum version: 0 Mock authentication nonce: 
5bfef3f847b97d7772d89f269a11f9174ca8d4acd586b7d3cbf310f021db65c0

2024-06-28 06:56:28,882 INFO: Lock owner: postgres-sdlscz-postgresql-1; I am postgres-sdlscz-postgresql-0
2024-06-28 06:56:28,973 INFO: Local timeline=1 lsn=0/304A398
2024-06-28 06:56:29,138 INFO: primary_timeline=1
2024-06-28 06:56:29,138 INFO: Lock owner: postgres-sdlscz-postgresql-1; I am postgres-sdlscz-postgresql-0
2024-06-28 06:56:29,452 INFO: starting as a secondary
2024-06-28 06:56:30 GMT [104]: [1-1] 667e5e9e.68 0 LOG: Auto detecting pg_stat_kcache.linux_hz parameter...
2024-06-28 06:56:30 GMT [104]: [2-1] 667e5e9e.68 0 LOG: pg_stat_kcache.linux_hz is set to 500000
2024-06-28 06:56:30 GMT [104]: [3-1] 667e5e9e.68 0 LOG: pgaudit extension initialized
2024-06-28 06:56:30 GMT [104]: [4-1] 667e5e9e.68 0 LOG: starting PostgreSQL 12.18 (Ubuntu 12.18-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2024-06-28 06:56:30 GMT [104]: [5-1] 667e5e9e.68 0 LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-06-28 06:56:30 GMT [104]: [6-1] 667e5e9e.68 0 LOG: listening on IPv6 address "::", port 5432
2024-06-28 06:56:30,876 INFO: postmaster pid=104
2024-06-28 06:56:30 GMT [104]: [7-1] 667e5e9e.68 0 LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-06-28 06:56:31 GMT [104]: [8-1] 667e5e9e.68 0 LOG: redirecting log output to logging collector process
2024-06-28 06:56:31 GMT [104]: [9-1] 667e5e9e.68 0 HINT: Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - accepting connections
2024-06-28 06:56:32,379 INFO: Lock owner: postgres-sdlscz-postgresql-1; I am postgres-sdlscz-postgresql-0
2024-06-28 06:56:32,379 INFO: establishing a new patroni connection to the postgres cluster
2024-06-28 06:56:32,671 INFO: no action. I am (postgres-sdlscz-postgresql-0), a secondary, and following a leader (postgres-sdlscz-postgresql-1)
2024-06-28 06:56:34,340 WARNING: Request failed to postgres-sdlscz-postgresql-1: GET http://10.28.2.244:8008/patroni (HTTPConnectionPool(host='10.28.2.244', port=8008): Max retries exceeded with url: /patroni (Caused by ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))))
2024-06-28 06:56:34,352 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2024-06-28 06:56:34,567 INFO: promoted self to leader by acquiring session lock
server promoting
2024-06-28 06:56:34,578 INFO: cleared rewind state after becoming the leader
2024-06-28 06:56:34,570 INFO: Lock owner: postgres-sdlscz-postgresql-0; I am postgres-sdlscz-postgresql-0
2024-06-28 06:56:34,870 INFO: updated leader lock during promote
2024-06-28 06:56:36,668 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:56:37,265 INFO: no action.
I am (postgres-sdlscz-postgresql-0), the leader with the lock DO NOTICE: role "admin" is already a member of role "cron_admin" GRANT ROLE DO DO NOTICE: extension "pg_auth_mon" already exists, skipping CREATE EXTENSION NOTICE: version "1.1" of extension "pg_auth_mon" is already installed ALTER EXTENSION GRANT NOTICE: extension "pg_cron" already exists, skipping CREATE EXTENSION DO NOTICE: version "1.5" of extension "pg_cron" is already installed ALTER EXTENSION ALTER POLICY REVOKE GRANT REVOKE GRANT ALTER POLICY REVOKE GRANT CREATE FUNCTION REVOKE GRANT REVOKE GRANT NOTICE: extension "file_fdw" already exists, skipping CREATE EXTENSION DO NOTICE: relation "postgres_log" already exists, skipping CREATE TABLE GRANT NOTICE: relation "postgres_log_0" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_1" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_2" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_3" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_4" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_5" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_6" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT NOTICE: relation "postgres_log_7" already exists, skipping CREATE FOREIGN TABLE GRANT CREATE VIEW ALTER VIEW GRANT RESET SET NOTICE: drop cascades to 5 other objects DETAIL: drop cascades to type zmon_utils.system_information drop cascades to function zmon_utils.get_database_cluster_information() drop cascades to function zmon_utils.get_database_cluster_system_information() drop cascades to function zmon_utils.get_last_status_active_cronjobs() drop cascades to view 
zmon_utils.last_status_active_cronjobs DROP SCHEMA NOTICE: extension "plpython3u" already exists, skipping DO NOTICE: language "plpythonu" does not exist, skipping DROP LANGUAGE NOTICE: function plpython_call_handler() does not exist, skipping DROP FUNCTION NOTICE: function plpython_inline_handler(internal) does not exist, skipping DROP FUNCTION NOTICE: function plpython_validator(oid) does not exist, skipping DROP FUNCTION CREATE SCHEMA GRANT SET CREATE TYPE CREATE FUNCTION CREATE FUNCTION CREATE FUNCTION REVOKE CREATE VIEW REVOKE GRANT GRANT You are now connected to database "postgres" as user "postgres". NOTICE: schema "user_management" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT GRANT RESET NOTICE: extension "pg_stat_statements" already exists, skipping CREATE EXTENSION NOTICE: extension "pg_stat_kcache" already exists, skipping CREATE EXTENSION NOTICE: extension "set_user" already exists, skipping CREATE EXTENSION NOTICE: version "3.0" of extension "set_user" is already installed ALTER EXTENSION GRANT GRANT GRANT NOTICE: schema "metric_helpers" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW REVOKE GRANT REVOKE GRANT RESET You are now connected to database "template1" as user "postgres". 
NOTICE: schema "user_management" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT CREATE FUNCTION REVOKE GRANT COMMENT GRANT RESET NOTICE: extension "pg_stat_statements" already exists, skipping CREATE EXTENSION NOTICE: extension "pg_stat_kcache" already exists, skipping CREATE EXTENSION NOTICE: extension "set_user" already exists, skipping CREATE EXTENSION NOTICE: version "3.0" of extension "set_user" is already installed ALTER EXTENSION GRANT GRANT GRANT NOTICE: schema "metric_helpers" already exists, skipping CREATE SCHEMA GRANT SET CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW CREATE FUNCTION CREATE VIEW REVOKE GRANT REVOKE GRANT RESET
2024-06-28 06:56:47,251 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:56:57,485 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:57:07,266 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:57:17,275 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:57:27,232 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:57:27.588 UTC [71] LOG Starting pgqd 3.5
2024-06-28 06:57:27.589 UTC [71] LOG auto-detecting dbs ...
2024-06-28 06:57:37,195 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:57:47,186 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:57:57,239 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:57:57.618 UTC [71] LOG {ticks: 0, maint: 0, retry: 0}
2024-06-28 06:58:07,190 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
2024-06-28 06:58:17,192 INFO: no action. I am (postgres-sdlscz-postgresql-0), the leader with the lock
/etc/runit/runsvdir/default/patroni: finished with code=0 signal=0
stopping /etc/runit/runsvdir/default/patroni
2024-06-28 06:58:27.633 UTC [71] LOG {ticks: 0, maint: 0, retry: 0}
2024-06-28 06:58:27.668 UTC [71] ERROR connection error: PQconnectStart
2024-06-28 06:58:27.668 UTC [71] ERROR libpq: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory Is the server running locally and accepting connections on that socket?
2024-06-28 06:58:27.671 UTC [71] ERROR connection error: PQconnectStart
2024-06-28 06:58:27.671 UTC [71] ERROR libpq: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory Is the server running locally and accepting connections on that socket?
2024-06-28 06:58:27.671 UTC [71] WARNING [postgres]: default timeout
timeout: finish: .: (pid 363) 124s, want down
ok: down: patroni: 0s, normally up
ok: down: /etc/service/patroni: 0s, normally up
2024-06-28 06:58:32.191 UTC [71] LOG Got SIGTERM, fast exit
ok: down: /etc/service/pgqd: 0s, normally up

ahjing99 commented 3 months ago

Lowering the severity since the issue cannot be reproduced every time.