CrunchyData / postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
https://access.crunchydata.com/documentation/postgres-operator/v5/
Apache License 2.0
3.89k stars 587 forks

Primary & Replicas continuously restarting #3626

Closed vaigau6g closed 3 months ago

vaigau6g commented 1 year ago

Please ensure you do the following when reporting a bug:

Overview

Add a concise description of what the bug is.

Environment

Please provide the following details:

Steps to Reproduce

REPRO

Provide steps to get to the error condition:

  1. Run ...
  2. Do ...
  3. Try ...

EXPECTED

  1. Provide the behavior that you expected.

ACTUAL

  1. Describe what actually happens

Logs

Please provide appropriate log output or any configuration files that may help troubleshoot the issue. DO NOT include sensitive information, such as passwords.

Additional Information

Please provide any additional information that may be helpful.

vaigau6g commented 1 year ago

Exception in primary/replicas pods

2023-04-09 14:34:01,752 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:48184
2023-04-09 14:34:01,754 WARNING: Traceback (most recent call last):
  File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
    t.start()
  File "/usr/lib64/python3.6/threading.py", line 867, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
2023-04-09 14:34:01,755 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:48182
2023-04-09 14:34:01,755 WARNING: Traceback (most recent call last):
  File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
    t.start()
  File "/usr/lib64/python3.6/threading.py", line 867, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
2023-04-09 14:34:10,034 INFO: no action. I am (mdsp-psql-lpc-primary-8csb-0), the leader with the lock
2023-04-09 14:34:20,031 INFO: no action. I am (mdsp-psql-lpc-primary-8csb-0), the leader with the lock
2023-04-09 14:34:21,753 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:59482
2023-04-09 14:34:21,754 WARNING: Traceback (most recent call last):
  File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
    t.start()
  File "/usr/lib64/python3.6/threading.py", line 867, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
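
To gauge how often this happens and which client address triggers it, the log can be summarized with a short script. This is only a minimal sketch, assuming the pod log has been saved as plain text (for example with kubectl logs) with one entry per line. The names summarize and SAMPLE are hypothetical and not part of Patroni or PGO. Python raises this RuntimeError when the operating system refuses to create another thread, so the counts only show frequency, not the underlying limit that was hit.

```python
import re
from collections import Counter

# Matches the warning that precedes each traceback in the log above and
# splits the client address from the trailing source port.
REQUEST_RE = re.compile(
    r"WARNING: Exception happened during processing of request from (\S+):(\d+)"
)
THREAD_ERROR = "RuntimeError: can't start new thread"


def summarize(log_text: str) -> dict:
    """Count failed requests per client address and thread-start failures."""
    clients = Counter(m.group(1) for m in REQUEST_RE.finditer(log_text))
    return {
        "failed_requests": sum(clients.values()),
        "thread_errors": log_text.count(THREAD_ERROR),
        "clients": dict(clients),
    }


# Abbreviated sample taken from the log lines in this comment
# (traceback frames omitted for brevity).
SAMPLE = """\
2023-04-09 14:34:01,752 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:48184
RuntimeError: can't start new thread
2023-04-09 14:34:01,755 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:48182
RuntimeError: can't start new thread
2023-04-09 14:34:10,034 INFO: no action. I am (mdsp-psql-lpc-primary-8csb-0), the leader with the lock
2023-04-09 14:34:21,753 WARNING: Exception happened during processing of request from ::ffff:10.128.12.1:59482
RuntimeError: can't start new thread
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In the sample, all failed requests come from the same address on different source ports. That pattern would be consistent with a periodic probe against the Patroni REST API, though the log alone does not confirm it.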

vaigau6g commented 1 year ago

pg_hba configuration

patronictl show-config

loop_wait: 10
postgresql:
  parameters:
    archive_command: pgbackrest --stanza=db archive-push "%p"
    archive_mode: 'on'
    archive_timeout: 60s
    default_pool_size: '100'
    jit: 'off'
    max_connections: '2500'
    password_encryption: scram-sha-256
    restore_command: pgbackrest --stanza=db archive-get %f "%p"
    shared_buffers: 4GB
    shared_preload_libraries: pgaudit
    ssl: 'on'
    ssl_ca_file: /pgconf/tls/ca.crt
    ssl_cert_file: /pgconf/tls/tls.crt
    ssl_key_file: /pgconf/tls/tls.key
    synchronous_commit: 'on'
    unix_socket_directories: /tmp/postgres
    wal_level: logical
  pg_hba:

vaigau6g commented 1 year ago

The issue concerns application requests to PostgreSQL: the PostgreSQL pod logs the error below and restarts continuously.

Exception happened during processing of request from ::ffff:10.128.10.1:43576
2023-04-11 06:58:03,453 WARNING: Traceback (most recent call last):
  File "/usr/lib64/python3.6/socketserver.py", line 320, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 669, in process_request
    t.start()
  File "/usr/lib64/python3.6/threading.py", line 867, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

vaigau6g commented 1 year ago

The postgres user credentials work with the connection string.

vaigau6g commented 1 year ago

Could this be caused by restoring a database from the older version into the newer PGO cluster? Versions before the upgrade: PGO 4.7.3, PostgreSQL 13.

andrewlecuyer commented 3 months ago

Considering this issue is for an older version of CPK that is no longer actively maintained via the Crunchy Developer Program, I am proceeding with closing (see the Supported Platforms page for additional information about supported versions of CPK).

For information about upgrading from CPK v4 to v5, please see the upgrade guide:

https://access.crunchydata.com/documentation/postgres-operator/latest/upgrade/v4tov5

And if you still require support for CPK v4.7.3, I recommend reaching out to info@crunchydata.com to discuss your requirements/needs further.