canonical / mysql-k8s-operator

A Charmed Operator for running MySQL on Kubernetes
https://charmhub.io/mysql-k8s
Apache License 2.0

Single unit can't recover on node restart #436

Open paulomach opened 3 weeks ago

paulomach commented 3 weeks ago

Steps to reproduce

Restart the k8s node where mysql-k8s is running. The unit had been properly initialized before the restart.

Expected behavior

The unit triggers reboot-from-complete-outage and resumes operations.

Actual behavior

The recovery procedure fails.

Versions

charm r127

Log output

2024-06-10T15:41:01.754Z [container-agent] 2024-06-10 15:41:01 INFO juju-log Adding pebble layer
2024-06-10T15:41:34.947Z [container-agent] 2024-06-10 15:41:34 ERROR juju-log Uncaught exception while in charm code:
2024-06-10T15:41:34.947Z [container-agent] Traceback (most recent call last):
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/./src/charm.py", line 770, in <module>
2024-06-10T15:41:34.947Z [container-agent]     main(MySQLOperatorCharm)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/main.py", line 456, in main
2024-06-10T15:41:34.947Z [container-agent]     _emit_charm_event(charm, dispatcher.event_name)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/main.py", line 144, in _emit_charm_event
2024-06-10T15:41:34.947Z [container-agent]     event_to_emit.emit(*args, **kwargs)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 351, in emit
2024-06-10T15:41:34.947Z [container-agent]     framework._emit(event)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 853, in _emit
2024-06-10T15:41:34.947Z [container-agent]     self._reemit(event_path)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 943, in _reemit
2024-06-10T15:41:34.947Z [container-agent]     custom_handler(event)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/./src/charm.py", line 570, in _on_mysql_pebble_ready
2024-06-10T15:41:34.947Z [container-agent]     self._reconcile_pebble_layer(container)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/./src/charm.py", line 339, in _reconcile_pebble_layer
2024-06-10T15:41:34.947Z [container-agent]     self._mysql.wait_until_mysql_connection()
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/tenacity/__init__.py", line 289, in wrapped_f
2024-06-10T15:41:34.947Z [container-agent]     return self(f, *args, **kw)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/tenacity/__init__.py", line 379, in __call__
2024-06-10T15:41:34.947Z [container-agent]     do = self.iter(retry_state=retry_state)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/tenacity/__init__.py", line 325, in iter
2024-06-10T15:41:34.947Z [container-agent]     raise retry_exc.reraise()
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/tenacity/__init__.py", line 158, in reraise
2024-06-10T15:41:34.947Z [container-agent]     raise self.last_attempt.result()
2024-06-10T15:41:34.947Z [container-agent]   File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-06-10T15:41:34.947Z [container-agent]     return self.__get_result()
2024-06-10T15:41:34.947Z [container-agent]   File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-06-10T15:41:34.947Z [container-agent]     raise self._exception
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/tenacity/__init__.py", line 382, in __call__
2024-06-10T15:41:34.947Z [container-agent]     result = fn(*args, **kwargs)
2024-06-10T15:41:34.947Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 229, in wait_until_mysql_connection
2024-06-10T15:41:34.947Z [container-agent]     raise MySQLServiceNotRunningError
2024-06-10T15:41:34.947Z [container-agent] charms.mysql.v0.mysql.MySQLServiceNotRunningError
2024-06-10T15:41:37.872Z [container-agent] 2024-06-10 15:41:37 ERROR juju.worker.uniter.operation runhook.go:180 hook "mysql-pebble-ready" (via hook dispatching script: dispatch) failed: exit status 1
2024-06-10T15:41:37.872Z [container-agent] 2024-06-10 15:41:37 ERROR juju.worker.uniter pebblepoller.go:103 pebble poll failed for container "mysql": failed to send pebble-ready event: hook failed
2024-06-10T15:41:51.631Z [container-agent] 2024-06-10 15:41:51 INFO juju-log Setting up the logrotate configurations
2024-06-10T15:41:53.037Z [container-agent] 2024-06-10 15:41:52 INFO juju-log Adding pebble layer
2024-06-10T15:41:58.278Z [container-agent] 2024-06-10 15:41:58 INFO juju-log Unit workload member-state is offline with member-role unknown
2024-06-10T15:41:58.400Z [container-agent] 2024-06-10 15:41:58 INFO juju-log Attempting reboot from complete outage.
2024-06-10T15:42:07.005Z [container-agent] 2024-06-10 15:42:07 ERROR juju-log Failed to reboot cluster
2024-06-10T15:42:07.005Z [container-agent] Traceback (most recent call last):
2024-06-10T15:42:07.005Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 666, in _run_mysqlsh_script
2024-06-10T15:42:07.005Z [container-agent]     stdout, _ = process.wait_output()
2024-06-10T15:42:07.005Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/pebble.py", line 1540, in wait_output
2024-06-10T15:42:07.005Z [container-agent]     raise ExecError[AnyStr](self._command, exit_code, out_value, err_value)
2024-06-10T15:42:07.005Z [container-agent] ops.pebble.ExecError: non-zero exit code 1 executing ['/usr/bin/mysqlsh', '--no-wizard', '--python', '--verbose=1', '-f', '/tmp/script.py', ';', 'rm', '/tmp/script.py'], stdout='', stderr='Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory\nverbose: 2024-06-10T15:42:00Z: Loading startup files...\nverbose: 2024-06-10T15:42:00Z: Loading plugins...\nverbose: 2024-06-10T15:42:03Z: Connecting to MySQL at: clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local\nverbose: 2024-06-10T15:42:03Z: Shell.connect: tid=31: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local\nverbose: 2024-06-10T15:42:03Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000\nverbose: 2024-06-10T15:42:04Z: Dba.reboot_cluster_from_complete_outage: tid=32: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306\nverbose: 2024-06-10T15:42:04Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000\nverbose: 2024-06-10T15:42:04Z: Dba.reboot_cluster_from_complete_outage: tid=33: CONNECTED: nova-mysql-0.nova' [truncated]
2024-06-10T15:42:07.005Z [container-agent] 
2024-06-10T15:42:07.005Z [container-agent] During handling of the above exception, another exception occurred:
2024-06-10T15:42:07.005Z [container-agent] 
2024-06-10T15:42:07.005Z [container-agent] Traceback (most recent call last):
2024-06-10T15:42:07.005Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1954, in reboot_from_complete_outage
2024-06-10T15:42:07.005Z [container-agent]     self._run_mysqlsh_script("\n".join(reboot_from_outage_command))
2024-06-10T15:42:07.005Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 669, in _run_mysqlsh_script
2024-06-10T15:42:07.005Z [container-agent]     raise MySQLClientError(e.stderr)
2024-06-10T15:42:07.005Z [container-agent] charms.mysql.v0.mysql.MySQLClientError: Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:00Z: Loading startup files...
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:00Z: Loading plugins...
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:03Z: Connecting to MySQL at: clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:03Z: Shell.connect: tid=31: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:03Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Dba.reboot_cluster_from_complete_outage: tid=32: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Dba.reboot_cluster_from_complete_outage: tid=33: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Group Replication 'group_name' value: 9b8a67b4-272e-11ef-a01b-96f2fe5f67b7
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Metadata 'group_name' value: 9b8a67b4-272e-11ef-a01b-96f2fe5f67b7
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Dba.reboot_cluster_from_complete_outage: tid=34: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:04Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:05Z: Dba.reboot_cluster_from_complete_outage: tid=35: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:42:07.005Z [container-agent] No PRIMARY member found for cluster 'cluster-90686a958d50dbbe04045eddd9b73c47'
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:05Z: ClusterSet info: member, primary, not primary_invalidated, not removed from set, primary status: UNKNOWN
2024-06-10T15:42:07.005Z [container-agent] Restoring the Cluster 'cluster-90686a958d50dbbe04045eddd9b73c47' from complete outage...
2024-06-10T15:42:07.005Z [container-agent] 
2024-06-10T15:42:07.005Z [container-agent] ERROR: RuntimeError: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
2024-06-10T15:42:07.005Z [container-agent] Traceback (most recent call last):
2024-06-10T15:42:07.005Z [container-agent]   File "<string>", line 2, in <module>
2024-06-10T15:42:07.005Z [container-agent] RuntimeError: Dba.reboot_cluster_from_complete_outage: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
2024-06-10T15:42:07.005Z [container-agent] 
2024-06-10T15:42:07.005Z [container-agent] 
2024-06-10T15:42:07.008Z [container-agent] 2024-06-10 15:42:07 ERROR juju-log Failed to reboot cluster from complete outage.
2024-06-10T15:42:08.367Z [container-agent] 2024-06-10 15:42:08 INFO juju-log Promtail binary zip file has been downloaded and stored in: /tmp/promtail-static-amd64.gz
2024-06-10T15:42:21.260Z [container-agent] 2024-06-10 15:42:21 INFO juju.worker.uniter.operation runhook.go:186 ran "mysql-pebble-ready" hook (via hook dispatching script: dispatch)
2024-06-10T15:45:08.923Z [container-agent] 2024-06-10 15:45:08 INFO juju-log Unit workload member-state is offline with member-role unknown
2024-06-10T15:45:08.988Z [container-agent] 2024-06-10 15:45:08 INFO juju-log Attempting reboot from complete outage.
2024-06-10T15:45:09.899Z [container-agent] 2024-06-10 15:45:09 ERROR juju-log Failed to reboot cluster
2024-06-10T15:45:09.899Z [container-agent] Traceback (most recent call last):
2024-06-10T15:45:09.899Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 666, in _run_mysqlsh_script
2024-06-10T15:45:09.899Z [container-agent]     stdout, _ = process.wait_output()
2024-06-10T15:45:09.899Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/pebble.py", line 1540, in wait_output
2024-06-10T15:45:09.899Z [container-agent]     raise ExecError[AnyStr](self._command, exit_code, out_value, err_value)
2024-06-10T15:45:09.899Z [container-agent] ops.pebble.ExecError: non-zero exit code 1 executing ['/usr/bin/mysqlsh', '--no-wizard', '--python', '--verbose=1', '-f', '/tmp/script.py', ';', 'rm', '/tmp/script.py'], stdout='', stderr='Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory\nverbose: 2024-06-10T15:45:09Z: Loading startup files...\nverbose: 2024-06-10T15:45:09Z: Loading plugins...\nverbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local\nverbose: 2024-06-10T15:45:09Z: Shell.connect: tid=157: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local\nverbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000\nverbose: 2024-06-10T15:45:09Z: Dba.reboot_cluster_from_complete_outage: tid=158: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306\nverbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000\nverbose: 2024-06-10T15:45:09Z: Dba.reboot_cluster_from_complete_outage: tid=159: CONNECTED: nova-mysql-0.n' [truncated]
2024-06-10T15:45:09.899Z [container-agent] 
2024-06-10T15:45:09.899Z [container-agent] During handling of the above exception, another exception occurred:
2024-06-10T15:45:09.899Z [container-agent] 
2024-06-10T15:45:09.899Z [container-agent] Traceback (most recent call last):
2024-06-10T15:45:09.899Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1954, in reboot_from_complete_outage
2024-06-10T15:45:09.899Z [container-agent]     self._run_mysqlsh_script("\n".join(reboot_from_outage_command))
2024-06-10T15:45:09.899Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 669, in _run_mysqlsh_script
2024-06-10T15:45:09.899Z [container-agent]     raise MySQLClientError(e.stderr)
2024-06-10T15:45:09.899Z [container-agent] charms.mysql.v0.mysql.MySQLClientError: Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Loading startup files...
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Loading plugins...
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Shell.connect: tid=157: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Dba.reboot_cluster_from_complete_outage: tid=158: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Dba.reboot_cluster_from_complete_outage: tid=159: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Group Replication 'group_name' value: 9b8a67b4-272e-11ef-a01b-96f2fe5f67b7
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Metadata 'group_name' value: 9b8a67b4-272e-11ef-a01b-96f2fe5f67b7
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Dba.reboot_cluster_from_complete_outage: tid=160: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: Dba.reboot_cluster_from_complete_outage: tid=161: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:45:09.899Z [container-agent] No PRIMARY member found for cluster 'cluster-90686a958d50dbbe04045eddd9b73c47'
2024-06-10T15:45:09.899Z [container-agent] verbose: 2024-06-10T15:45:09Z: ClusterSet info: member, primary, not primary_invalidated, not removed from set, primary status: UNKNOWN
2024-06-10T15:45:09.899Z [container-agent] Restoring the Cluster 'cluster-90686a958d50dbbe04045eddd9b73c47' from complete outage...
2024-06-10T15:45:09.899Z [container-agent] 
2024-06-10T15:45:09.899Z [container-agent] ERROR: RuntimeError: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
2024-06-10T15:45:09.899Z [container-agent] Traceback (most recent call last):
2024-06-10T15:45:09.899Z [container-agent]   File "<string>", line 2, in <module>
2024-06-10T15:45:09.899Z [container-agent] RuntimeError: Dba.reboot_cluster_from_complete_outage: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
2024-06-10T15:45:09.899Z [container-agent] 
2024-06-10T15:45:09.899Z [container-agent] 
2024-06-10T15:45:09.907Z [container-agent] 2024-06-10 15:45:09 ERROR juju-log Failed to reboot cluster from complete outage.
2024-06-10T15:45:10.689Z [container-agent] 2024-06-10 15:45:10 ERROR juju-log Failed to get cluster status for cluster-90686a958d50dbbe04045eddd9b73c47
2024-06-10T15:45:10.727Z [container-agent] 2024-06-10 15:45:10 ERROR juju-log Failed to get cluster endpoints
2024-06-10T15:45:10.727Z [container-agent] Traceback (most recent call last):
2024-06-10T15:45:10.727Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 836, in update_endpoints
2024-06-10T15:45:10.727Z [container-agent]     rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
2024-06-10T15:45:10.727Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1469, in get_cluster_endpoints
2024-06-10T15:45:10.727Z [container-agent]     raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
2024-06-10T15:45:10.727Z [container-agent] charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
2024-06-10T15:45:11.129Z [container-agent] 2024-06-10 15:45:11 INFO juju.worker.uniter.operation runhook.go:186 ran "update-status" hook (via hook dispatching script: dispatch)
2024-06-10T15:49:50.952Z [container-agent] 2024-06-10 15:49:50 INFO juju-log Unit workload member-state is offline with member-role unknown
2024-06-10T15:49:50.976Z [container-agent] 2024-06-10 15:49:50 INFO juju-log Attempting reboot from complete outage.
2024-06-10T15:49:52.247Z [container-agent] 2024-06-10 15:49:52 ERROR juju-log Failed to reboot cluster
2024-06-10T15:49:52.247Z [container-agent] Traceback (most recent call last):
2024-06-10T15:49:52.247Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 666, in _run_mysqlsh_script
2024-06-10T15:49:52.247Z [container-agent]     stdout, _ = process.wait_output()
2024-06-10T15:49:52.247Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/pebble.py", line 1540, in wait_output
2024-06-10T15:49:52.247Z [container-agent]     raise ExecError[AnyStr](self._command, exit_code, out_value, err_value)
2024-06-10T15:49:52.247Z [container-agent] ops.pebble.ExecError: non-zero exit code 1 executing ['/usr/bin/mysqlsh', '--no-wizard', '--python', '--verbose=1', '-f', '/tmp/script.py', ';', 'rm', '/tmp/script.py'], stdout='', stderr='Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory\nverbose: 2024-06-10T15:49:51Z: Loading startup files...\nverbose: 2024-06-10T15:49:51Z: Loading plugins...\nverbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local\nverbose: 2024-06-10T15:49:51Z: Shell.connect: tid=519: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local\nverbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000\nverbose: 2024-06-10T15:49:51Z: Dba.reboot_cluster_from_complete_outage: tid=520: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306\nverbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000\nverbose: 2024-06-10T15:49:51Z: Dba.reboot_cluster_from_complete_outage: tid=521: CONNECTED: nova-mysql-0.n' [truncated]
2024-06-10T15:49:52.247Z [container-agent] 
2024-06-10T15:49:52.247Z [container-agent] During handling of the above exception, another exception occurred:
2024-06-10T15:49:52.247Z [container-agent] 
2024-06-10T15:49:52.247Z [container-agent] Traceback (most recent call last):
2024-06-10T15:49:52.247Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1954, in reboot_from_complete_outage
2024-06-10T15:49:52.247Z [container-agent]     self._run_mysqlsh_script("\n".join(reboot_from_outage_command))
2024-06-10T15:49:52.247Z [container-agent]   File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 669, in _run_mysqlsh_script
2024-06-10T15:49:52.247Z [container-agent]     raise MySQLClientError(e.stderr)
2024-06-10T15:49:52.247Z [container-agent] charms.mysql.v0.mysql.MySQLClientError: Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Loading startup files...
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Loading plugins...
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Shell.connect: tid=519: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Dba.reboot_cluster_from_complete_outage: tid=520: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Dba.reboot_cluster_from_complete_outage: tid=521: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Group Replication 'group_name' value: 9b8a67b4-272e-11ef-a01b-96f2fe5f67b7
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Metadata 'group_name' value: 9b8a67b4-272e-11ef-a01b-96f2fe5f67b7
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Dba.reboot_cluster_from_complete_outage: tid=522: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Connecting to MySQL at: mysql://clusteradmin@nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306?connect-timeout=5000
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: Dba.reboot_cluster_from_complete_outage: tid=523: CONNECTED: nova-mysql-0.nova-mysql-endpoints.openstack.svc.cluster.local:3306
2024-06-10T15:49:52.247Z [container-agent] No PRIMARY member found for cluster 'cluster-90686a958d50dbbe04045eddd9b73c47'
2024-06-10T15:49:52.247Z [container-agent] verbose: 2024-06-10T15:49:51Z: ClusterSet info: member, primary, not primary_invalidated, not removed from set, primary status: UNKNOWN
2024-06-10T15:49:52.247Z [container-agent] Restoring the Cluster 'cluster-90686a958d50dbbe04045eddd9b73c47' from complete outage...
2024-06-10T15:49:52.247Z [container-agent] 
2024-06-10T15:49:52.247Z [container-agent] ERROR: RuntimeError: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
2024-06-10T15:49:52.247Z [container-agent] Traceback (most recent call last):
2024-06-10T15:49:52.247Z [container-agent]   File "<string>", line 2, in <module>
2024-06-10T15:49:52.247Z [container-agent] RuntimeError: Dba.reboot_cluster_from_complete_outage: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
2024-06-10T15:49:52.247Z [container-agent] 
2024-06-10T15:49:52.247Z [container-agent] 
2024-06-10T15:49:52.266Z [container-agent] 2024-06-10 15:49:52 ERROR juju-log Failed to reboot cluster from complete outage.
2024-06-10T15:49:53.045Z [container-agent] 2024-06-10 15:49:53 ERROR juju-log Failed to get cluster status for cluster-90686a958d50dbbe04045eddd9b73c47
2024-06-10T15:49:53.055Z [container-agent] 2024-06-10 15:49:53 ERROR juju-log Failed to get cluster endpoints
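
For triage, the mysqlsh failure can be separated from the verbose connection chatter and the Python tracebacks in the log above. The following is only a minimal sketch: the helper name `mysqlsh_errors` and the prefix pattern are assumptions made for illustration and are not part of the charm.

```python
import re

# Assumed prefix on every log line, e.g.
# "2024-06-10T15:42:07.005Z [container-agent] "
PREFIX = re.compile(r"^\S+Z \[container-agent\] ?")


def mysqlsh_errors(log_text: str) -> list[str]:
    """Return the distinct 'ERROR: ...' lines printed by mysqlsh (hypothetical helper)."""
    seen: list[str] = []
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw).strip()
        # mysqlsh reports its own failures as "ERROR: <message>".
        # juju-log lines start with a second timestamp, so they are skipped,
        # as are "verbose:" lines and traceback frames.
        if line.startswith("ERROR: ") and line not in seen:
            seen.append(line)
    return seen


sample = """\
2024-06-10T15:42:07.005Z [container-agent] 2024-06-10 15:42:07 ERROR juju-log Failed to reboot cluster
2024-06-10T15:42:07.005Z [container-agent] verbose: 2024-06-10T15:42:00Z: Loading plugins...
2024-06-10T15:42:07.005Z [container-agent] ERROR: RuntimeError: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
2024-06-10T15:45:09.899Z [container-agent] ERROR: RuntimeError: The current session instance does not belong to the Cluster: 'cluster-90686a958d50dbbe04045eddd9b73c47'.
"""

for message in mysqlsh_errors(sample):
    print(message)
```

Run against the three attempts logged above, this reduces them to the same single message: the session instance does not belong to the cluster.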

peer data:

  application-data:
    cluster-name: cluster-90686a958d50dbbe04045eddd9b73c47
    cluster-set-domain-name: cluster-set-90686a958d50dbbe04045eddd9b73c47
    leader_elected_count: "2"
    requested-secrets: '["operator-password", "key", "csr", "cert", "cauth", "chain"]'
    units-added-to-cluster: "1"
  local-unit:
    in-scope: true
    data:
      egress-subnets: 10.152.183.84/32
      ingress-address: 10.152.183.84
      log-rotate-manager-pid: "80"
      member-role: unknown
      member-state: offline
      private-address: 10.152.183.84
      unit-container-restarts: "1"
      unit-initialized: "True"
      unit-status: alive
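
The peer data above describes an initialized single-unit cluster whose only member is offline. As a rough illustration of how that state could be recognised from the databag values, here is a sketch under stated assumptions: the function `looks_like_complete_outage` is hypothetical and is not the charm's actual decision logic, and the dictionaries simply mirror the keys shown above.

```python
def looks_like_complete_outage(app_data: dict[str, str], unit_data: dict[str, str]) -> bool:
    """Hypothetical check: initialized single-unit cluster with its only member offline."""
    # Databag values are strings, so compare against "1" and "True".
    single_unit = app_data.get("units-added-to-cluster") == "1"
    initialized = unit_data.get("unit-initialized") == "True"
    offline = unit_data.get("member-state") == "offline"
    return single_unit and initialized and offline


# Values copied from the peer data in this report.
app_data = {
    "cluster-name": "cluster-90686a958d50dbbe04045eddd9b73c47",
    "units-added-to-cluster": "1",
}
unit_data = {
    "member-role": "unknown",
    "member-state": "offline",
    "unit-initialized": "True",
}

print(looks_like_complete_outage(app_data, unit_data))
```

For the values in this report the check prints `True`, which matches the "Attempting reboot from complete outage" messages in the log.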

github-actions[bot] commented 3 weeks ago

https://warthogs.atlassian.net/browse/DPE-4680