wait until the first unit comes online, then immediately run microk8s.kubectl -n model-name delete pod mysql-k8s-0
wait until the unit comes back online
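For reference, the pod-deletion step can also be scripted; the sketch below uses the upstream kubernetes Python client instead of microk8s.kubectl, and "model-name" stays a placeholder for the Juju model namespace.

# Sketch only: delete the first MySQL pod through the Kubernetes API.
# Assumes a kubeconfig exported with `microk8s config`; names are placeholders.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()
core.delete_namespaced_pod(name="mysql-k8s-0", namespace="model-name")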
Expected behavior
The cluster should be able to recover by performing a full-cluster crash recovery (even though there is only one member in the cluster). The two waiting units should not be considered, since they are not yet part of the cluster.
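For comparison, the manual equivalent of that recovery in MySQL Shell's Python mode looks roughly like the sketch below; the account and hostname are illustrative placeholders, not values from this deployment.

# Run inside `mysqlsh --py` against the surviving member; names are illustrative.
shell.connect("clusteradmin@mysql-k8s-0.mysql-k8s-endpoints:3306")
# Restore the InnoDB Cluster after a complete outage using the metadata on this
# instance; the two waiting units have not joined yet, so only this member counts.
cluster = dba.reboot_cluster_from_complete_outage()
print(cluster.status())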
Actual behavior
The cluster is stuck with one unit offline and two units in waiting status:
nova-mysql/0* maintenance idle 10.1.28.217 offline
nova-mysql/1 waiting idle 10.1.180.21 waiting to get cluster primary from peers
nova-mysql/2 waiting idle 10.1.190.214 waiting to get cluster primary from peers
Versions
Operating system: Ubuntu 22.04 LTS
Juju CLI: 3.5.4
Juju agent: 3.5.4
Charm revision: 180
Log output
Juju debug log:
unit-nova-mysql-0: 10:39:54 INFO unit.nova-mysql/0.juju-log Persisting configuration changes to file
unit-nova-mysql-0: 10:39:54 INFO unit.nova-mysql/0.juju-log Configuration change requires restart
unit-nova-mysql-0: 11:39:55 ERROR unit.nova-mysql/0.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/./src/charm.py", line 888, in <module>
main(MySQLOperatorCharm)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/main.py", line 551, in main
manager.run()
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/main.py", line 530, in run
self._emit()
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/main.py", line 519, in _emit
_emit_charm_event(self.charm, self.dispatcher.event_name)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/main.py", line 147, in _emit_charm_event
event_to_emit.emit(*args, **kwargs)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 348, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 860, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 950, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/./src/charm.py", line 536, in _on_config_changed
self.on[f"{self.restart.name}"].acquire_lock.emit()
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 348, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 860, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 950, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/rolling_ops/v0/rollingops.py", line 399, in _on_acquire_lock
self.charm.on[self.name].relation_changed.emit(relation, app=self.charm.app)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 348, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 860, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 950, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/rolling_ops/v0/rollingops.py", line 348, in _on_relation_changed
self.charm.on[self.name].process_locks.emit()
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 348, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 860, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 950, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/rolling_ops/v0/rollingops.py", line 384, in _on_process_locks
self.charm.on[self.name].run_with_lock.emit()
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 348, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 860, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/framework.py", line 950, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/rolling_ops/v0/rollingops.py", line 415, in _on_run_with_lock
callback(event)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/./src/charm.py", line 449, in _restart
container.pebble.restart_services([MYSQLD_SAFE_SERVICE], timeout=3600)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/pebble.py", line 2201, in restart_services
return self._services_action('restart', services, timeout, delay)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/pebble.py", line 2224, in _services_action
change = self.wait_change(change_id, timeout=timeout, delay=delay)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/pebble.py", line 2254, in wait_change
return self._wait_change_using_wait(change_id, timeout)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/venv/ops/pebble.py", line 2282, in _wait_change_using_wait
raise TimeoutError(f'timed out waiting for change {change_id} ({timeout} seconds)')
ops.pebble.TimeoutError: timed out waiting for change 471 (3600 seconds)
unit-nova-mysql-0: 11:39:55 ERROR juju.worker.uniter.operation hook "config-changed" (via hook dispatching script: dispatch) failed: exit status 1
unit-nova-mysql-0: 11:39:57 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
unit-nova-mysql-0: 11:40:02 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
unit-nova-mysql-0: 11:40:03 INFO juju.worker.uniter.operation ran "config-changed" hook (via hook dispatching script: dispatch)
unit-nova-mysql-0: 11:40:04 INFO juju.worker.uniter.operation ran "database-relation-joined" hook (via hook dispatching script: dispatch)
unit-nova-mysql-0: 11:40:05 INFO juju.worker.uniter.operation ran "database-relation-joined" hook (via hook dispatching script: dispatch)
unit-nova-mysql-0: 11:40:06 INFO juju.worker.uniter.operation ran "database-peers-relation-joined" hook (via hook dispatching script: dispatch)
unit-nova-mysql-0: 11:40:07 INFO juju.worker.uniter.operation ran "database-relation-joined" hook (via hook dispatching script: dispatch)
unit-nova-mysql-0: 11:40:08 INFO juju.worker.uniter.operation ran "database-relation-changed" hook (via hook dispatching script: dispatch)
unit-nova-mysql-0: 11:40:10 ERROR unit.nova-mysql/0.juju-log database-peers:29: Failed to get cluster status for cluster-b65b6fff3ec3a31de1d455381cc8497a
unit-nova-mysql-0: 11:40:10 ERROR unit.nova-mysql/0.juju-log database-peers:29: Failed to get cluster endpoints
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 786, in update_endpoints
rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1872, in get_cluster_endpoints
raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
unit-nova-mysql-0: 11:40:10 ERROR unit.nova-mysql/0.juju-log database-peers:29: Failed to get cluster status for cluster-b65b6fff3ec3a31de1d455381cc8497a
unit-nova-mysql-0: 11:40:10 ERROR unit.nova-mysql/0.juju-log database-peers:29: Failed to get cluster endpoints
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 786, in update_endpoints
rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1872, in get_cluster_endpoints
raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
unit-nova-mysql-0: 11:40:10 ERROR unit.nova-mysql/0.juju-log database-peers:29: Failed to get cluster status for cluster-b65b6fff3ec3a31de1d455381cc8497a
unit-nova-mysql-0: 11:40:10 ERROR unit.nova-mysql/0.juju-log database-peers:29: Failed to get cluster endpoints
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/src/mysql_k8s_helpers.py", line 786, in update_endpoints
rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 724, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-nova-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1872, in get_cluster_endpoints
raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
unit-nova-mysql-0: 11:40:11 INFO juju.worker.uniter.operation ran "database-peers-relation-changed" hook (via hook dispatching script: dispatch)
Additional context
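The failing call in the traceback is the charm's Pebble restart with a 3600-second timeout. A minimal sketch of that pattern is shown below; the container and service names are assumptions for illustration, not the charm's actual constants.

import ops
from ops import pebble

class MySQLRestartSketch(ops.CharmBase):
    def _restart(self, event: ops.EventBase) -> None:
        container = self.unit.get_container("mysql")  # assumed container name
        try:
            # Pebble queues a restart change and waits for it to complete; if
            # mysqld never reaches the running state within the timeout,
            # wait_change gives up and TimeoutError is raised, which is what
            # fails the config-changed hook in the log above.
            container.pebble.restart_services(["mysqld_safe"], timeout=3600)
        except pebble.TimeoutError:
            # Illustrative handling only; the hook in the log simply errors out.
            self.unit.status = ops.MaintenanceStatus("mysqld restart still pending")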