canonical / postgresql-k8s-operator

A Charmed Operator for running PostgreSQL on Kubernetes
https://charmhub.io/postgresql-k8s
Apache License 2.0

Re-relating with `self-signed-certificates` charm causes an error state #706

Open kelkawi-a opened 1 month ago

kelkawi-a commented 1 month ago

Steps to reproduce

  1. Deploy 3 units of the postgresql-k8s charm (channel 14/stable, revision 381).
  2. Deploy the self-signed-certificates charm and relate the two charms.
  3. Remove the relation, then re-add it.

Expected behavior

Both charms are in active/idle state.

Actual behavior

One of the postgresql-k8s units goes into an error state.

Versions

Operating system: Ubuntu 22.04.4 LTS

Juju CLI: 3.5.3-ubuntu-amd64

Juju agent: 3.5.3

Charm revision: 381, channel 14/stable

kubectl: Client Version: v1.30.4 Server Version: v1.26.15

Log Output

juju debug-log:

unit-postgresql-k8s-2: 19:13:51 DEBUG unit.postgresql-k8s/2.juju-log certificates:415: Emitting Juju event certificates_relation_changed.
unit-postgresql-k8s-2: 19:13:51 ERROR unit.postgresql-k8s/2.juju-log certificates:415: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/./src/charm.py", line 2097, in <module>
    main(PostgresqlOperatorCharm, use_juju_for_storage=True)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 553, in main
    manager.run()
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 529, in run
    self._emit()
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 518, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name, self._juju_context)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 139, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/framework.py", line 347, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/framework.py", line 853, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/framework.py", line 943, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1786, in _on_relation_changed
    for certificate_creation_request in self._requirer_csrs
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1529, in _requirer_csrs
    relation = self.model.get_relation(self.relationship_name)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/model.py", line 250, in get_relation
    return self.relations._get_unique(relation_name, relation_id)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/model.py", line 949, in _get_unique
    raise TooManyRelatedAppsError(relation_name, num_related, 1)
ops.model.TooManyRelatedAppsError: Too many remote applications on certificates (2 > 1)
unit-postgresql-k8s-2: 19:13:52 ERROR juju.worker.uniter.operation hook "certificates-relation-changed" (via hook dispatching script: dispatch) failed: exit status 1
unit-postgresql-k8s-2: 19:13:52 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-postgresql-k8s-2: 19:13:56 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-postgresql-k8s-2: 19:14:08 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-postgresql-k8s-2: 19:14:09 DEBUG unit.postgresql-k8s/2.juju-log certificates:415: ops 2.16.0 up and running.
unit-postgresql-k8s-2: 19:14:09 DEBUG unit.postgresql-k8s/2.juju-log certificates:415: no relation on 'tracing': tracing not ready
unit-postgresql-k8s-2: 19:14:09 DEBUG unit.postgresql-k8s/2.juju-log certificates:415: Emitting Juju event certificates_relation_joined.
unit-postgresql-k8s-2: 19:14:09 ERROR unit.postgresql-k8s/2.juju-log certificates:415: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/./src/charm.py", line 2097, in <module>
    main(PostgresqlOperatorCharm, use_juju_for_storage=True)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 553, in main
    manager.run()
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 529, in run
    self._emit()
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 518, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name, self._juju_context)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/main.py", line 139, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/framework.py", line 347, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/framework.py", line 853, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/framework.py", line 943, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 735, in wrapped_function
    return callable(*args, **kwargs)  # type: ignore
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/lib/charms/postgresql_k8s/v0/postgresql_tls.py", line 115, in _on_tls_relation_joined
    self._request_certificate(None)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 735, in wrapped_function
    return callable(*args, **kwargs)  # type: ignore
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/lib/charms/postgresql_k8s/v0/postgresql_tls.py", line 98, in _request_certificate
    if self.charm.model.get_relation(TLS_RELATION):
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/model.py", line 250, in get_relation
    return self.relations._get_unique(relation_name, relation_id)
  File "/var/lib/juju/agents/unit-postgresql-k8s-2/charm/venv/ops/model.py", line 949, in _get_unique
    raise TooManyRelatedAppsError(relation_name, num_related, 1)
ops.model.TooManyRelatedAppsError: Too many remote applications on certificates (2 > 1)

Patroni logs:

2024-09-24 19:17:35 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:17:45 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:16:55 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:17:05 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:17:15 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:17:24 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:16:15 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:16:25 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:16:35 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0) 
2024-09-24 19:16:45 UTC [1223]: INFO: no action. I am (postgresql-k8s-2), a secondary, and following a leader (postgresql-k8s-0)

syncronize-issues-to-jira[bot] commented 1 month ago

Thank you for reporting your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/DPE-5554.

This message was autogenerated

dragomirp commented 1 month ago

It looks like the Terraform provider can create multiple integrations on an interface that has a limit. I created an issue for it: https://github.com/juju/terraform-provider-juju/issues/608
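To illustrate why a second integration on the same endpoint produces the error in the log, here is a minimal, self-contained sketch. It imitates the uniqueness check that the traceback shows behind `get_relation`, using stand-in names: `get_unique_relation`, the plain dictionary of relation IDs, and relation ID 414 are assumptions made for illustration. It is not the ops implementation.

```python
class TooManyRelatedAppsError(Exception):
    """Stand-in for ops.model.TooManyRelatedAppsError, using the log's message format."""

    def __init__(self, relation_name, num_related, max_supported):
        super().__init__(
            f"Too many remote applications on {relation_name} "
            f"({num_related} > {max_supported})"
        )


def get_unique_relation(relations, relation_name):
    """Return the only relation on an endpoint, None if there is none, else raise."""
    related = relations.get(relation_name, [])
    if not related:
        return None
    if len(related) == 1:
        return related[0]
    raise TooManyRelatedAppsError(relation_name, len(related), 1)


# Hypothetical state: relation ID 415 appears in the debug-log; 414 is an
# invented ID standing for a second integration on the same endpoint.
relations = {"certificates": [414, 415]}

try:
    get_unique_relation(relations, "certificates")
except TooManyRelatedAppsError as err:
    message = str(err)
    print(message)
```

With exactly one relation on the endpoint the lookup returns it, and with none it returns `None`, so the error only appears once something has created a second integration on the endpoint.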

taurus-forever commented 1 month ago

For the record: we tested the steps to reproduce intensively on localhost (even with the CPU stressed to 100%) and cannot reproduce the problem with the Juju CLI, where the interface limit works as expected. It looks like this is solely a terraform-provider-juju issue.

@dragomirp should we keep this open? IMHO, we should resolve it after the CS team confirms. Thanks!

taurus-forever commented 1 week ago

Current status:

The Juju team has confirmed the issue in the Juju Terraform Provider (JTP) and prepared a PR with the fix: https://github.com/juju/juju/pull/18288

@dragomirp please consider resolving this bug report, as there is nothing to do on the charm side. @kelkawi-a please follow up with the Juju team regarding the release of the fixed JTP version.