canonical / grafana-k8s-operator

https://charmhub.io/grafana-k8s
Apache License 2.0

hook failed: "grafana-source-relation-departed" #281

Open markbeierl opened 10 months ago

markbeierl commented 10 months ago

Bug Description

Deployed the cos-lite bundle to a model and had it running for several days. When I attempted to destroy the model, the grafana unit went into error status with hook failed: "grafana-source-relation-departed"

To Reproduce

  1. juju add-model cos microk8s
  2. juju deploy cos-lite --trust
  3. juju deploy cos-configuration-k8s \
       --config git_repo=https://github.com/canonical/sdcore-cos-configuration \
       --config git_branch=main \
       --config git_depth=1 \
       --config grafana_dashboards_path=grafana_dashboards/sdcore/
  4. juju integrate cos-configuration-k8s grafana

I then set up cross-model integrations from two other models, using the offers cos.prometheus:receive-remote-write and cos.loki:logging.

I destroyed the first of those two models; the second remained, with its relation still in place.

I then destroyed the cos-lite model, and this error occurred.

Environment

juju      3.1.6          24626  3.1/stable          canonical✓  -
microk8s  v1.27.7        6101   1.27-strict/stable  canonical✓  -

Relevant log output

unit-grafana-0: 16:59:34 INFO juju.worker.uniter awaiting error resolution for "relation-departed" hook
unit-grafana-0: 16:59:35 ERROR unit.grafana/0.juju-log grafana-source:13: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 1312, in <module>
    main(GrafanaCharm, use_juju_for_storage=True)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 344, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 841, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 930, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_source.py", line 613, in _on_grafana_source_relation_departed
    removed_source = self._remove_source_from_datastore(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_source.py", line 628, in _remove_source_from_datastore
    stored_sources = self.get_peer_data("sources")
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_source.py", line 731, in get_peer_data
    data = self._charm.peers.data[self._charm.app].get(key, "")  # type: ignore[attr-defined]
AttributeError: 'NoneType' object has no attribute 'data'
unit-grafana-0: 16:59:35 ERROR juju.worker.uniter.operation hook "grafana-source-relation-departed" (via hook dispatching script: dispatch) failed: exit status 1

Additional context

No response

markbeierl commented 10 months ago

Additional info: I decided to see what happens on manual intervention.

I ran juju resolve grafana/0 --no-retry three times, checking the status after each run, and noticed that it changed to

grafana/0*  error     idle   10.1.212.218         hook failed: "ingress-relation-broken"

With this in debug-log

unit-grafana-0: 17:15:15 INFO juju.worker.uniter awaiting error resolution for "relation-broken" hook
unit-grafana-0: 17:15:16 ERROR unit.grafana/0.juju-log ingress:10: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 1312, in <module>
    main(GrafanaCharm, use_juju_for_storage=True)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 344, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 841, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 930, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/traefik_route_k8s/v0/traefik_route.py", line 334, in _on_relation_broken
    self.on.ready.emit(event.relation)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 344, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 841, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 930, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 276, in _on_ingress_ready
    self._configure()
  File "./src/charm.py", line 436, in _configure
    if self._check_datasource_provisioning():
  File "./src/charm.py", line 403, in _check_datasource_provisioning
    grafana_datasources = self._generate_datasource_config()
  File "./src/charm.py", line 1023, in _generate_datasource_config
    for source_info in self.source_consumer.sources:
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_source.py", line 707, in sources
    stored_sources = self.get_peer_data("sources")
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_source.py", line 731, in get_peer_data
    data = self._charm.peers.data[self._charm.app].get(key, "")  # type: ignore[attr-defined]
AttributeError: 'NoneType' object has no attribute 'data'
unit-grafana-0: 17:15:16 ERROR juju.worker.uniter.operation hook "ingress-relation-broken" (via hook dispatching script: dispatch) failed: exit status 1

markbeierl commented 10 months ago

The next juju resolve --no-retry gives

unit-grafana-0: 17:16:34 ERROR unit.grafana/0.juju-log grafana-dashboard:23: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 1312, in <module>
    main(GrafanaCharm, use_juju_for_storage=True)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 344, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 841, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 930, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_dashboard.py", line 1390, in _on_grafana_dashboard_relation_broken
    self._remove_all_dashboards_for_relation(event.relation)
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_dashboard.py", line 1524, in _remove_all_dashboards_for_relation
    if self._get_stored_dashboards(relation.id):
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_dashboard.py", line 1557, in _get_stored_dashboards
    return self.get_peer_data("dashboards").get(str(relation_id), {})
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_dashboard.py", line 1572, in get_peer_data
    data = self._charm.peers.data[self._charm.app].get(key, "")  # type: ignore[attr-defined]
AttributeError: 'NoneType' object has no attribute 'data'
unit-grafana-0: 17:16:34 ERROR juju.worker.uniter.operation hook "grafana-dashboard-relation-broken" (via hook dispatching script: dispatch) failed: exit status 1

markbeierl commented 10 months ago

After two more resolves:

unit-grafana-0: 17:17:56 ERROR unit.grafana/0.juju-log metrics-endpoint:20: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 1312, in <module>
    main(GrafanaCharm, use_juju_for_storage=True)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 344, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 841, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-grafana-0/charm/venv/ops/framework.py", line 930, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 370, in _maybe_provision_own_dashboard
    self.init_dashboard_provisioning(dashboards_dir_path)
  File "./src/charm.py", line 521, in init_dashboard_provisioning
    self._configure()
  File "./src/charm.py", line 436, in _configure
    if self._check_datasource_provisioning():
  File "./src/charm.py", line 403, in _check_datasource_provisioning
    grafana_datasources = self._generate_datasource_config()
  File "./src/charm.py", line 1023, in _generate_datasource_config
    for source_info in self.source_consumer.sources:
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_source.py", line 707, in sources
    stored_sources = self.get_peer_data("sources")
  File "/var/lib/juju/agents/unit-grafana-0/charm/lib/charms/grafana_k8s/v0/grafana_source.py", line 731, in get_peer_data
    data = self._charm.peers.data[self._charm.app].get(key, "")  # type: ignore[attr-defined]
AttributeError: 'NoneType' object has no attribute 'data'
unit-grafana-0: 17:17:56 ERROR juju.worker.uniter.operation hook "metrics-endpoint-relation-broken" (via hook dispatching script: dispatch) failed: exit status 1

PietroPasotti commented 8 months ago

The issue is that the peer relation is apparently removed before the grafana-source one. So when the charm lib accesses self._charm.peers, which calls self.model.get_relation(PEER), it gets None, and the subsequent .data lookup raises the AttributeError seen in every traceback above. Solution: add guards in front of all self._charm.peers accesses to early-exit if there is no peer relation (which hopefully only ever happens if the charm is being nuked).
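
For illustration, a minimal sketch of what such a guard could look like in get_peer_data. This is not the actual patch: FakeRelation, FakeCharm and SourceConsumer are hypothetical stand-ins so the snippet runs without ops installed, and the JSON decoding of the stored value is an assumption about how the lib stores peer data.

```python
import json
from typing import Any, Optional


class FakeRelation:
    """Hypothetical stand-in for an ops Relation: maps the app to its databag."""

    def __init__(self, app: str) -> None:
        self.data = {app: {}}


class FakeCharm:
    """Hypothetical stand-in for the charm.

    `peers` is None once the peer relation has been removed, which mirrors
    model.get_relation() returning None during teardown.
    """

    def __init__(self, app: str, peers: Optional[FakeRelation]) -> None:
        self.app = app
        self.peers = peers


class SourceConsumer:
    """Hypothetical stand-in for the lib object that owns get_peer_data."""

    def __init__(self, charm: FakeCharm) -> None:
        self._charm = charm

    def get_peer_data(self, key: str) -> Any:
        peers = self._charm.peers
        if peers is None:
            # Guard: the peer relation is already gone (most likely the
            # application is being removed), so report "nothing stored"
            # instead of raising AttributeError.
            return {}
        data = peers.data[self._charm.app].get(key, "")
        # Assumption: values are stored as JSON strings in the app databag.
        return json.loads(data) if data else {}


# Peer relation missing: the guarded lookup returns an empty dict.
print(SourceConsumer(FakeCharm("grafana", None)).get_peer_data("sources"))
```

Callers that write to peer data or iterate over stored sources would need the same kind of early exit, since the tracebacks above reach get_peer_data through several different code paths.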

phvalguima commented 3 months ago

Seeing this issue on grafana latest/stable, revision 113