canonical / traefik-route-k8s-operator

https://charmhub.io/traefik-route-k8s
Apache License 2.0
0 stars 3 forks source link

Emit revoked event on relation broken #23

Closed sed-i closed 1 year ago

sed-i commented 1 year ago

When grafana is re-related to traefik, traefik doesn't wipe the old yaml.

In the following example, the traefik container has yaml files for relations 5 and 8, even though relation 5 doesn't exist anymore:

$ jshu traefik/0 | grep "relation-id"
  - relation-id: 3
  - relation-id: 4
  - relation-id: 7
  - relation-id: 8

$ jsshc traefik traefik/0 ls /opt/traefik/juju
certificates.yaml
juju_ingress_ingress-per-unit_4_prometheus.yaml
juju_ingress_ingress_3_alertmanager.yaml
juju_ingress_traefik-route_5_grafana.yaml
juju_ingress_traefik-route_8_grafana.yaml

And traefik gently complains:

$ k logs -c traefik pods/traefik-0 -n test-traefik-deployed-after-metallb-zk3n

2023-01-05T19:31:15.221Z [traefik] time="2023-01-05T19:31:15Z" level=warning msg="HTTP router already configured, skipping" routerName=juju-test-traefik-deployed-after-metallb-zk3n-grafana-router filename=juju_ingress_traefik-route_8_grafana.yaml providerName=file
2023-01-05T19:31:15.221Z [traefik] time="2023-01-05T19:31:15Z" level=warning msg="HTTP service already configured, skipping" providerName=file filename=juju_ingress_traefik-route_8_grafana.yaml serviceName=juju-test-traefik-deployed-after-metallb-zk3n-grafana-service

This PR adds a revoked event so that it could be handled similarly to ipa and ipu from within traefik:

# lib
observe(rel_events.relation_broken, self._handle_relation_broken)
    # lib
    def _handle_relation_broken(self, event):
        self.on.data_removed.emit(event.relation)
# traefik charm
observe(ipa_events.data_removed, self._handle_ingress_data_removed)
    # traefik charm
    def _handle_ingress_data_removed(self, event: RelationEvent):
        """A unit has removed the data we need to provide ingress."""
-       # if this is because the relation is gone, we don't need to do anything
-        if isinstance(event, RelationBrokenEvent):
-           return
        self._wipe_ingress_for_relation(event.relation)

Side note: currently it seems that _wipe_ingress_for_relation never takes place on relation removal. Seems like this is a mistake.

sed-i commented 1 year ago

@rbarry82 I recall discussing a revoked event for traefik-route. Was there a reason not to have it?

sed-i commented 1 year ago

Hmm ok looks like the terminology is more subtle:

rbarry82 commented 1 year ago

Grafana wouldn't actually care about that event anyway (either the relation exists and it's the leader, or not -- the "leader only ingress" flow is uglier, since it also relies on leader electon/etc). The patch as it exists here is probably sufficient alongside observing it in Traefik.

Is this the cause of the 404?

sed-i commented 1 year ago

Grafana wouldn't actually care about that event anyway

This change is on the provider side - it's for traefik charm code to observe and wipe the yaml file associated with the relation.

Is this the cause of the 404?

No, the 404 is because of a missing TLS section in the route config (separate issue).

PietroPasotti commented 1 year ago

This seems like a good change. When I wrote route I didn't see the need for observing the removed event, but if traefik needs to clean up some files after a remote end leaves, then yes, this sounds like the way to do it.

Re the _wipe_ingress_for_relation early return, that also seems like an oversight. IIRC I added that check because that routine also attempts to wipe the relation databag, which if we're in a relation-broken context, might fail. So the proper solution to that issue is to clean up the traefik config files, but skip the databag wipe if we're answering a relation-broken.

Is this a draft still?

rbarry82 commented 1 year ago

Still a draft?

sed-i commented 1 year ago

Added utest. Ready for review.