Open tomwardill opened 3 years ago
We should avoid using the k8s DNS because that won't be available outside of k8s in the case where we're using cross-model relations, but we should definitely investigate/fix this.
Hey Tom and Tom :smiley:. Interesting. When writing this I was actually not sure what to do, which is why the code is a bit convoluted in there, falling back to self.app.name, because I knew it was valid if bind_address was not available.
@mthaddon do you think just using app.name would be enough here, or is it more complicated than that?
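For reference, the fallback I'm describing boils down to something like the sketch below. It is only an illustration, not the code in charm.py: the function name advertised_host is made up, and the app name "redis-broker" is inferred from the pod name in this thread.

```python
from typing import Optional


def advertised_host(bind_address: Optional[str], app_name: str) -> str:
    """Pick the host to advertise to related applications.

    Hypothetical helper: prefer the bind address when it is known and
    fall back to the application name otherwise.
    """
    if bind_address:
        # The address may be an ipaddress object or a string; normalise it.
        return str(bind_address)
    # No bind address yet, so fall back to the app name, which resolves
    # only where the cluster DNS is reachable.
    return app_name


if __name__ == "__main__":
    print(advertised_host(None, "redis-broker"))
    print(advertised_host("10.1.0.5", "redis-broker"))
```

Note that the fallback branch is exactly the case that would not work across models, since the app name only resolves inside the cluster.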
I think the charm should be looking at whether its IP has changed as part of hooks that might fire whenever this has happened. If it has changed, it should publish that change on the relation so relating charms can take appropriate action. We may need some help from the juju team to identify which hook(s) would be relevant here, but we've observed the following hooks running on a pod after a restart event that caused an IP change:
$ kubectl logs -n prod-events-k8s redis-broker-0 -c charm | grep ' ran ' | head -20
2023-06-02T04:09:46.486Z [container-agent] 2023-06-02 04:09:46 INFO juju.worker.uniter.operation runhook.go:159 ran "start" hook (via hook dispatching script: dispatch)
2023-06-02T04:09:59.610Z [container-agent] 2023-06-02 04:09:59 INFO juju.worker.uniter.operation runhook.go:159 ran "start" hook (via hook dispatching script: dispatch)
2023-06-02T04:10:15.313Z [container-agent] 2023-06-02 04:10:15 INFO juju.worker.uniter.operation runhook.go:159 ran "sentinel-pebble-ready" hook (via hook dispatching script: dispatch)
2023-06-02T04:10:50.723Z [container-agent] 2023-06-02 04:10:50 INFO juju.worker.uniter.operation runhook.go:159 ran "redis-pebble-ready" hook (via hook dispatching script: dispatch)
2023-06-02T04:11:22.162Z [container-agent] 2023-06-02 04:11:22 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:11:42.011Z [container-agent] 2023-06-02 04:11:41 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:12:37.407Z [container-agent] 2023-06-02 04:12:37 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:13:37.572Z [container-agent] 2023-06-02 04:13:37 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:14:31.409Z [container-agent] 2023-06-02 04:14:31 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:15:39.592Z [container-agent] 2023-06-02 04:15:39 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:15:55.376Z [container-agent] 2023-06-02 04:15:55 INFO juju.worker.uniter.operation runhook.go:159 ran "sentinel-pebble-ready" hook (via hook dispatching script: dispatch)
2023-06-02T04:16:53.755Z [container-agent] 2023-06-02 04:16:53 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:17:57.675Z [container-agent] 2023-06-02 04:17:57 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:18:58.588Z [container-agent] 2023-06-02 04:18:58 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:19:48.832Z [container-agent] 2023-06-02 04:19:48 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:20:50.449Z [container-agent] 2023-06-02 04:20:50 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:21:23.950Z [container-agent] 2023-06-02 04:21:23 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:22:23.503Z [container-agent] 2023-06-02 04:22:23 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:23:15.746Z [container-agent] 2023-06-02 04:23:15 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
2023-06-02T04:24:18.254Z [container-agent] 2023-06-02 04:24:18 INFO juju.worker.uniter.operation runhook.go:159 ran "update-status" hook (via hook dispatching script: dispatch)
Sounds like we'd either need to update the IP on the relation as part of the start hook or the redis-pebble-ready hook.
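A rough sketch of the "publish only if changed" idea, kept independent of the charm framework so it can be run on its own. Everything here is an assumption for illustration: the helper name publish_address_if_changed and the "hostname" key are hypothetical, and relation_data stands in for whatever mapping the charm writes its relation data to. In a real charm this would be called from the start and/or redis-pebble-ready handlers with the current bind address.

```python
from typing import MutableMapping, Optional


def publish_address_if_changed(
    relation_data: MutableMapping[str, str],
    current_address: Optional[str],
    key: str = "hostname",  # hypothetical key name
) -> bool:
    """Write current_address to relation_data only when it differs.

    Returns True when the relation data was updated, so the caller can
    log the change or take further action.
    """
    if not current_address:
        # Address not known yet (e.g. networking not ready); leave the
        # previously published value untouched.
        return False
    if relation_data.get(key) == current_address:
        # Nothing changed, so avoid a needless relation-changed event
        # on the remote side.
        return False
    relation_data[key] = current_address
    return True


if __name__ == "__main__":
    data = {"hostname": "10.1.0.5"}
    # Same address after a hook run: no update.
    print(publish_address_if_changed(data, "10.1.0.5"))
    # Pod was respawned with a new IP: update is published.
    print(publish_address_if_changed(data, "10.1.0.9"))
    print(data)
```

Calling it from both hooks should be harmless, since the second call is a no-op when the address has not changed.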
https://github.com/canonical/redis-operator/blob/master/src/charm.py#L165
This address can change if the pod is killed and respawned. I think a better solution would be to use either the Service IP or the app name. The app name is a valid DNS name in k8s clusters that have DNS running, and the lookup moves with the pod.
There may be a more preferred juju method of doing this that I'm not aware of.
https://bugs.launchpad.net/juju/+bug/1911135 looks to be related.