Open Abuelodelanada opened 10 months ago
Hi, SolQA has hit this issue several times with traefik charm rev 174 since last week. This might be a blocker for most SQA deployments. One of the runs is here: https://solutions.qa.canonical.com/testruns/ac962bb7-80e7-4891-bb70-8ff2a63bf7e6 We deploy COS on top of microk8s 1.28 with TLS, using juju 3.3.
At first glance, this looks like a pebble error after a can_connect guard, which we decided should lead to error status, because juju would retry and resolve.
If that is indeed the case, then the right thing to do would probably be to let juju resolve. https://discourse.charmhub.io/t/its-probably-ok-for-a-unit-to-go-into-error-state/13022
As discussed briefly in our sync earlier this week, we have enabled retries in our tests in the hope that it would resolve.
In some cases it may be better, but the sample size is still small. I am still seeing runs where it stays blocked until we time out after 3 hours. Here is the status log from one where it's been flapping back and forth for 30 min:
17 Apr 2024 17:24:11Z workload active
17 Apr 2024 17:24:12Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:24:14Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:24:20Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:24:21Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:24:31Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:24:33Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:24:52Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:24:54Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:25:34Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:25:35Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:26:58Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:27:00Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:28:52Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:28:53Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:28:59Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:29:00Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:29:08Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:29:10Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:29:15Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:29:16Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:29:26Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:29:27Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:29:47Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:29:49Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:30:29Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:30:30Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:31:50Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:31:51Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:34:34Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:34:35Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:39:35Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:39:36Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:44:36Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:44:37Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:49:38Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:49:39Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:54:39Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:54:40Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 17:59:40Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 17:59:41Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 18:04:42Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 18:04:43Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
17 Apr 2024 18:09:43Z juju-unit executing running receive-ca-cert-relation-changed hook for ca/0
17 Apr 2024 18:09:45Z juju-unit error hook failed: "receive-ca-cert-relation-changed"
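To put numbers on the retry cadence in a status log like the one above, here is a minimal Python sketch. The helper name `failure_gaps` is hypothetical, and it assumes every line starts with a timestamp in the `17 Apr 2024 17:24:14Z` format shown here.

```python
from datetime import datetime


def failure_gaps(status_log: str, hook: str) -> list[float]:
    """Return the seconds between consecutive 'hook failed' entries for `hook`.

    Assumes each line begins with a four-token timestamp such as
    "17 Apr 2024 17:24:14Z" (the format in the status log above).
    """
    marker = f'hook failed: "{hook}"'
    stamps = []
    for line in status_log.splitlines():
        parts = line.split()
        if len(parts) < 5 or marker not in line:
            continue  # skip "executing" lines and anything unrelated
        stamps.append(
            datetime.strptime(" ".join(parts[:4]), "%d %b %Y %H:%M:%SZ")
        )
    # Pair each failure with the next one and take the difference.
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Run against the log above with `hook="receive-ca-cert-relation-changed"`, the gaps start at a few seconds and grow, drop back to a few seconds around 17:29, and from 17:34:35 onward settle at roughly five minutes between failures.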
I can follow up with logs from this test after it times out, or logs from another test where we've had similar looping as they become available.
Logs from traefik: traefik.log
Bug Description
While deploying cos-lite (edge) using the TLS overlay I get the following error.
To Reproduce
juju deploy cos-lite --channel=edge --trust --overlay ./tls-overlay.yaml
juju debug-log
Environment
3.1.5
Relevant log output
Additional context
No response