canonical / cos-proxy-operator

https://charmhub.io/cos-proxy
Apache License 2.0

cos-proxy fails on hook "downstream-logging-relation-changed" #82

Closed by Sponge-Bas 3 months ago

Sponge-Bas commented 1 year ago

Bug Description

In SQA testrun 651a309e-a3a6-44ab-b8a7-7905303fbc0a, cos-proxy fails to install in hook "downstream-logging-relation-changed".

To Reproduce

To reproduce, deploy cos and then the charmed kubernetes bundle, which includes cos. This issue is not reliably reproducible; we have seen this bundle deploy without issues before.

Environment

The environment is a juju maas controller hosting a charmed kubernetes deployment. This deployment is connected to cos, which runs on microk8s hosted on the same juju maas controller.

Relevant log output

In the debug-log we see the following message:

2023-09-06 15:57:34 ERROR unit.cos-proxy/0.juju-log server.go:325 downstream-logging:35: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/usr/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.10/http/client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/usr/lib/python3.10/http/client.py", line 942, in connect
    self.sock = self._create_connection(
  File "/usr/lib/python3.10/socket.py", line 845, in create_connection
    raise err
  File "/usr/lib/python3.10/socket.py", line 833, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/./src/charm.py", line 475, in <module>
    main(COSProxyCharm)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/main.py", line 436, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/main.py", line 144, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/framework.py", line 354, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/framework.py", line 830, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/framework.py", line 919, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/lib/charms/vector/v0/vector.py", line 225, in _on_log_relation_changed
    self.on.config_changed.emit(config=self.config)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/framework.py", line 354, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/framework.py", line 830, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/framework.py", line 919, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/./src/charm.py", line 359, in _write_vector_config
    r = request.urlopen(dest)
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/usr/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 1377, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 111] Connection refused>

My guess is that there was a hiccup in the networking. If that is the case, though, I would expect it to resolve when the hook is retried.

Additional context

More logs and configs can be found here: https://oil-jenkins.canonical.com/artifacts/651a309e-a3a6-44ab-b8a7-7905303fbc0a/index.html

PietroPasotti commented 1 year ago

We'll do some digging. But can you confirm that the exact same exception occurs when the hook retries?

The solution will probably be to include ConnectionRefusedError in the catch group at https://github.com/canonical/cos-proxy-operator/blob/main/src/charm.py#L382

This might occur because we're trying to push stuff to loki before loki is ready. The question is: should we check that loki is ready and, if not, avoid trying? If we set Blocked and exit, when are we going to retry pushing the vector config?
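
For illustration, here is a minimal sketch of such a readiness check. It is not the charm's actual code: the function name `endpoint_ready`, the timeout value, and the example URL are all hypothetical. One detail from the traceback above matters here: `urlopen` surfaces the refused connection as `urllib.error.URLError` (with the `ConnectionRefusedError` as its cause), so a bare `except ConnectionRefusedError` around `urlopen` would not match. Catching `URLError` does.

```python
from urllib import error, request


def endpoint_ready(url: str, timeout: float = 5.0) -> bool:
    """Return True only if `url` answers with a 2xx status within `timeout`.

    Hypothetical helper; the name and defaults are assumptions for this sketch.
    """
    try:
        with request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (error.URLError, OSError):
        # URLError wraps lower-level failures such as ConnectionRefusedError;
        # HTTPError (non-2xx responses) is a subclass of URLError.
        # OSError is listed explicitly to cover socket-level errors such as
        # timeouts that can be raised without being wrapped.
        return False


if __name__ == "__main__":
    # Example URL is a placeholder, not a real endpoint from the charm.
    print(endpoint_ready("http://127.0.0.1:9/ready", timeout=1.0))
```

A caller could use the boolean to skip the push, set a status, or defer the event. Which of these is right for cos-proxy is exactly the open question above.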

sed-i commented 3 months ago

The urlopen check is no longer in the code, so this shouldn't be an issue anymore.