First of all I wanted to check how easy it is to switch to v2. I took prometheus, as it is one of the charms I recently updated tracing in. My branch is here: https://github.com/canonical/prometheus-k8s-operator/tree/tracing-v2 . Commands to deploy tempo and prometheus:
juju deploy --trust ./tempo-k8s_ubuntu-22.04-amd64.charm tempo --resource tempo-image=grafana/tempo:1.5.0
juju deploy --trust ./prometheus-k8s_ubuntu-20.04-amd64.charm --resource prometheus-image=ubuntu/prometheus:2-22.04 && juju integrate tempo prometheus-k8s:tracing
There might be a bit of a race condition: although I initialized tracing with
self.tracing = TracingEndpointRequirer(self, protocols=["otlp_http"])
and exposed the endpoint through this property:
@property
def tempo(self) -> Optional[str]:
"""Tempo endpoint for charm tracing."""
return self.tracing.get_endpoint("otlp_http")
I got this exception:
unit-prometheus-0: 17:06:42 WARNING unit.prometheus/0.juju-log <class '__main__.PrometheusCharm'>.<property object at 0x7f943cccd590> returned None; continuing with tracing DISABLED.
unit-prometheus-k8s-0: 17:07:09 INFO juju.worker.uniter awaiting error resolution for "install" hook
unit-prometheus-k8s-0: 17:07:10 INFO unit.prometheus-k8s/0.juju-log Running legacy hooks/install.
unit-prometheus-k8s-0: 17:07:10 ERROR unit.prometheus-k8s/0.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
File "./src/charm.py", line 1074, in <module>
main(PrometheusCharm)
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/ops/main.py", line 444, in main
charm = charm_class(framework)
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 319, in wrap_init
tracing_endpoint = _get_tracing_endpoint(tracing_endpoint_getter, self, charm)
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 241, in _get_tracing_endpoint
tracing_endpoint = tracing_endpoint_getter.__get__(self)
File "./src/charm.py", line 1065, in tempo
return self.tracing.get_endpoint("otlp_http")
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/lib/charms/tempo_k8s/v2/tracing.py", line 744, in get_endpoint
raise ProtocolNotRequestedError(protocol, relation)
charms.tempo_k8s.v2.tracing.ProtocolNotRequestedError: ('otlp_http', None)
After I guarded the getter with self.tracing.is_ready(), roughly like this:
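@property
def tempo(self) -> Optional[str]:
    """Tempo endpoint for charm tracing, or None while the relation isn't ready."""
    # sketch of the guard I added; the exact shape in my branch may differ slightly
    if self.tracing.is_ready():
        return self.tracing.get_endpoint("otlp_http")
    return None
the crash on install went away, but I was still getting connection issues when exporting spans: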
unit-prometheus-k8s-0: 17:36:03 ERROR unit.prometheus-k8s/0.juju-log Exception while exporting Span batch.
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/connection.py", line 198, in _new_conn
sock = connection.create_connection(
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/connectionpool.py", line 793, in urlopen
response = self._make_request(
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/connectionpool.py", line 496, in _make_request
conn.request(
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/connection.py", line 400, in request
self.endheaders()
File "/usr/lib/python3.8/http/client.py", line 1251, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.8/http/client.py", line 1011, in _send_output
self.send(msg)
File "/usr/lib/python3.8/http/client.py", line 951, in send
self.connect()
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/connection.py", line 238, in connect
self.sock = self._new_conn()
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/connection.py", line 213, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fd2342b4640>: Failed to establish a new connection: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/connectionpool.py", line 847, in urlopen
retries = retries.increment(
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='tempo-0.tempo-endpoints.cos.svc.cluster.local', port=4318): Max retries exceeded with url: /v1/traces (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd2342b4640>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/opentelemetry/sdk/trace/export/__init__.py", line 368, in _export_batch
self.span_exporter.export(self.spans_list[:idx]) # type: ignore
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 145, in export
resp = self._export(serialized_data)
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/opentelemetry/exporter/otlp/proto/http/trace_exporter/__init__.py", line 114, in _export
return self._session.post(
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/requests/sessions.py", line 637, in post
return self.request("POST", url, data=data, json=json, **kwargs)
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/var/lib/juju/agents/unit-prometheus-k8s-0/charm/venv/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='tempo-0.tempo-endpoints.cos.svc.cluster.local', port=4318): Max retries exceeded with url: /v1/traces (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd2342b4640>: Failed to establish a new connection: [Errno 111] Connection refused'))
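A bare POST to the same URL the exporter uses (taken from the traceback above) should reproduce the refusal independently of the OTLP exporter; a minimal check, assuming requests is available on the unit:

import requests

# URL copied from the traceback; expect a ConnectionError if port 4318
# is not actually open on the tempo service.
URL = "http://tempo-0.tempo-endpoints.cos.svc.cluster.local:4318/v1/traces"
try:
    requests.post(URL, timeout=2)
except requests.exceptions.ConnectionError as exc:
    print(f"receiver unreachable: {exc}")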
Relation data from jhack show-relation prometheus-k8s tempo:tracing:
│ │
│ ╭───────────────────────────────────────── locals ─────────────────────────────────────────╮ │
│ │ local_endpoint = 'tracing' │ │
│ │ matches = [] │ │
│ │ obj = 'tempo:tracing' │ │
│ │ other_obj = 'prometheus-k8s' │ │
│ │ relation = Relation( │ │
│ │ │ provider='tempo', │ │
│ │ │ provider_endpoint='tracing', │ │
│ │ │ requirer='prometheus-k8s', │ │
│ │ │ requirer_endpoint='tracing', │ │
│ │ │ interface='tracing', │ │
│ │ │ raw_type='regular' │ │
│ │ ) │ │
│ │ relations = [ │ │
│ │ │ { │ │
│ │ │ │ 'relation-id': 1, │ │
│ │ │ │ 'endpoint': 'tracing', │ │
│ │ │ │ 'related-endpoint': 'tracing', │ │
│ │ │ │ 'application-data': { │ │
│ │ │ │ │ 'host': '"tempo-0.tempo-endpoints.cos.svc.cluster.local"', │ │
│ │ │ │ │ 'receivers': '[{"protocol": "otlp_http", "port": 4318}]' │ │
│ │ │ │ }, │ │
│ │ │ │ 'local-unit': {'in-scope': False, 'data': None}, │ │
│ │ │ │ 'related-units': { │ │
│ │ │ │ │ 'tempo/0': { │ │
│ │ │ │ │ │ 'in-scope': True, │ │
│ │ │ │ │ │ 'data': { │ │
│ │ │ │ │ │ │ 'egress-subnets': '10.152.183.113/32', │ │
│ │ │ │ │ │ │ 'ingress-address': '10.152.183.113', │ │
│ │ │ │ │ │ │ 'private-address': '10.152.183.113' │ │
│ │ │ │ │ │ } │ │
│ │ │ │ │ } │ │
│ │ │ │ } │ │
│ │ │ } │ │
│ │ ] │ │
│ │ remote_endpoint = None │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
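If I read the v2 databag right, get_endpoint presumably builds the URL from the host in the application data plus the port of the matching receiver; a hypothetical reconstruction of that logic:

import json

# values copied verbatim from the databag above (note they are JSON-encoded)
host = json.loads('"tempo-0.tempo-endpoints.cos.svc.cluster.local"')
receivers = json.loads('[{"protocol": "otlp_http", "port": 4318}]')
port = next(r["port"] for r in receivers if r["protocol"] == "otlp_http")
endpoint = f"http://{host}:{port}"  # matches the URL in the traceback

So the requirer side resolves to the same host and port the exporter fails to reach, which points at the provider side.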
Maybe there's an issue with port opening? It doesn't look like port 4318 is exposed by the service:
$ kubectl get services -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 31h
kube-system kube-dns ClusterIP 10.152.183.10 <none> 53/UDP,53/TCP,9153/TCP 31h
metallb-system webhook-service ClusterIP 10.152.183.24 <none> 443/TCP 31h
controller-microk8s controller-service ClusterIP 10.152.183.46 <none> 17070/TCP 30h
controller-microk8s modeloperator ClusterIP 10.152.183.209 <none> 17071/TCP 30h
pietro modeloperator ClusterIP 10.152.183.162 <none> 17071/TCP 6m34s
pietro tempo ClusterIP 10.152.183.63 <none> 65535/TCP 6m12s
pietro tempo-endpoints ClusterIP None <none> <none> 6m11s
pietro prometheus-k8s-endpoints ClusterIP None <none> <none> 3m42s
pietro prometheus-k8s ClusterIP 10.152.183.115 <none> 9090/TCP 3m43s
while the one from edge seems to expose a ton of ports:
pietro tempo-k8s ClusterIP 10.152.183.99 <none> 3200/TCP,4317/TCP,4318/TCP,9411/TCP,14268/TCP,14250/TCP 40s
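Assuming the fix belongs on the Tempo side, the charm presumably needs to open the receiver ports on its unit so that Juju adds them to the Kubernetes service; a minimal sketch, assuming a recent ops with Unit.set_ports and that the port list comes from the requested receivers:

# hypothetical helper for the Tempo charm: open every receiver port we
# serve, so Juju exposes them on the K8s service. set_ports replaces the
# complete set of opened ports in one call.
RECEIVER_PORTS = {"otlp_grpc": 4317, "otlp_http": 4318}

def _open_receiver_ports(self) -> None:
    self.unit.set_ports(*RECEIVER_PORTS.values())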
Tandem PR: https://github.com/canonical/charm-relation-interfaces/pull/136