Closed shayancanonical closed 3 weeks ago
I added last week functionality to only pop that error once per event. But you're still going to get the error if you give charm_tracing a path to a cert but the path isn't valid, or if you're not passing a cert but tempo is expecting https communication.
Indeed, charm_tracing has no knowledge of the relations your charm has. All it knows is whether you're passing a cert_path to it or not. If yes, it attempts to use it to talk to an https endpoint and fails if either the file isn't there, or the server rejects the connection.
We're aware of a race due to the timing of the root span creation. Issue is, the charm sets up the cert after init (while handling the certificates-relation-something event), but charm_tracing does its thing BEFORE charm init, so it will always give an error the first time. It should go away with the next event though.
Are you seeing this event after the relation is healthy and the cert has been written to disk?
@mmkay we could consider making that error less verbose and just saying 'failed to send trace over https', and put the whole traceback in debug. WDYT?
Can confirm that we are seeing this error trace when the relation with the certificates operator has not yet been formed and the relation is not healthy if it has been formed
Consideration: would it be too convoluted for the trace_charm
decorator to have a healthcheck like kwarg (similar to how we do self.tracing.is_ready()
before returning a tracing endpoint), and only attempt to export traces to tempo if the healthcheck passes? the TracingEndpointRequirer
can expose a method that returns whether the tempo unit has HTTPS enabled, which can be used in the healthcheck property (on the client charm code) to confirm that the relation with the certificates operator exists and that the CA cert has been saved to disk. Furthermore, we can move the self.tracing.is_ready()
check to the healthcheck property as well
we can silence the warnings, that's not the issue, the question is how to do that without hiding too many actual errors. If your tracing instrumentation isn't sending traces, you need to know. Maybe getting a couple of false positives every time you are setting it up is unavoidable.
It feels hard to generalize what that healthcheck should look like, and you can already disable charm tracing by returning None:
from charms.tempo_k8s.v1.charm_tracing import trace_charm
@trace_charm(
tracing_endpoint="my_tracing_endpoint",
)
class MyCharm(CharmBase):
...
@property
def charm_tracing_ready():
# custom healthcheck
return tracing.is_ready() and self.certificates.is_ready() and self.has_written_cert_to_disk() and ... # custom checks
@property
def my_tracing_endpoint(self) -> Optional[str]:
if self.charm_tracing_ready:
return self.tracing.otlp_http_endpoint()
else:
return None # charm tracing disabled for the run
does this work for you? Or does it not solve the issue?
@PietroPasotti you're right, i hadnt thought of returning None
for the tracing endpoint if the charm is not ready to send traces (i.e. the certificates relation is not in place). however, how would the charm be able to tell if tempo-k8s is using a TLS cert from a cert operator? would it still make sense to have TracingEndpointRequirer
provide this information?
I think we have a neat solution cooking, stay tuned https://github.com/canonical/tempo-k8s-operator/pull/135
Bug Description
When tempo-k8s is related to a certificates operator, the charm_tracing lib produces a flood of errors when the client application (mysql) is related to tempo-k8s but not the certificates operator. My understanding is that the
trace_charm
decorator does not check if the relation with the certificates operator exists before trying to export spans to tempo-k8s (it always calls and utilizes theserver_cert
kwarg in thetrace_charm
decorator)To Reproduce
In a microk8s model: 1.git clone git@github.com:canonical/cos-lite-bundle.git
In an lxd model:
Environment
juju: 3.4.2 lxd: 5.21.1 LTS MicroK8s v1.27.13 revision 6744 cert_handler.py: v1.8
Relevant log output
When
None
is passed in asserver_cert
to thetrace_charm
decoratorWhen related with tempo-k8s, but no relation with
self-signed-certificates
while providing aserver_cert
kwarg:Additional context
No response