amargherio / mechanic

An AKS node monitor for planned maintenance events
Apache License 2.0

Improve observability #16

Open amargherio opened 2 weeks ago

amargherio commented 2 weeks ago

OpenTelemetry would let us collect metrics and traces from mechanic and surface them, if desired, for better insight into what it is doing internally. It would also let us use correlation IDs to tie log entries back to the specific node update that triggered them.

Update 2024-Oct-2: Adding some more detail here on how this implementation probably looks:

We can put a few milestones together based on the above:

amargherio commented 1 week ago

Potentially gets resolved as part of #19