elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[task manager][meta] improve task manager performance logging #109941

Open pmuellr opened 2 years ago

pmuellr commented 2 years ago

PR https://github.com/elastic/kibana/pull/109741 removes some of the observability of task manager by "hiding" the "potential performance problem" log warning: the message is now logged at debug level instead.

The original issue that spawned the PR is https://github.com/elastic/kibana/issues/109095; it contains references to other issues where this message has appeared and caused undue alarm.

It would be nice to "promote" this message back to a "warn", but I think we need to feel pretty confident that the message is only logged when we really know we have a problem.
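One way to picture "only logged when we really know we have a problem" is to require the condition to persist across several consecutive health checks before escalating. The sketch below is purely illustrative: the class name, the threshold, and the idea of counting consecutive detections are all assumptions, not how task manager currently decides what to log.

```typescript
// Hypothetical helper (not part of Kibana): escalate to "warn" only after the
// problem condition has been seen on N consecutive health checks; otherwise
// keep logging at "debug".
type LogLevel = 'debug' | 'warn';

class SustainedProblemGate {
  private consecutive = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Record one health check result and return the level to log at.
  record(problemDetected: boolean): LogLevel {
    // Any healthy check resets the streak.
    this.consecutive = problemDetected ? this.consecutive + 1 : 0;
    return this.consecutive >= this.threshold ? 'warn' : 'debug';
  }
}
```

With a threshold of 3, a single noisy reading stays at debug, and only the third detection in a row is logged as a warning.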

Some specific problems we've seen:

pmuellr commented 2 years ago

I should mention, it's possible that some of the changes made in PR https://github.com/elastic/kibana/pull/109741 will end up improving the situation - for instance, cutting down on the number of times the message is generated over time (because we relaxed the conditions considered problematic). But I think we'll need to see over time.

elasticmachine commented 2 years ago

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

pmuellr commented 2 years ago

I think this is going to become a meta issue; I realized I needed a place to vent my other concerns regarding the task manager perf logging:

pmuellr commented 2 years ago

I was also thinking yesterday it might be useful to make use of the event log. But I think it would have to be conditional, otherwise it's going to get REAL busy.

What would we add? I'm not sure. It might be a good place to put the health documents, but I think they would have to go in a new field, mapped either as an object with enabled: false or perhaps as flattened. Or maybe they need a different shape that would be better for KQL queries. Task start/end documents might be good. I'm also wondering if we could use this to better estimate the number of active Kibanas.
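To make the two mapping options concrete, here is a minimal sketch of how they differ. The type and constant names are hypothetical and are not taken from the event log schema; only the Elasticsearch mapping settings themselves (an object field with enabled: false, and the flattened field type) are real.

```typescript
// Hypothetical mapping fragments for a health field in the event log.
// Names are illustrative only; they do not come from the Kibana event log schema.
type MappingProperty =
  | { type: 'object'; enabled: boolean }
  | { type: 'flattened' };

// Option 1: keep the health document in _source but do not index it.
// It can be retrieved with the event, but not searched or aggregated on.
const healthAsDisabledObject: MappingProperty = { type: 'object', enabled: false };

// Option 2: index the whole document as one flattened field.
// Leaf values are indexed as keywords, so exact-match queries on subkeys work.
const healthAsFlattened: MappingProperty = { type: 'flattened' };

// Whether values inside the field can be queried at all.
function isSearchable(prop: MappingProperty): boolean {
  return prop.type === 'flattened' || prop.enabled;
}
```

The trade-off is that the disabled object is cheapest to store but invisible to KQL, while flattened allows queries on subkeys but treats every leaf value as a keyword.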