elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Attach a long event loop delay "span" to an APM transaction #128647

Open mshustov opened 2 years ago

mshustov commented 2 years ago

Since v7.14 (PR), Kibana reports a warning when the mean event loop delay exceeds 350ms. This helps users spot a performance problem, but not investigate it, because the runtime context is absent.
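For illustration, a warning of this kind can be approximated with Node's `perf_hooks.monitorEventLoopDelay`. This is a minimal sketch, not Kibana's actual implementation: the helper names (`delayWarning`, `startDelayMonitor`) and the sampling interval are assumptions, and only the 350ms threshold comes from the description above.

```typescript
import { monitorEventLoopDelay } from 'node:perf_hooks';

// Threshold quoted in this issue; everything else here is hypothetical.
const WARN_THRESHOLD_MS = 350;

// The histogram reports nanoseconds, so convert before comparing.
function delayWarning(meanNs: number, thresholdMs: number = WARN_THRESHOLD_MS): string | null {
  const meanMs = meanNs / 1e6;
  return meanMs > thresholdMs
    ? `Event loop mean delay ${meanMs.toFixed(0)}ms exceeds ${thresholdMs}ms threshold`
    : null;
}

// Samples the event loop delay and checks the mean once per interval.
// Returns a function that stops the monitor.
function startDelayMonitor(intervalMs: number, onWarn: (msg: string) => void): () => void {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  const timer = setInterval(() => {
    const msg = delayWarning(histogram.mean);
    if (msg !== null) onWarn(msg);
    histogram.reset(); // start a fresh window for the next interval
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}
```

Note that the warning only carries a number: nothing in the histogram says which task or request caused the delay, which is the gap this issue is about.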

To overcome the problem, we can borrow a few ideas from [this article](https://www.ashbyhq.com/blog/engineering/detecting-event-loop-blockers). TL;DR: the server can capture the runtime context of expensive tasks by implementing a custom async hook that tracks the duration of each task and attaches it to an APM transaction. This would allow Cloud customers to quickly identify which APM transaction triggers CPU-bound tasks on the Kibana server.
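The async hook idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's or Kibana's code: `trackLongTasks`, `LongTaskReport` and `demo` are hypothetical names, and the `onLongTask` callback stands in for whatever would attach a span to the current APM transaction.

```typescript
import { AsyncResource, createHook } from 'node:async_hooks';
import { performance } from 'node:perf_hooks';

// Hypothetical report shape; a real integration would map this to an APM span.
type LongTaskReport = { asyncId: number; durationMs: number };

// Measures how long each async callback runs synchronously and reports
// the ones that reach the threshold. Returns a function that disables the hook.
function trackLongTasks(
  thresholdMs: number,
  onLongTask: (report: LongTaskReport) => void
): () => void {
  const startedAt = new Map<number, number>();
  const hook = createHook({
    before(asyncId: number) {
      startedAt.set(asyncId, performance.now());
    },
    after(asyncId: number) {
      const start = startedAt.get(asyncId);
      if (start === undefined) return;
      startedAt.delete(asyncId);
      const durationMs = performance.now() - start;
      // Avoid async work (such as console.log) in here: it would re-enter the hook.
      if (durationMs >= thresholdMs) onLongTask({ asyncId, durationMs });
    },
  });
  hook.enable();
  return () => hook.disable();
}

// Usage sketch: runInAsyncScope fires the before/after hooks synchronously,
// which makes the tracker easy to exercise.
function demo(): LongTaskReport[] {
  const reports: LongTaskReport[] = [];
  const stop = trackLongTasks(20, (report) => reports.push(report));
  new AsyncResource('demo-task').runInAsyncScope(() => {
    const end = Date.now() + 50;
    while (Date.now() < end) {
      // simulate CPU-bound work
    }
  });
  stop();
  return reports;
}
```

Enabling async hooks has a runtime cost of its own, so the threshold and whether the hook is always on would need measuring before shipping anything like this.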

elasticmachine commented 2 years ago

Pinging @elastic/kibana-core (Team:Core)

gsoldevila commented 9 months ago

Status update

gsoldevila commented 9 months ago

The article mentioned in the description pinpoints 2 main scenarios that can cause event loop delays:

Perhaps rather than trying to detect the blocks with a timer-based strategy, we could try to calculate them per request. We know that most of the flows are [Browser => ] Kibana => ES, so:

  1. We identify the entrypoints of the different use cases.
  2. AsyncLocalStorage could help us keep track of the total request time. Subtracting the time spent waiting on HTTP requests to ES then gives the "self" time for each use case.
  3. We can WARN when the self time exceeds a certain threshold.
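The three steps above could look roughly like this. It is a sketch under assumptions, not an existing Kibana API: `withSelfTime`, `trackEsCall` and `SelfTimeWarning` are hypothetical names, and ES calls are assumed to be sequential (concurrent calls would be double counted).

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import { performance } from 'node:perf_hooks';

// Hypothetical per-request bookkeeping kept in AsyncLocalStorage.
interface RequestTiming {
  name: string;
  startedAt: number;
  esTimeMs: number;
}

interface SelfTimeWarning {
  name: string;
  selfTimeMs: number;
}

const timingStorage = new AsyncLocalStorage<RequestTiming>();

// Steps 1 and 3: wrap an entrypoint, then warn if its self time exceeds the threshold.
async function withSelfTime<T>(
  name: string,
  thresholdMs: number,
  handler: () => Promise<T>,
  warn: (warning: SelfTimeWarning) => void
): Promise<T> {
  const timing: RequestTiming = { name, startedAt: performance.now(), esTimeMs: 0 };
  return timingStorage.run(timing, async () => {
    try {
      return await handler();
    } finally {
      const totalMs = performance.now() - timing.startedAt;
      const selfTimeMs = totalMs - timing.esTimeMs;
      if (selfTimeMs > thresholdMs) warn({ name, selfTimeMs });
    }
  });
}

// Step 2: wrap each outbound ES call so its duration is excluded from self time.
async function trackEsCall<T>(call: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await call();
  } finally {
    const timing = timingStorage.getStore();
    if (timing !== undefined) timing.esTimeMs += performance.now() - start;
  }
}
```

A handler would call `trackEsCall(() => client.search(...))` (with `client` being whatever ES client is in use) inside `withSelfTime('my-route', 100, handler, warn)`. The computed self time is wall-clock time, so it also includes any time the request spent waiting for the event loop.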

UPDATE: Unfortunately, this does not guarantee that a "blocked" request is the culprit.

pgayvallet commented 3 months ago

@gsoldevila is there anything we can do on this issue, or should we just close it as won't do?