apache / skywalking

APM, Application Performance Monitoring System
https://skywalking.apache.org/
Apache License 2.0

[Feature] Add execution tracing with performance for MQE engine and BanyanDB storage #12249

Closed wu-sheng closed 2 months ago

wu-sheng commented 4 months ago

Description

By adopting BanyanDB as first-class storage, we are going to keep adding observability to ourselves, as we now have more capability to optimize performance. So, after a discussion with @hanahmily, I want to propose a new debugging tool for MQE queries.

Use case

MQE has been the most important query engine for several versions. Since v10, MQE + BanyanDB storage has been the recommended combination. To help end users, we hope to make performance easier to diagnose by collecting the execution context and the end-to-end performance costs.

To implement that, we need to add the following.

Related issues

No response

wu-sheng commented 4 months ago

As most queries are very fast, we are going to use ns (1e-9 of a second) as the time unit. Anything less than 1 ns can be ignored.
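
A minimal sketch of recording a step's cost in nanoseconds, assuming plain Java and `System.nanoTime()`. The `NanoTimer` and `Timed` names are hypothetical and are not part of the OAP code base; this only illustrates the time unit, not the actual implementation.

```java
import java.util.function.Supplier;

// Hypothetical helper: wraps a piece of work and reports its cost in ns.
final class NanoTimer {
    /** Holds the result of the work together with its elapsed time. */
    static final class Timed<T> {
        final T value;
        final long durationNs;

        Timed(T value, long durationNs) {
            this.value = value;
            this.durationNs = durationNs;
        }
    }

    /** Runs the work once and measures it with the monotonic nanosecond clock. */
    static <T> Timed<T> time(Supplier<T> work) {
        long start = System.nanoTime();
        T value = work.get();
        long elapsed = System.nanoTime() - start;
        return new Timed<>(value, elapsed);
    }

    public static void main(String[] args) {
        Timed<Integer> t = time(() -> 21 * 2);
        System.out.println("value=" + t.value + ", duration=" + t.durationNs + " ns");
    }
}
```

`System.nanoTime()` is only meaningful as a difference between two readings in the same JVM, which is all a per-step duration needs.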

wu-sheng commented 4 months ago

For example, when we run the MQE expression service_percentile{p='50,75'} - avg(service_percentile{p='50,75'}), the execution trace should look like this:

- MQE expression, service_percentile{p='50,75'} - avg(service_percentile{p='50,75'})
  - duration: 100 ns
  - queries
    - MQE syntax analysis
      - duration: 10 ns
      - error:
    - readMetricsValues(service_percentile)
      - duration:
      - server-side traces....
    - /* Multiple metrics queries if needed. */
    - /* We need to consider flagging concurrent queries if MQE supports running in that mode. */
    - In-memory calculation
      - error:
      - duration:
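
The tree above could be modeled roughly as nested nodes, each carrying an operation name, a duration in ns, and an optional error. This is only a sketch: `TraceNode` and `TraceDemo` are hypothetical names, and every duration other than the 100 ns and 10 ns from the example is an illustrative placeholder, not a measurement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one step in an MQE execution trace.
final class TraceNode {
    final String operation;
    long durationNs;
    String error; // null when the step succeeded
    final List<TraceNode> children = new ArrayList<>();

    TraceNode(String operation) {
        this.operation = operation;
    }

    /** Adds a child step and returns it so the caller can fill in its fields. */
    TraceNode child(String operation, long durationNs) {
        TraceNode c = new TraceNode(operation);
        c.durationNs = durationNs;
        children.add(c);
        return c;
    }

    /** Renders the node and its children as an indented list. */
    String render() {
        StringBuilder sb = new StringBuilder();
        render(sb, 0);
        return sb.toString();
    }

    private void render(StringBuilder sb, int depth) {
        String pad = "  ".repeat(depth);
        sb.append(pad).append("- ").append(operation).append('\n');
        sb.append(pad).append("  - duration: ").append(durationNs).append(" ns\n");
        if (error != null) {
            sb.append(pad).append("  - error: ").append(error).append('\n');
        }
        for (TraceNode c : children) {
            c.render(sb, depth + 1);
        }
    }
}

final class TraceDemo {
    static TraceNode sample() {
        TraceNode root = new TraceNode("MQE expression");
        root.durationNs = 100; // from the example above
        root.child("MQE syntax analysis", 10); // from the example above
        root.child("readMetricsValues(service_percentile)", 70); // placeholder value
        root.child("In-memory calculation", 20); // placeholder value
        return root;
    }

    public static void main(String[] args) {
        System.out.print(sample().render());
    }
}
```

Server-side traces from the storage would hang under the `readMetricsValues` node as further children in this model.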

wu-sheng commented 4 months ago

Besides the server-side (OAP) response, the UI should trace the query cost from the browser's perspective, which provides extra information on whether a query is slow because it is pending on the network or waiting in the HTTP server queue.
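
One way to read the two numbers together, as a sketch only: subtract the server-reported duration from the duration observed by the client, and treat the remainder as time spent outside the OAP query itself (network, HTTP queueing, and so on). The `QueryCostBreakdown` class and its method are hypothetical, the subtraction is an assumption about how the values might be compared, and a real UI would take the client-side reading in the browser rather than in Java.

```java
// Hypothetical helper comparing client-observed and server-reported costs.
final class QueryCostBreakdown {
    /**
     * Time not accounted for by the server-side trace, in ns.
     * Clamped at zero, since the two readings come from different clocks.
     */
    static long outsideServerNs(long clientObservedNs, long serverReportedNs) {
        return Math.max(0L, clientObservedNs - serverReportedNs);
    }

    public static void main(String[] args) {
        long clientObservedNs = 5_000_000L; // illustrative: 5 ms seen by the client
        long serverReportedNs = 100L;       // illustrative: 100 ns reported by the trace
        System.out.println(outsideServerNs(clientObservedNs, serverReportedNs) + " ns outside the server");
    }
}
```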

hanahmily commented 4 months ago

BanyanDB server-side is related to #10561.

wu-sheng commented 3 months ago

> BanyanDB server-side is related to #10561.

I have approved that. It should be easy to integrate. The only thing is, @hanahmily, we need the new client version to make the code work.