perf: eliminate flamegraph merge

In distributed deployments with highly parallel sub-range queries, the query frontend quickly becomes a bottleneck. In the screenshot below, approximately 5.3 seconds out of a total of 7 were spent on flame graph merges:

The problem is that all sub-range query results are merged synchronously in a single thread. The proper solution is to enable DAG query execution, so that merges are distributed among queriers; this is planned for the near future.

In the meantime, the merge can be made much cheaper by using trees instead of flame graphs (which are particularly inefficient to merge) in query-frontend to querier requests.
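To illustrate why merging trees is straightforward, here is a minimal sketch of an in-place call-tree merge. The `Node` type and its `Insert`, `Merge`, and `Total` methods are hypothetical names chosen for this example; they are not Pyroscope's actual types or wire format, and the real implementation may differ substantially.

```go
package main

// Node is a hypothetical call-tree node used only for illustration.
// It is NOT the type used by Pyroscope.
type Node struct {
	Name     string
	Self     int64            // samples attributed to this frame itself
	Children map[string]*Node // child frames keyed by function name
}

// NewNode returns an empty node with the given frame name.
func NewNode(name string) *Node {
	return &Node{Name: name, Children: map[string]*Node{}}
}

// Insert adds value to the node reached by following stack
// (outermost frame first) starting from n, creating nodes as needed.
func (n *Node) Insert(stack []string, value int64) {
	cur := n
	for _, frame := range stack {
		child, ok := cur.Children[frame]
		if !ok {
			child = NewNode(frame)
			cur.Children[frame] = child
		}
		cur = child
	}
	cur.Self += value
}

// Merge adds src into n in place. Children are matched by name,
// so each node of src is visited exactly once.
func (n *Node) Merge(src *Node) {
	n.Self += src.Self
	for name, srcChild := range src.Children {
		dstChild, ok := n.Children[name]
		if !ok {
			dstChild = NewNode(name)
			n.Children[name] = dstChild
		}
		dstChild.Merge(srcChild)
	}
}

// Total returns the node's self value plus that of all descendants.
func (n *Node) Total() int64 {
	total := n.Self
	for _, child := range n.Children {
		total += child.Total()
	}
	return total
}
```

In this sketch the frontend would keep one accumulator tree and call `Merge` once per sub-range response, converting to a flame graph only once, after the last merge.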

grafana/pyroscope: perf: eliminate flamegraph merge #3349