grafana / pyroscope

Continuous Profiling Platform. Debug performance issues down to a single line of code
https://grafana.com/oss/pyroscope/
GNU Affero General Public License v3.0
9.64k stars 574 forks source link

feat: optimize series order #3345

Closed kolesnikovae closed 3 weeks ago

kolesnikovae commented 3 weeks ago

Series order determines the physical placement of profiling data in blocks. It has been observed that the standard prometheus label and series order is not aligned with our access patterns, leading to an increase in the read amplification factor and query latency.

The PR proposes an option to enforce the optimized order. The main access pattern is a query that targets a specific profile/sample type of a specific service, therefore we want all the profiles matching such query to be stored sequentially. To achieve this, we ensure that the first labels of any series are __profile_type__ and __service_name__.

The change should be applied with great caution. Since we can only enforce the order in ingesters (__profile_type__ is not available in distributors), we need to ensure that all ingester instances either apply the optimization for a given profile, or not. Otherwise, there's possibility that the profile replicas will be handled differently, and the same series will have two representations: one with the standard order, and another one optimized; this will make it challenging or impossible to resolve the conflicts at query time and compaction. This is solved by "marking" profiles with the new private label __order__ in distributors, which indicates whether we need to enforce the new order, and is removed before ingestion. This will result in doubling the active series temporarily but will allow for the normal deduplication process.

N.B.: No changes are required in the query path, because the order does not affect "user" labels. For the Series call that lists profile types and services, the order does not change.

Thus, the change should be applied in two steps:

  1. Rollout the new version: all ingester instances must run the new version.
  2. Enable the enforce_labels_order limit.

Eventually, this optimization should become the default behaviour.

The optimization is only recommended if you are using user-defined labels (static or dynamic) that begin with _ or with a capital letter and are experiencing very poor query performance.

simonswine commented 3 weeks ago

Ok I have found the culprit, we probably need to loop throught he __order__ label to that point and remove it there:

--- a/pkg/phlaredb/tsdb/index.go
+++ b/pkg/phlaredb/tsdb/index.go
@@ -298,7 +298,7 @@ func (shard *indexShard) add(metric []*typesv1.LabelPair, fp model.Fingerprint)
                values.fps[fingerprints.value] = fingerprints
                internedLabels[i] = &typesv1.LabelPair{Name: values.name, Value: fingerprints.value}
        }
-       sort.Sort(internedLabels)
+       sort.Sort(phlaremodel.LabelsEnforcedOrder(internedLabels))
        return internedLabels
 }
simonswine commented 3 weeks ago

Update: I have pushed a related test and fix https://github.com/grafana/pyroscope/pull/3345/commits/ece08961d3ee88fcb3401d6b31a97174a75c0737

kolesnikovae commented 3 weeks ago

I've also noticed that the fix does not have any effect at the last moment. The issue is that initially, I modified the default labels order (therefore this last sort had no effect, and the order on disk was enforced), but then I made changes for migration that broke it – I was planning to look into this today. Thank you @simonswine for fixing this!