cortexproject / cortex

A horizontally scalable, highly available, multi-tenant, long term Prometheus.
https://cortexmetrics.io/
Apache License 2.0

Dynamic copy management to solve the problem of unbalanced load(mem) of Ingester nodes #5071

Open wgliang opened 1 year ago

wgliang commented 1 year ago

Is your feature request related to a problem? Please describe.

Cortex currently places replicas using consistent hashing, and the location of each replica (the Ingester node) is fixed when the metric data is first written. When some metrics carry a relatively large volume of samples, the load (memory) on different Ingester nodes varies greatly.

Describe the solution you'd like

Could we implement dynamic replicas (shards) that can be scheduled dynamically across Ingester nodes based on size, time slice, load, and similar factors?

Describe alternatives you've considered

Additional context

friedrichg commented 1 year ago

I am curious whether you are experiencing this with shard-by-all-labels set to true:

# Distribute samples based on all labels, as opposed to solely by user and
# metric name.
# CLI flag: -distributor.shard-by-all-labels
[shard_by_all_labels: <boolean> | default = false]
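
To illustrate why this setting matters for the imbalance described above, here is a minimal sketch of a toy hash ring. It is not Cortex's actual ring or hash function: the `token`, `build_ring`, `lookup`, and `series_per_ingester` helpers, the tenant and metric names, the token count, and the use of SHA-256 are all assumptions made for illustration, and replication is ignored (each series goes to one ingester only).

```python
import bisect
import hashlib
from collections import Counter


def token(key: str) -> int:
    """Map a string to a 32-bit ring token (toy hash, not Cortex's real one)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


def build_ring(ingesters, tokens_per_ingester=128):
    """Return a sorted list of (token, ingester) pairs."""
    ring = [
        (token(f"{name}-{i}"), name)
        for name in ingesters
        for i in range(tokens_per_ingester)
    ]
    ring.sort()
    return ring


def lookup(ring, key_token):
    """Pick the first ring entry with token >= key_token, wrapping around."""
    idx = bisect.bisect_left(ring, (key_token, ""))
    return ring[idx % len(ring)][1]


def series_per_ingester(ring, user, series, shard_by_all_labels):
    """Count how many series each ingester receives under a sharding mode."""
    counts = Counter()
    for metric, labels in series:
        # Default mode: the shard key is only the tenant and the metric name.
        key = f"{user}/{metric}"
        if shard_by_all_labels:
            # All-labels mode: every distinct label set gets its own key.
            key += "/" + ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
        counts[lookup(ring, token(key))] += 1
    return counts


ingesters = [f"ingester-{i}" for i in range(4)]
ring = build_ring(ingesters)

# One high-cardinality metric (10,000 series) plus 50 single-series metrics.
series = [("http_requests_total", {"pod": f"pod-{i}"}) for i in range(10_000)]
series += [(f"small_metric_{m}", {"pod": "pod-0"}) for m in range(50)]

by_name = series_per_ingester(ring, "tenant-a", series, shard_by_all_labels=False)
by_labels = series_per_ingester(ring, "tenant-a", series, shard_by_all_labels=True)

print("by user + metric name:", dict(by_name))
print("by all labels:        ", dict(by_labels))
```

In this toy model, sharding by user and metric name sends all 10,000 series of the large metric to a single ingester, while sharding by all labels spreads them across the ring. Whether the same effect explains the imbalance reported here depends on the actual configuration and data.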

jeromeinsf commented 1 year ago

Which configuration are you using for https://cortexmetrics.io/docs/configuration/arguments/#distributor ?

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.