cortexproject / cortex

A horizontally scalable, highly available, multi-tenant, long term Prometheus.
https://cortexmetrics.io/
Apache License 2.0
5.4k stars 781 forks source link

HA dedupe should not require a "cluster" label #1736

Closed bboreham closed 1 year ago

bboreham commented 4 years ago

I'm told the purpose of the cluster label is to allow differentiatiation between multiple Prometheus HA pairs in a tenant. Its name is set via -distributor.ha-tracker.cluster.

Cortex should support the scenario where there is only one HA pair in a tenant, by making the cluster label optional.

cstyan commented 4 years ago

@bboreham can you clarify the meaning of tenant here? Is it for a given user ID, or for one of those user ID's environments in their own networks. Such as a given k8s cluster or cloud region.

From how I'm reading it, the cluster label is still a requirement to differentiate between different HA pairs.

If you mean only a single HA pair for a user ID then I'd argue that we shouldn't support not requiring a cluster label as it's likely they will at some point have multiple HA pairs and if they didn't then remember to add a cluster label deduping would then stop working.

bboreham commented 4 years ago

Yes, what the code calls “user”, but sometimes “org”. Cortex is “multi-tenant”.

I’d argue that a significant percentage of users will never use multi-HA, and it’s no great problem to add the label for those that evolve in the way you describe.

weeco commented 4 years ago

I agree with Bryan, if you make it optional and set the default to let's say "unnamed" you could still add additional pairs.

We have ~20 tenants and we only use one HA pair in each of these. It's not a big deal to set it to something static though.

cstyan commented 4 years ago

Sorry I’m not sure I understand. You have one HA pair total, or one per cluster?

On Wednesday, October 23, 2019, Weeco notifications@github.com wrote:

I agree with Bryan, if you make it optional and set the default to let's say "unnamed" you could still add additional pairs.

We have ~20 tenants and we only use one HA pair in each of these. It's not a big deal to set it to something static though.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/cortexproject/cortex/issues/1736?email_source=notifications&email_token=AAYYTHHUYPTFLDNMTXDTNMDQQAMLFA5CNFSM4JBX3SKKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECAZAWA#issuecomment-545362008, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAYYTHCJWUXGMIFRO5CCYELQQAMLFANCNFSM4JBX3SKA .

weeco commented 4 years ago

We have one HA Pair (prometheus-1, prometheus-2) in each (kubernetes) cluster.

cstyan commented 4 years ago

Right okay, that's the exact situation where the cluster label is required...

weeco commented 4 years ago

Right now yes, that's the issue all about I think. Just as Bryan said, there will be users like me who have exactly one Prometheus pair for each x-scope-orgid. We do set the cluster label to the same label for all pairs which is fine because there is always just one pair in each tenant.

pracucci commented 4 years ago

Cortex should support the scenario where there is only one HA pair in a tenant, by making the cluster label optional.

I do understand the use case but, given the minimum extra effort to also set the cluster label, I'm wondering if it's worth making it optional.

I'm thinking about cases where a tenant just have 1 HA pair (and thus technically wouldn't need the cluster label), then after some time setup another HA pair (I bet copy-pasting the config already in use) and forget that now the cluster label is required on both HA pairs, spending extra time debugging it.

I'm following up this conversation to understand what we should do with https://github.com/cortexproject/cortex/pull/1737 (either reiterate on it or close it)

cstyan commented 4 years ago

I'm thinking about cases where a tenant just have 1 HA pair (and thus technically wouldn't need the cluster label), then after some time setup another HA pair (I bet copy-pasting the config already in use) and forget that now the cluster label is required on both HA pairs, spending extra time debugging it.

This was my point against making the cluster label optional as well.

jtlisi commented 4 years ago

Would a good middle ground be adding a default cluster label for anything that does not include one? We could put the feature behind a flag, but it would enable the use case without being a significant amount of work.

pracucci commented 4 years ago

Would a good middle ground be adding a default cluster label for anything that does not include one?

Isn't risky? If the user has multiple HA pairs and erroneously forget to set the cluster label, it would take time to find the mistake because the deduplication will work but not in the expected way.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.