Open pracucci opened 2 years ago
I'm guessing we need to make this two-phase: first tell the queriers to start querying the new ingesters, and only then start writing to those ingesters?
Or possibly one phase, but with a warm-up checkpoint before which distributors keep using the old ring?
Describe the bug
I've investigated some query result failures reported by the `mimir-continuous-test` tool and found an edge case where Mimir could temporarily return partial query results for the most recent data points when a tenant's ingester shard size is increased.
Scenario:
`ConfigMap`)
Actual outcome:
Investigation
The issue is caused by the fact that applying a runtime config change is not an atomic operation across multiple replicas (there's the `ConfigMap` update delay plus Mimir's periodic polling of the runtime config). If the change is applied to some distributors before all queriers, the most recent data points are written to the new ingesters (because the shard size is being increased), but queriers that haven't picked up the change yet don't query those ingesters. This causes partial query results for the most recent data points while the shard size increase is rolling out.