thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0
13.04k stars 2.09k forks

Query: deduplication for receivers only setup #7807

Open lachruzam opened 1 day ago

lachruzam commented 1 day ago

Is your proposal related to a problem?

In a deployment based only on receivers (no sidecars or HA Prometheus instances), each replica of a time series holds either the same data or a subset of it (for example, when an instance was unavailable for some period of time). Thanos Query uses the penalty deduplication algorithm, which fills in missing points from another replica only after the penalty window. As a result, query results contain gaps even when the data is available in the other replica. In such cases, I expect the deduplication algorithm to return all available data points.
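To make the effect concrete, here is a minimal toy model in Python. It is not the Thanos implementation: the function names `chain_merge` and `penalty_merge`, the fixed `penalty` parameter, and the sample data are all assumptions made for illustration. It only sketches how a penalty window can hide points that the other replica actually has, compared with a merge that keeps every timestamp from either replica.

```python
from typing import List, Optional, Tuple

Sample = Tuple[int, float]  # (timestamp, value)


def chain_merge(a: List[Sample], b: List[Sample]) -> List[Sample]:
    """Keep every timestamp present in either replica (replica a wins ties)."""
    merged = dict(b)
    merged.update(dict(a))
    return sorted(merged.items())


def penalty_merge(a: List[Sample], b: List[Sample], penalty: int) -> List[Sample]:
    """Toy penalty model (hypothetical, simplified): the replica that is not
    currently in use must be more than `penalty` ahead of the last emitted
    timestamp before its samples are considered."""
    replicas = [a, b]
    pos = [0, 0]
    current = 0
    last_t: Optional[int] = None
    out: List[Sample] = []
    while True:
        if last_t is not None:
            # Skip already-covered samples; the unused replica also skips
            # everything inside the penalty window.
            for k in (0, 1):
                threshold = last_t + (0 if k == current else penalty)
                while pos[k] < len(replicas[k]) and replicas[k][pos[k]][0] <= threshold:
                    pos[k] += 1
        candidates = [k for k in (0, 1) if pos[k] < len(replicas[k])]
        if not candidates:
            return out
        # Pick the earliest sample; on a tie, stay on the current replica.
        current = min(candidates, key=lambda k: (replicas[k][pos[k]][0], k != current))
        sample = replicas[current][pos[current]]
        out.append(sample)
        last_t = sample[0]


# Replica a was down between t=20 and t=50; replica b has every point.
replica_a = [(t, 1.0) for t in (0, 10, 20, 50, 60)]
replica_b = [(t, 1.0) for t in range(0, 70, 10)]

print([t for t, _ in penalty_merge(replica_a, replica_b, penalty=25)])
# [0, 10, 20, 50, 60]  (30 and 40 are dropped although replica b has them)
print([t for t, _ in chain_merge(replica_a, replica_b)])
# [0, 10, 20, 30, 40, 50, 60]
```

In this model the gap in replica a is shorter than the time needed for replica b to get past the penalty window, so the merged result never switches replicas and the two points are lost. A merge that takes the union of timestamps returns them.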

Describe the solution you'd like

Add a new flag to the query command that allows changing the deduplication algorithm. The standard Prometheus deduplication algorithm could be offered as the alternative.

Additional context

The screenshots show the problem as it occurs during a receiver restart.

[Screenshot: without-dedup]
[Screenshot: with-dedup]

lachruzam commented 1 day ago

PR: https://github.com/thanos-io/thanos/pull/7808