thanos-io / kube-thanos

Kubernetes specific configuration for deploying Thanos.
Apache License 2.0

DATA LOSS: thanos-compact deduplication is experimental and should not be enabled by default #290

Open jonasmatthias opened 1 year ago

jonasmatthias commented 1 year ago

Deduplication in thanos-compact should not be enabled by default because it is an experimental feature. The example configuration in kube-thanos enables offline deduplication in thanos-compact for Prometheus replicas but does not set the deduplication strategy intended for them, so the compactor falls back to its default algorithm. This leads to data loss, because deduplication is irreversible.

The documentation explains:

This is a common case when Prometheus HA replicas are used. You can enable this deduplication strategy via the --deduplication.func=penalty flag.

The description of the --deduplication.replica-label flag in the code also clarifies that the default deduplication algorithm should NOT be used on HA Prometheus replicas:

Label to treat as a replica indicator of blocks that can be deduplicated (repeated flag). This will merge multiple replica blocks into one. This process is irreversible. Experimental. When one or more labels are set, compactor will ignore the given labels so that vertical compaction can merge the blocks. Please note that by default this uses a NAIVE algorithm for merging which works well for deduplication of blocks with precisely the same samples like produced by Receiver replication. If you need a different deduplication algorithm (e.g one that works well with Prometheus replicas), please set it via --deduplication.func.

I learned about this via

Since #164, offline deduplication in the compactor has been enabled by default on the label prometheus_replica, but the flag --deduplication.func=penalty is not set.

https://github.com/thanos-io/kube-thanos/blob/6533e7c402ecedd81b68192585a897d6c13441d3/examples/all/manifests/thanos-compact-statefulSet.yaml#L40-L41
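If deduplication stays enabled, the compactor arguments would need both flags. The fragment below is only a sketch of that shape and is not copied from the linked manifest: the container name and the surrounding fields are assumptions, and only the two deduplication flags come from the text quoted above.

```yaml
# Illustrative fragment, assuming a typical StatefulSet container spec.
containers:
  - name: thanos-compact   # assumed container name
    args:
      - compact
      # Replica label that kube-thanos sets since #164.
      - --deduplication.replica-label=prometheus_replica
      # Strategy the documentation names for Prometheus HA replicas;
      # without it the compactor uses the default (naive) merge.
      - --deduplication.func=penalty
```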

It might be better to disable offline deduplication by default, because it is an experimental feature.