observatorium / thanos-receive-controller

Kubernetes controller to automatically configure Thanos receive hashrings
Apache License 2.0

Thanos Receive Controller

The Thanos Receive Controller configures multiple hashrings of Thanos receivers running as StatefulSets on Kubernetes.
Based on an initial mapping of tenants to hashrings, the controller identifies the Pods in each hashring and generates a complete configuration file as a ConfigMap.

Getting Started

First, provide an initial mapping of tenants to hashrings in a ConfigMap, e.g.:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: thanos-receive
  labels:
    app.kubernetes.io/name: thanos-receive
data:
  hashrings.json: |
    [
        {
            "hashring": "hashring0",
            "tenants": ["foo", "bar"]
        },
        {
            "hashring": "hashring1",
            "tenants": ["baz"]
        }
    ]
EOF

Next, deploy the controller, pointing it at the configuration file in the ConfigMap:

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-receive-controller
  labels:
    app.kubernetes.io/name: thanos-receive-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: thanos-receive-controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: thanos-receive-controller
    spec:
      containers:
      - args:
        - --configmap-name=thanos-receive
        - --configmap-generated-name=thanos-receive-generated
        - --file-name=hashrings.json
        image: quay.io/observatorium/thanos-receive-controller
        name: thanos-receive-controller
EOF

Finally, deploy StatefulSets of Thanos receivers labeled with controller.receive.thanos.io=thanos-receive-controller. The controller lists all of the StatefulSets with that label and matches the value of their controller.receive.thanos.io/hashring labels to the hashring names in the configuration file. The endpoints for each hashring will be populated automatically by the controller and the complete configuration file will be placed in a ConfigMap named thanos-receive-generated. This configuration should be consumed as a ConfigMap volume by the Thanos receivers.
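As an illustration of the labels and the ConfigMap volume described above, the following sketch writes a receiver StatefulSet for hashring0 to a file. The name thanos-receive-hashring0 (used for both the StatefulSet and its headless Service, which is not shown), the image tag, the mount paths, the replica count, and the emptyDir data volume are assumptions made for the example, not values required by the controller. It also assumes the generated ConfigMap keeps the hashrings.json key.

```shell
# Write an example receiver StatefulSet for hashring0 to a file for review.
cat > thanos-receive-hashring0.yaml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: thanos-receive-hashring0          # hypothetical name
  labels:
    # Lets the controller find this StatefulSet.
    controller.receive.thanos.io: thanos-receive-controller
    # Must match a hashring name in hashrings.json.
    controller.receive.thanos.io/hashring: hashring0
spec:
  serviceName: thanos-receive-hashring0   # assumed headless Service, not shown
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: thanos-receive
      controller.receive.thanos.io/hashring: hashring0
  template:
    metadata:
      labels:
        app.kubernetes.io/name: thanos-receive
        controller.receive.thanos.io/hashring: hashring0
    spec:
      containers:
      - name: thanos-receive
        image: quay.io/thanos/thanos:v0.32.5   # assumed tag
        args:
        - receive
        - --tsdb.path=/var/thanos/receive
        - --label=replica="$(NAME)"
        - --receive.local-endpoint=$(NAME).thanos-receive-hashring0.$(NAMESPACE).svc.cluster.local:10901
        # Read the hashring configuration generated by the controller.
        - --receive.hashrings-file=/etc/thanos/hashrings/hashrings.json
        env:
        - name: NAME
          valueFrom: {fieldRef: {fieldPath: metadata.name}}
        - name: NAMESPACE
          valueFrom: {fieldRef: {fieldPath: metadata.namespace}}
        volumeMounts:
        - name: hashrings
          mountPath: /etc/thanos/hashrings
        - name: data
          mountPath: /var/thanos/receive
      volumes:
      - name: hashrings
        configMap:
          name: thanos-receive-generated    # written by the controller
      - name: data
        emptyDir: {}                        # assumption; use persistent storage in practice
EOF
```

After reviewing the file, it can be applied with kubectl apply -f thanos-receive-hashring0.yaml. A second StatefulSet labeled with controller.receive.thanos.io/hashring=hashring1 would serve the other hashring from the initial mapping.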

About the --allow-only-ready-replicas flag

By default, upon a scale-up, the controller adds all new receiver replicas to the hashring as soon as they are in a running state. However, this means the new replicas start receiving requests from other replicas in the hashring before they are ready to accept them, because it can take some time until a receiver's storage is ready. Depending on your rollout strategy, you might see an increased failure rate in your hashring until enough replicas are in a ready state.

An alternative is to use the --allow-only-ready-replicas flag, which modifies this behavior: upon a scale-up, new replicas are added to the hashring only after they are confirmed to be ready.
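One possible way to turn the flag on for an already deployed controller is a JSON patch that appends it to the container args. The sketch below also writes a readiness probe patch for the receivers, on the assumption that the controller judges readiness from the Pod's Ready condition. It further assumes that the controller and the receiver are each the first container in their Pod and that the receivers serve Thanos' HTTP endpoints on the default port 10902. The file names are hypothetical.

```shell
# Patch for the controller Deployment: append the flag to the container args.
cat > controller-ready-replicas.patch.json <<'EOF'
[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--allow-only-ready-replicas"}
]
EOF

# Patch for a receiver StatefulSet: report readiness via Thanos' /-/ready endpoint.
# Assumes the receiver is the first container and serves HTTP on port 10902.
cat > receiver-readiness.patch.json <<'EOF'
[
  {"op": "add", "path": "/spec/template/spec/containers/0/readinessProbe",
   "value": {"httpGet": {"path": "/-/ready", "port": 10902}}}
]
EOF
```

The patches could then be applied with kubectl patch deployment thanos-receive-controller --type=json --patch-file controller-ready-replicas.patch.json and kubectl patch statefulset <name> --type=json --patch-file receiver-readiness.patch.json, where <name> is the receiver StatefulSet.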

About the --allow-dynamic-scaling flag

By default, the controller does not react to voluntary or involuntary disruptions to receiver replicas in the StatefulSet. The --allow-dynamic-scaling flag enables this behavior. When it is set, the controller removes a Pod from the hashring when the Pod is marked for termination, is deleted, or becomes unready. A Pod that is marked for termination and has been removed from the hashring essentially becomes a "router" for the hashring. This behaviour can be considered for use alongside the Ketama hashing algorithm.
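A minimal sketch of pairing the flag with Ketama hashing, written as two JSON patches. It assumes Ketama is selected on the receivers through Thanos' --receive.hashrings-algorithm flag and that the controller and the receiver are each the first container in their Pod. The file names are hypothetical.

```shell
# Patch for the controller Deployment: react to Pod disruptions.
cat > controller-dynamic-scaling.patch.json <<'EOF'
[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--allow-dynamic-scaling"}
]
EOF

# Patch for a receiver StatefulSet: distribute series with the Ketama algorithm.
cat > receiver-ketama.patch.json <<'EOF'
[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--receive.hashrings-algorithm=ketama"}
]
EOF
```

Each file can be passed to kubectl patch with --type=json --patch-file, targeting the controller Deployment and the receiver StatefulSets respectively.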

About the --use-az-aware-hashring flag

By default, the controller does not generate the AZ-aware hashring configuration introduced in Thanos v0.32+ (https://thanos.io/tip/components/receive.md/#az-aware-ketama-hashring-experimental). The --use-az-aware-hashring flag enables this behaviour. When it is set, the controller generates an AZ-aware hashring configuration based on the --pod-az-annotation-key flag: the value of that annotation on each Pod is used as the Pod's AZ name. If the annotation key is not specified, the StatefulSet name is used as the AZ field. Note that Thanos has to be upgraded to v0.32+ to work with the new hashring endpoint struct.
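The following sketch shows one way the two flags and the Pod annotation might fit together. The annotation key receive.example.com/az, the AZ value zone-a, and the file names are hypothetical. It assumes one receiver StatefulSet per AZ, so that all Pods of a StatefulSet share the annotation from its Pod template.

```shell
# Patch for the controller Deployment: enable AZ-aware output and name the
# annotation to read. The key receive.example.com/az is hypothetical.
cat > controller-az-aware.patch.json <<'EOF'
[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--use-az-aware-hashring"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--pod-az-annotation-key=receive.example.com/az"}
]
EOF

# Merge patch for one receiver StatefulSet: annotate its Pods with their AZ.
cat > receiver-az-annotation.patch.yaml <<'EOF'
spec:
  template:
    metadata:
      annotations:
        receive.example.com/az: zone-a
EOF
```

The first file can be applied with kubectl patch deployment thanos-receive-controller --type=json --patch-file controller-az-aware.patch.json, the second with kubectl patch statefulset <name> --type=merge --patch-file receiver-az-annotation.patch.yaml. In the Thanos v0.32+ hashring format, an endpoint can be an object with "address" and "az" fields instead of a plain string, so the generated entries for this StatefulSet would be expected to carry "az": "zone-a".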