infracloudio / krius

A tool to setup Prometheus, Thanos & friends across multiple clusters easily for scale
Apache License 2.0

Blueprints and Spec design #22

Open vishal-biyani opened 3 years ago

vishal-biyani commented 3 years ago

A lot of the groundwork for the details in this issue was done by @kanuahs in a doc he shared, and most of the diagrams were made by him. In most cases he is the original author and I am only building on his work here.

We will start with some examples of typical deployment blueprints to get a sense of the details, and then we will move on to defining a spec for blueprints.

  1. Global View only

In this blueprint, we install Prometheus and inject the sidecar. The Thanos part involves a querier pointed at all the Prometheus servers, with Grafana linked to the Thanos querier endpoint.

global-view-only

// Copy and paste in https://flowchart.fun:
Grafana
  PromQL: Query
    StoreAPI: Thanos Sidecar + Prometheus
    StoreAPI: Thanos Sidecar + Prometheus
    StoreAPI: Thanos Sidecar + Prometheus

  2. Sidecar

sidecar

// Copy and paste in https://flowchart.fun:
Grafana
  PromQL: Query
    StoreAPI: (Store)
    StoreAPI: Thanos Sidecar
      Upload: (Bucket)
      Fetch Blocks: Prometheus
        Scrape: (Targets)

[Store] Store Gateway
  [Bucket] query: Object Storage

[Compact] Compact
  Downsampling & Retention: (Bucket)

[Targets] Exporters & Scrape Targets

  3. Receiver

receiver

// Copy and paste in https://flowchart.fun:
Grafana
  PromQL: Query
    StoreAPI: (Store)
    [Receiver] StoreAPI: Thanos Receiver
      Upload: (Bucket)

[Store] Store Gateway
  [Bucket] query: Object Storage

[Compact] Compact
  Downsampling & Retention: (Bucket)

Scrape: Exporters & Scrape Targets
  Prometheus
    Remote Write: (Receiver)

  4. Multi-Tenant

multitenant

// Copy and paste in https://flowchart.fun:
Grafana
  PromQL: Query
    StoreAPI: (Store)
    [Receiver] StoreAPI: Thanos Receiver
      Upload: (Bucket)

[Store] Store Gateway
  [Bucket] query: Object Storage

[Compact] Compact
  Downsampling & Retention: (Bucket)

Scrape: Exporters & Scrape Targets
  Prometheus
    Remote Write: Auth Proxy
      Write with Auth Headers: (Receiver)

Scrape: Exporters & Scrape Targets
  Prometheus
    Remote Write: Auth Proxy
      Write with Auth Headers: (Receiver)

Scrape: Exporters & Scrape Targets
  Prometheus
    Remote Write: Auth Proxy
      Write with Auth Headers: (Receiver)
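
To make the "Write with Auth Headers" edge concrete: a minimal sketch of the per-tenant Prometheus remote_write config, assuming the auth proxy forwards to a Thanos receiver that reads the tenant from the THANOS-TENANT header (the proxy URL and tenant ID are placeholders):

# Sketch: Prometheus remote_write through the auth proxy (values assumed)
remote_write:
  - url: http://<auth-proxy-host>/api/v1/receive
    headers:
      THANOS-TENANT: <tenant-id>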
vishal-biyani commented 3 years ago

After studying the blueprints for the various use cases above, let's go one level deeper into each component and what it can do. Let's use the diagram below and talk about each of the components.

IMG_20210513_104544__01

Terminology:

Sending Data from Prometheus to Object Storage

The first component is about sending data from a Prometheus instance to object storage, and there are two possibilities here: one is using the sidecar (pull model) and the other is using the receiver (push model). The user has to choose one of the two options based on their needs and the limitations of the underlying infrastructure. The pros and cons of both models are discussed in this document. We will discuss both of these in detail in their respective sections.

Sidecar

The Thanos Sidecar runs as a sidecar container in the Prometheus pod (the recommended Prometheus version is 2.2.1 or above). It mainly does three things:

- exposes Prometheus data to queriers over the StoreAPI
- uploads Prometheus TSDB blocks to object storage
- watches and reloads the Prometheus configuration when it changes

We should cover the first two, and later on add support for reloading the config as Krius development progresses.
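
For reference, a minimal sketch of the injected sidecar container in the Prometheus pod spec (the image tag, paths, and ports are assumptions, not Krius's actual output):

- name: thanos-sidecar
  image: quay.io/thanos/thanos:v0.20.2
  args:
    - sidecar
    - --tsdb.path=/prometheus                        # volume shared with the Prometheus container
    - --prometheus.url=http://localhost:9090
    - --objstore.config-file=/etc/thanos/bucket.yaml
    - --grpc-address=0.0.0.0:10901                   # StoreAPI endpoint for the querier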

One important thing: the Thanos querier component should be able to reach the endpoint exposed by the Thanos sidecar. This means that if the querier is in a different cluster/VM/network, the sidecar's endpoint needs to be exposed outside the source cluster somehow.
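
On Kubernetes, one way to do that is a LoadBalancer (or NodePort) Service in front of the sidecar's gRPC port; a sketch with assumed names and labels:

apiVersion: v1
kind: Service
metadata:
  name: thanos-sidecar-grpc
  namespace: monitoring
spec:
  type: LoadBalancer          # or NodePort, depending on the infrastructure
  selector:
    app: prometheus           # assumed label on the Prometheus pod
  ports:
    - name: grpc
      port: 10901
      targetPort: 10901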

Receiver

The receiver runs in a separate pod (in the same or a different cluster) and Prometheus remote-writes to it. The API exposed to the querier is the same StoreAPI.
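
The Prometheus side then only needs a remote_write entry; a minimal sketch, assuming the receiver listens for remote write on port 10908 as in the spec drafts below:

remote_write:
  - url: http://<thanos-receive-host>:10908/api/v1/receive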

From Krius's POV, this means:

[ ] Do we need to have as many receive components as Prometheus instances?

Querier

The querier simply queries data from multiple sources, such as S3 buckets and sidecar/receiver endpoints, and gives you the results of the query. In the first iteration, we simply plumb together the S3 buckets and the sidecar/receiver endpoints! For details, check out: https://thanos.io/tip/components/query.md/

The querier has a bunch of strategies that we can support in later versions of Krius; details: https://thanos.io/tip/components/query.md/#query-api-overview
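
A sketch of the querier's container args wiring these together (the endpoints are placeholders; the flags are from the Thanos query docs linked above):

args:
  - query
  - --http-address=0.0.0.0:9090
  - --store=<sidecar-endpoint>:10901          # store filtering targets
  - --store=<store-gateway-endpoint>:10901
  - --query.replica-label=replica             # deduplication
  - --query.auto-downsampling                 # auto-downsampling
  - --query.partial-response                  # partial response strategy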

Querier Frontend

The query frontend is a layer on top of the querier that improves the read path through query splitting and caching. The cache can be in-memory or Memcached. Krius should support installing Memcached if that is the chosen option.
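
For the Memcached case, the query frontend takes a response cache config along these lines (passed via --query-range.response-cache-config; the address is an assumption):

type: MEMCACHED
config:
  addresses: ["memcached.monitoring.svc.cluster.local:11211"]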

Compactor


cluster1:       ## This is the name of the KubeConfig cluster, as that is what is used to connect
  prometheus:
    name: prom1
    install: true
      name:           # if install is false then this is used to point to an existing install
      namespace:
    remote:
      mode:  receive/sidecar    # Only one mode at a time
        receiveReference: # Reference to one of the receivers if the mode is receive
      s3config: name  # A pointer to one of the s3 configs at end
cluster2:
  prometheus:
    name: prom2
    install: false
    remote:
      mode: sidecar # Since Sidecar is mentioned - there is no need for receiveReference
      s3config: name  # A pointer to one of the s3 configs
cluster3:
  thanos:
    name: thanos-ag1
    querier:
      name: 
      targets:      # https://thanos.io/tip/components/query.md/#store-filtering 
      dedup-enabled: true/false  # https://thanos.io/tip/components/query.md/#deduplication-enabled 
      autoDownSample:   # https://thanos.io/tip/components/query.md/#auto-downsampling 
      partial_response: true/false    # https://thanos.io/tip/components/query.md/#partial-response-strategy 
    querier-fe:
      name: 
      cacheOption: in-memory/memcached # One of them only
        memcached-options: ## Only if it is defined as memcached above
          key1: value1
    compactor:  # Need to define parameters based on https://thanos.io/tip/components/compact.md/
      name:

  grafana:
    name: 
    setup: true/false
      name:
      namespace:
s3configslist:
  - name: abc
    type: s3
    config:
      bucket: ""
      endpoint: "s3.<region-name>.amazonaws.com"
      access_key: ""
      secret_key: ""
    bucketweb:
      enabled: true
  - name: xyz

@hr1sh1kesh @kanuahs need your review on this one. @PrasadG193 Can you please help @YachikaRalhan with the YAML syntax and validation - I have done a first draft but it is very rough and not exact YAML IMHO!

YachikaRalhan commented 3 years ago

Corrected the YAML syntax, validated it, and added more Thanos components to the clusters:

cluster1: ## This is the name of the KubeConfig cluster, as that is what is used to connect
  prometheus:
    install: true  # if install is false then this is used to point to an existing install name & namespace
    name: prom1  
    namespace: default
    mode: receiver # receiveReference is required in receiver mode
    receiveReference: http://<thanos-receive-container-ip>:10908 # The URL of the endpoint to send samples to.
    objStoreConfig: bucket.yaml  # Storage configuration for uploading data
cluster2:
  prometheus:
    name: prom2 #prometheus URL
    install: false
    mode: sidecar # Since Sidecar is mentioned - there is no need for receiveReference
    objStoreConfig: bucket.yaml  # Storage configuration for uploading data
cluster3:
  thanos:
    name: thanos-ag1
    querier:
      name: testing
      targets: testing    # https://thanos.io/tip/components/query.md/#store-filtering 
      dedup-enabled: true/false  # https://thanos.io/tip/components/query.md/#deduplication-enabled 
      autoDownSample: testing   # https://thanos.io/tip/components/query.md/#auto-downsampling 
      partial_response: true/false    # https://thanos.io/tip/components/query.md/#partial-response-strategy 
    querier-fe:
      name: testing
      cacheOption: in-memory/memcached # One of them only
      memcached-options: ## Only if it is defined as memcached above
        enabled: true
        key1: value1
    receiver:
      name: test
      httpPort: <port> # not required
      httpNodePort: <port> # not required
      remoteWritePort: <port> # not required
      remoteWriteNodePort: <port> # not required
    compactor:  # Need to define parameters based on https://thanos.io/tip/components/compact.md/
      name: test
    ruler:
      alertmanagers:
        - http://kube-prometheus-alertmanager.monitoring.svc.cluster.local:9093
      config: |-
        groups:
          - name: "metamonitoring"
            rules:
              - alert: "PrometheusDown"
                expr: absent(up{prometheus="monitoring/kube-prometheus"})

  grafana:
    name: testing
    setup: 
      enabled: true
      name: testing
      namespace: default
objStoreConfigslist:
  - name: abc
    type: s3
    config:
      bucket: ""
      endpoint: "s3.<region-name>.amazonaws.com"
      access_key: ""
      secret_key: ""
    bucketweb:
      enabled: true
  - name: xyz
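
For reference, the bucket.yaml referenced by objStoreConfig above would follow Thanos's object store config format; a minimal S3 sketch with placeholder values:

type: S3
config:
  bucket: "<bucket-name>"
  endpoint: "s3.<region-name>.amazonaws.com"
  access_key: "<access-key>"
  secret_key: "<secret-key>"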
hr1sh1kesh commented 3 years ago

@YachikaRalhan, @vishal-biyani: so I was thinking cluster1 and cluster2 would typically both want either a receiver or a sidecar, and it won't be a hybrid like what we have in the example. Maybe, with that in view, we should move the mode above the cluster stanza.

I am referring specifically to this line, which is within the cluster stanza: mode: receiver # receiveReference is required in receiver mode

What do you think?

vishal-biyani commented 3 years ago

@hr1sh1kesh Technically it is possible that one Prom cluster is using sidecar mode and another Prom cluster is using receive mode, no?

hr1sh1kesh commented 3 years ago

True, in theory it is possible. But then it's not really a deployment pattern, in my opinion, where you have one cluster remote-writing its metrics while another cluster just has a sidecar. @vishal-biyani

vishal-biyani commented 3 years ago

Fair enough - so in the interest of future flexibility and keeping the option open, I would say let's keep the mode at the Prom config level. Also, cluster1 here is a K8s cluster, BTW. You are right that currently there is no known deployment pattern to that effect.

hr1sh1kesh commented 3 years ago

On behalf of @YachikaRalhan

Unmarshaling the current config file according to the designed spec is becoming quite complicated in Golang, as the keys are different for each cluster and we would need to access the data dynamically (nested map[string]interface{}). So maybe we were doing something wrong in the YAML spec.
So I updated the config file -
---
clusters:
- name: cluster1
  type: prometheus
  data:
    install: true  
    name: prom1  
    namespace: default
    mode: receiver 
    receiveReference: http://<thanos-receive-container-ip>:10908 
    objStoreConfig: bucket.yaml  
- name: cluster3
  type: thanos
  data:
    name: thanos-ag1
    querier:
      name: testing...
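
With that list shape, the common keys unmarshal into a typed struct and only the component-specific payload stays dynamic; a minimal Go sketch (assuming gopkg.in/yaml.v2 and the field names from the spec above):

package main

import (
	"fmt"
	"io/ioutil"
	"log"

	"gopkg.in/yaml.v2"
)

// Cluster keeps the common keys typed; the component-specific
// config stays a dynamic map.
type Cluster struct {
	Name string                 `yaml:"name"`
	Type string                 `yaml:"type"` // "prometheus" or "thanos"
	Data map[string]interface{} `yaml:"data"`
}

type Config struct {
	Clusters []Cluster `yaml:"clusters"`
}

func main() {
	raw, err := ioutil.ReadFile("config.yaml")
	if err != nil {
		log.Fatal(err)
	}
	var cfg Config
	if err := yaml.Unmarshal(raw, &cfg); err != nil {
		log.Fatal(err)
	}
	for _, c := range cfg.Clusters {
		fmt.Printf("cluster %q -> type %s\n", c.Name, c.Type)
	}
}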