thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

saturation of pvc thanos receive #6759

Open BalighRezgui opened 1 year ago

BalighRezgui commented 1 year ago

Hello!

I have a Prometheus instance configured to remote-write its metrics to Thanos Receive, which stores them in an S3 bucket that is served by Thanos Store. Everything works and I can see the metrics, but the Thanos Receive PVC keeps filling up even though the metrics are already stored in the S3 bucket.

The path that fills up on the Thanos Receive pods:

/var/thanos/receive/57236109-c0e3-4b53-8700-30232dbb073d

Do you have a solution for this?

What you expected to happen:

The metrics are stored in the bucket without the Thanos Receive PVC filling up.

Full logs to relevant components:

    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                  - thanos-receive
                - key: app.kubernetes.io/instance
                  operator: In
                  values:
                  - observatorium-xyz
              namespaces:
              - observatorium
              topologyKey: kubernetes.io/hostname
            weight: 100
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                  - thanos-receive
                - key: app.kubernetes.io/instance
                  operator: In
                  values:
                  - observatorium-xyz
              namespaces:
              - observatorium
              topologyKey: topology.kubernetes.io/zone
            weight: 100
      containers:
      - args:
        - receive
        - --log.level=info
        - --log.format=logfmt
        - --grpc-address=0.0.0.0:10901
        - --http-address=0.0.0.0:10902
        - --remote-write.address=0.0.0.0:19291
        - --receive.replication-factor=1
        - --tsdb.path=/var/thanos/receive
        - --tsdb.retention=4d
        - --label=replica="$(NAME)"
        - --label=receive="true"
        - --objstore.config=$(OBJSTORE_CONFIG)
        - --receive.local-endpoint=$(NAME).observatorium-xyz-thanos-receive-default.$(NAMESPACE).svc.cluster.local:10901
        - --receive.hashrings-file=/var/lib/thanos-receive/hashrings.json
        env:
        - name: NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: HOST_IP_ADDRESS
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: OBJSTORE_CONFIG
          valueFrom:
            secretKeyRef:
              key: thanos.yaml
              name: thanos-objectstorage
        image: quay.io/thanos/thanos:v0.24.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            path: /-/healthy
            port: 10902
            scheme: HTTP
          periodSeconds: 30
        name: thanos-receive
        ports:
        - containerPort: 10901
          name: grpc
        - containerPort: 10902
          name: http
        - containerPort: 19291
          name: remote-write
        readinessProbe:
          failureThreshold: 20
          httpGet:
            path: /-/ready
            port: 10902
            scheme: HTTP
          periodSeconds: 5
        resources: {}
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /var/thanos/receive
          name: data
          readOnly: false
        - mountPath: /var/lib/thanos-receive
          name: hashring-config
      nodeSelector:
        kubernetes.io/os: linux
      securityContext: {}
      serviceAccountName: observatorium-xyz-thanos-receive
      terminationGracePeriodSeconds: 900
      volumes:
      - configMap:
          name: observatorium-xyz-thanos-receive-controller-tenants-generated
        name: hashring-config
  volumeClaimTemplates:
  - metadata:
      labels:
        app.kubernetes.io/component: database-write-hashring
        app.kubernetes.io/instance: observatorium-xyz
        app.kubernetes.io/name: thanos-receive
        app.kubernetes.io/part-of: observatorium
        controller.receive.thanos.io/hashring: default
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi

    [
      {
        "hashring": "default",
        "tenants": []
      }
    ]

    spec:
      containers:
        - name: prometheus
          image: quay.io/prometheus/prometheus:latest
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: monitoring-msp-ca
              mountPath: /etc/prometheus/ca
            - name: prometheus-client-certs
              mountPath: /etc/prometheus/certs
            - name: prometheus-storage-volume
              mountPath: /prometheus/

    remote_write:
    - url: https://observatorium-xyz-observatorium-api.msp-monitoring-stack.svc.cluster.local:8080/api/metrics/v1/default/api/v1/receive
      remote_timeout: 30s
      tls_config:
        ca_file: /etc/prometheus/ca/tls.crt
        cert_file: /etc/prometheus/certs/tls.crt
        key_file: /etc/prometheus/certs/tls.key

MichaHoffmann commented 1 year ago

How is the Prometheus receiver configured? You can probably lower the TSDB retention to save on storage space.
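
A minimal sketch of that suggestion, assuming the thanos-receive container args from the StatefulSet posted in the issue. Only the `--tsdb.retention` value changes. The value `1d` is an illustrative assumption, not a figure given in the thread. The right value depends on ingestion volume and on how much recent data should stay on local disk. Uploading blocks to the bucket does not by itself free local space, because the local TSDB keeps data for the configured retention period.

```yaml
# Fragment of the thanos-receive container spec; all other args stay as posted.
containers:
- name: thanos-receive
  args:
  - receive
  - --tsdb.path=/var/thanos/receive
  # Was 4d in the posted manifest. 1d is a hypothetical value for illustration:
  # local data older than this is deleted from the PVC, while uploaded blocks
  # remain available in the object storage bucket.
  - --tsdb.retention=1d
  - --objstore.config=$(OBJSTORE_CONFIG)
```

If the PVC still fills up with a shorter retention, the 50Gi request in volumeClaimTemplates may simply be too small for the ingestion rate, which cannot be judged from the manifests alone.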