grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0
23.79k stars 3.43k forks source link

querier crash on midnight! #7541

Closed supercodershot closed 1 year ago

supercodershot commented 2 years ago

I installed loki stack with helm: grafana/loki-distributed I found querier pod always didn't work in the exactly time(crash on new day),So I must restart querier deploy every day . What happened? 未命名1666924946 未命名1666924976 未命名1666925006

dannykopping commented 2 years ago

@supercodershot hi, please tell us which version of Loki & the chart you are using, and provide your config

DylanGuedes commented 2 years ago

thanks for reporting it.

is the crash OOM or a random panic? could you share more details? I feel like you're seeing the issues at midnight because... you received the queries at midnight :smile: from the error messages, this looks more like a misconfiguration between queriers and the frontend than a bug per se, but lets confirm first

supercodershot commented 2 years ago

@dannykopping Hi, Helm Charts version I used is "loki-distributed-0.56.5" and APP version is 2.6.1. I‘ll paste values.yaml below. @DylanGuedes I have checked OOM and some random panic, I found nothing . and I have made a dashboard with grafana to check logs in loki.So I found loki querier always crash on midnight by checking querier pod's log. I'm so confused on what loki is doing at midnight....some story ? loki helm values :

  image:
    # -- Overrides the Docker registry globally for all images
    registry: null
  # -- Overrides the priorityClassName for all pods
  priorityClassName: null
  # -- configures cluster domain ("cluster.local" by default)
  clusterDomain: "cluster.local"
  # -- configures DNS service name
  dnsService: "kube-dns"
  # -- configures DNS service namespace
  dnsNamespace: "kube-system"

# -- Overrides the chart's name
nameOverride: null

# -- Overrides the chart's computed fullname
fullnameOverride: null

# -- Image pull secrets for Docker images
imagePullSecrets: []

loki:
  # -- If set, these annotations are added to all of the Kubernetes controllers
  # (Deployments, StatefulSets, etc) that this chart launches. Use this to
  # implement something like the "Wave" controller or another controller that
  # is monitoring top level deployment resources.
  annotations: {}
  # Configures the readiness probe for all of the Loki pods
  readinessProbe:
    httpGet:
      path: /ready
      port: http
    initialDelaySeconds: 30
    timeoutSeconds: 1
  livenessProbe:
    httpGet:
      path: /ready
      port: http
    initialDelaySeconds: 300
  image:
    # -- The Docker registry
    registry: docker.io
    # -- Docker image repository
    repository: grafana/loki
    # -- Overrides the image tag whose default is the chart's appVersion
    tag: null
    # -- Docker image pull policy
    pullPolicy: IfNotPresent
  # -- Common labels for all pods
  podLabels: {}
  # -- Common annotations for all pods
  podAnnotations: {}
  # -- Common command override for all pods (except gateway)
  command: null
  # -- The number of old ReplicaSets to retain to allow rollback
  revisionHistoryLimit: 10
  # -- The SecurityContext for Loki pods
  podSecurityContext:
    fsGroup: 10001
    runAsGroup: 10001
    runAsNonRoot: true
    runAsUser: 10001
  # -- The SecurityContext for Loki containers
  containerSecurityContext:
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
  # -- Specify an existing secret containing loki configuration. If non-empty, overrides `loki.config`
  existingSecretForConfig: ""
  # -- Adds the appProtocol field to the memberlist service. This allows memberlist to work with istio protocol selection. Ex: "http" or "tcp"
  appProtocol: ""
  # -- Common annotations for all loki services
  serviceAnnotations: {}
  # -- Config file contents for Loki
  # @default -- See values.yaml
  config: |
    auth_enabled: false

    server:
      http_listen_port: 3100

    distributor:
      ring:
        kvstore:
          store: memberlist

    memberlist:
      join_members:
        - {{ include "loki.fullname" . }}-memberlist

    ingester:
      lifecycler:
        ring:
          kvstore:
            store: memberlist
          replication_factor: 1
      chunk_idle_period: 30m
      chunk_block_size: 262144
      chunk_encoding: snappy
      chunk_retain_period: 1m
      max_transfer_retries: 0
      wal:
        dir: /var/loki/wal

    limits_config:
      enforce_metric_name: false
      reject_old_samples: true
      reject_old_samples_max_age: 168h
      max_cache_freshness_per_query: 10m
      split_queries_by_interval: 15m

    {{- if .Values.loki.schemaConfig}}
    schema_config:
    {{- toYaml .Values.loki.schemaConfig | nindent 2}}
    {{- end}}
    {{- if .Values.loki.storageConfig}}
    storage_config:
    {{- if .Values.indexGateway.enabled}}
    {{- $indexGatewayClient := dict "server_address" (printf "dns:///%s:9095" (include "loki.indexGatewayFullname" .)) }}
    {{- $_ := set .Values.loki.storageConfig.boltdb_shipper "index_gateway_client" $indexGatewayClient }}
    {{- end}}
    {{- toYaml .Values.loki.storageConfig | nindent 2}}
    {{- end}}

    chunk_store_config:
      max_look_back_period: 0s

    table_manager:
      retention_deletes_enabled: false
      retention_period: 0s

    query_range:
      align_queries_with_step: true
      max_retries: 5
      cache_results: true
      results_cache:
        cache:
          enable_fifocache: true
          fifocache:
            max_size_items: 1024
            ttl: 24h

    frontend_worker:
      {{- if .Values.queryScheduler.enabled }}
      scheduler_address: {{ include "loki.querySchedulerFullname" . }}:9095
      {{- else }}
      frontend_address: {{ include "loki.queryFrontendFullname" . }}:9095
      {{- end }}

    frontend:
      log_queries_longer_than: 5s
      compress_responses: true
      {{- if .Values.queryScheduler.enabled }}
      scheduler_address: {{ include "loki.querySchedulerFullname" . }}:9095
      {{- end }}
      tail_proxy_url: http://{{ include "loki.querierFullname" . }}:3100

    compactor:
      shared_store: filesystem

    ruler:
      storage:
        type: local
        local:
          directory: /etc/loki/rules
      ring:
        kvstore:
          store: memberlist
      rule_path: /tmp/loki/scratch
      alertmanager_url: http://alertmanager-main.monitoring:9093
      external_url: https://alertmanager.xx

  # -- Check https://grafana.com/docs/loki/latest/configuration/#schema_config for more info on how to configure schemas
  schemaConfig:
    configs:
    - from: 2020-09-07
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: loki_index_
        period: 24h

  # -- Check https://grafana.com/docs/loki/latest/configuration/#storage_config for more info on how to configure storages
  storageConfig:
    boltdb_shipper:
      shared_store: s3
      active_index_directory: /var/loki/index
      cache_location: /var/loki/cache
      cache_ttl: 168h
        # filesystem:
        #   directory: /var/loki/chunks
    aws:
      s3: http://admin:admin@secrets.com.:9000/cosee-pro
      s3forcepathstyle: true
# -- Uncomment to configure each storage individually
#   azure: {}
#   gcs: {}
#   s3: {}
#   boltdb: {}

  # -- Structured loki configuration, takes precedence over `loki.config`, `loki.schemaConfig`, `loki.storageConfig`
  structuredConfig: {}
serviceAccount:
  # -- Specifies whether a ServiceAccount should be created
  create: true
  # -- The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name: null
  # -- Image pull secrets for the service account
  imagePullSecrets: []
  # -- Annotations for the service account
  annotations: {}
  # -- Set this toggle to false to opt out of automounting API credentials for the service account
  automountServiceAccountToken: true

# RBAC configuration
rbac:
  # -- If pspEnabled true, a PodSecurityPolicy is created for K8s that use psp.
  pspEnabled: false
  # -- For OpenShift set pspEnabled to 'false' and sccEnabled to 'true' to use the SecurityContextConstraints.
  sccEnabled: false

# ServiceMonitor configuration
serviceMonitor:
  # -- If enabled, ServiceMonitor resources for Prometheus Operator are created
  enabled: false
  # -- Alternative namespace for ServiceMonitor resources
  namespace: null
  # -- Namespace selector for ServiceMonitor resources
  namespaceSelector: {}
  # -- ServiceMonitor annotations
  annotations: {}
  # -- Additional ServiceMonitor labels
  labels: {}
  # -- ServiceMonitor scrape interval
  interval: null
  # -- ServiceMonitor scrape timeout in Go duration format (e.g. 15s)
  scrapeTimeout: null
  # -- ServiceMonitor relabel configs to apply to samples before scraping
  # https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#relabelconfig
  relabelings: []
  # -- ServiceMonitor metric relabel configs to apply to samples before ingestion
  # https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#endpoint
  metricRelabelings: []
  # --ServiceMonitor will add labels from the service to the Prometheus metric
  # https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#servicemonitorspec
  targetLabels: []
  # -- ServiceMonitor will use http by default, but you can pick https as well
  scheme: http
  # -- ServiceMonitor will use these tlsConfig settings to make the health check requests
  tlsConfig: null

# Rules for the Prometheus Operator
prometheusRule:
  # -- If enabled, a PrometheusRule resource for Prometheus Operator is created
  enabled: false
  # -- Alternative namespace for the PrometheusRule resource
  namespace: null
  # -- PrometheusRule annotations
  annotations: {}
  # -- Additional PrometheusRule labels
  labels: {}
  # -- Contents of Prometheus rules file
  groups: []
  # - name: loki-rules
  #   rules:
  #     - record: job:loki_request_duration_seconds_bucket:sum_rate
  #       expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job)
  #     - record: job_route:loki_request_duration_seconds_bucket:sum_rate
  #       expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route)
  #     - record: node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate
  #       expr: sum(rate(container_cpu_usage_seconds_total[1m])) by (node, namespace, pod, container)

# Configuration for the ingester
ingester:
  # -- Kind of deployment [StatefulSet/Deployment]
  kind: StatefulSet
  # -- Number of replicas for the ingester
  replicas: 1
  image:
    # -- The Docker registry for the ingester image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the ingester image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the ingester image. Overrides `loki.image.tag`
    tag: null
  # -- Command to execute instead of defined in Docker image
  command: null
  # -- The name of the PriorityClass for ingester pods
  priorityClassName: null
  # -- Labels for ingester pods
  podLabels: {}
  # -- Annotations for ingester pods
  podAnnotations: {}
  # -- Labels for ingestor service
  serviceLabels: {}
  # -- Additional CLI args for the ingester
  extraArgs: []
  # -- Environment variables to add to the ingester pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the ingester pods
  extraEnvFrom: []
  # -- Volume mounts to add to the ingester pods
  extraVolumeMounts: []
  # -- Volumes to add to the ingester pods
  extraVolumes: []
  # -- Resource requests and limits for the ingester
  resources: {}
  # -- Containers to add to the ingester pods
  extraContainers: []
  # -- Grace period to allow the ingester to shutdown before it is killed. Especially for the ingestor,
  # this must be increased. It must be long enough so ingesters can be gracefully shutdown flushing/transferring
  # all data and to successfully leave the member ring on shutdown.
  terminationGracePeriodSeconds: 300
  # -- topologySpread for ingester pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Defaults to allow skew no more then 1 node per AZ
  topologySpreadConstraints: |
    - maxSkew: 1
      topologyKey: failure-domain.beta.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          {{- include "loki.ingesterSelectorLabels" . | nindent 6 }}
  # -- Affinity for ingester pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.ingesterSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.ingesterSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for ingester pods
  nodeSelector: {}
  # -- Tolerations for ingester pods
  tolerations: []
  # -- readiness probe settings for ingester pods. If empty, use `loki.readinessProbe`
  readinessProbe: {}
  # -- liveness probe settings for ingester pods. If empty use `loki.livenessProbe`
  livenessProbe: {}
  persistence:
    # -- Enable creating PVCs which is required when using boltdb-shipper
    enabled: false
    # -- Use emptyDir with ramdisk for storage. **Please note that all data in ingester will be lost on pod restart**
    inMemory: false
    # -- Size of persistent or memory disk
    size: 10Gi
    # -- Storage class to be used.
    # If defined, storageClassName: <storageClass>.
    # If set to "-", storageClassName: "", which disables dynamic provisioning.
    # If empty or set to null, no storageClassName spec is
    # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
    storageClass: null
  # -- Adds the appProtocol field to the ingester service. This allows ingester to work with istio protocol selection.
  appProtocol:
    # -- Set the optional grpc service protocol. Ex: "grpc", "http2" or "https"
    grpc: ""

# Configuration for the distributor
distributor:
  # -- Number of replicas for the distributor
  replicas: 1
  autoscaling:
    # -- Enable autoscaling for the distributor
    enabled: false
    # -- Minimum autoscaling replicas for the distributor
    minReplicas: 1
    # -- Maximum autoscaling replicas for the distributor
    maxReplicas: 3
    # -- Target CPU utilisation percentage for the distributor
    targetCPUUtilizationPercentage: 60
    # -- Target memory utilisation percentage for the distributor
    targetMemoryUtilizationPercentage:
  image:
    # -- The Docker registry for the distributor image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the distributor image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the distributor image. Overrides `loki.image.tag`
    tag: null
  # -- Command to execute instead of defined in Docker image
  command: null
  # -- The name of the PriorityClass for distributor pods
  priorityClassName: null
  # -- Labels for distributor pods
  podLabels: {}
  # -- Annotations for distributor pods
  podAnnotations: {}
  # -- Labels for distributor service
  serviceLabels: {}
  # -- Additional CLI args for the distributor
  extraArgs: []
  # -- Environment variables to add to the distributor pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the distributor pods
  extraEnvFrom: []
  # -- Volume mounts to add to the distributor pods
  extraVolumeMounts: []
  # -- Volumes to add to the distributor pods
  extraVolumes: []
  # -- Resource requests and limits for the distributor
  resources: {}
  # -- Containers to add to the distributor pods
  extraContainers: []
  # -- Grace period to allow the distributor to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for distributor pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.distributorSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.distributorSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for distributor pods
  nodeSelector: {}
  # -- Tolerations for distributor pods
  tolerations: []
    # -- Adds the appProtocol field to the distributor service. This allows distributor to work with istio protocol selection.
  appProtocol:
    # -- Set the optional grpc service protocol. Ex: "grpc", "http2" or "https"
    grpc: ""

# Configuration for the querier
querier:
  # -- Number of replicas for the querier
  replicas: 1
  autoscaling:
    # -- Enable autoscaling for the querier, this is only used if `queryIndex.enabled: true`
    enabled: false
    # -- Minimum autoscaling replicas for the querier
    minReplicas: 1
    # -- Maximum autoscaling replicas for the querier
    maxReplicas: 3
    # -- Target CPU utilisation percentage for the querier
    targetCPUUtilizationPercentage: 60
    # -- Target memory utilisation percentage for the querier
    targetMemoryUtilizationPercentage:
  image:
    # -- The Docker registry for the querier image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the querier image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the querier image. Overrides `loki.image.tag`
    tag: null
  # -- Command to execute instead of defined in Docker image
  command: null
  # -- The name of the PriorityClass for querier pods
  priorityClassName: null
  # -- Labels for querier pods
  podLabels: {}
  # -- Annotations for querier pods
  podAnnotations: {}
  # -- Labels for querier service
  serviceLabels: {}
  # -- Additional CLI args for the querier
  extraArgs: []
  # -- Environment variables to add to the querier pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the querier pods
  extraEnvFrom: []
  # -- Volume mounts to add to the querier pods
  extraVolumeMounts: []
  # -- Volumes to add to the querier pods
  extraVolumes: []
  # -- Resource requests and limits for the querier
  resources: 
   requests:
     cpu: 200m
     memory: 500Mi
   limits: 
    cpu: 1
    memory: 2Gi
  #  -- Containers to add to the querier pods
  extraContainers: []
  # -- Grace period to allow the querier to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for querier pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.querierSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.querierSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for querier pods
  nodeSelector: {}
  # -- Tolerations for querier pods
  tolerations: []
  # -- DNSConfig for querier pods
  dnsConfig: {}
  persistence:
    # -- Enable creating PVCs for the querier cache
    enabled: false
    # -- Size of persistent disk
    size: 10Gi
    # -- Storage class to be used.
    # If defined, storageClassName: <storageClass>.
    # If set to "-", storageClassName: "", which disables dynamic provisioning.
    # If empty or set to null, no storageClassName spec is
    # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
    storageClass: null
  # -- Adds the appProtocol field to the querier service. This allows querier to work with istio protocol selection.
  appProtocol:
    # -- Set the optional grpc service protocol. Ex: "grpc", "http2" or "https"
    grpc: ""

# Configuration for the query-frontend
queryFrontend:
  # -- Number of replicas for the query-frontend
  replicas: 1
  autoscaling:
    # -- Enable autoscaling for the query-frontend
    enabled: false
    # -- Minimum autoscaling replicas for the query-frontend
    minReplicas: 1
    # -- Maximum autoscaling replicas for the query-frontend
    maxReplicas: 3
    # -- Target CPU utilisation percentage for the query-frontend
    targetCPUUtilizationPercentage: 60
    # -- Target memory utilisation percentage for the query-frontend
    targetMemoryUtilizationPercentage:
  image:
    # -- The Docker registry for the query-frontend image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the query-frontend image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the query-frontend image. Overrides `loki.image.tag`
    tag: null
  # -- Command to execute instead of defined in Docker image
  command: null
  # -- The name of the PriorityClass for query-frontend pods
  priorityClassName: null
  # -- Labels for query-frontend pods
  podLabels: {}
  # -- Annotations for query-frontend pods
  podAnnotations: {}
  # -- Labels for query-frontend service
  serviceLabels: {}
  # -- Additional CLI args for the query-frontend
  extraArgs: []
  # -- Environment variables to add to the query-frontend pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the query-frontend pods
  extraEnvFrom: []
  # -- Volume mounts to add to the query-frontend pods
  extraVolumeMounts: []
  # -- Volumes to add to the query-frontend pods
  extraVolumes: []
  # -- Resource requests and limits for the query-frontend
  resources: {}
  # -- Containers to add to the query-frontend pods
  extraContainers: []
  # -- Grace period to allow the query-frontend to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for query-frontend pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.queryFrontendSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.queryFrontendSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for query-frontend pods
  nodeSelector: {}
  # -- Tolerations for query-frontend pods
  tolerations: []
  # -- Adds the appProtocol field to the queryFrontend service. This allows queryFrontend to work with istio protocol selection.
  appProtocol:
    # -- Set the optional grpc service protocol. Ex: "grpc", "http2" or "https"
    grpc: ""

# Configuration for the query-scheduler
queryScheduler:
  # -- Specifies whether the query-scheduler should be decoupled from the query-frontend
  enabled: false
  # -- Number of replicas for the query-scheduler.
  # It should be lower than `-querier.max-concurrent` to avoid generating back-pressure in queriers;
  # it's also recommended that this value evenly divides the latter
  replicas: 1
  image:
    # -- The Docker registry for the query-scheduler image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the query-scheduler image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the query-scheduler image. Overrides `loki.image.tag`
    tag: null
  # -- The name of the PriorityClass for query-scheduler pods
  priorityClassName: null
  # -- Labels for query-scheduler pods
  podLabels: {}
  # -- Annotations for query-scheduler pods
  podAnnotations: {}
  # -- Labels for query-scheduler service
  serviceLabels: {}
  # -- Additional CLI args for the query-scheduler
  extraArgs: []
  # -- Environment variables to add to the query-scheduler pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the query-scheduler pods
  extraEnvFrom: []
  # -- Volume mounts to add to the query-scheduler pods
  extraVolumeMounts: []
  # -- Volumes to add to the query-scheduler pods
  extraVolumes: []
  # -- Resource requests and limits for the query-scheduler
  resources: {}
  # -- Containers to add to the query-scheduler pods
  extraContainers: []
  # -- Grace period to allow the query-scheduler to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for query-scheduler pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.querySchedulerSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.querySchedulerSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Node selector for query-scheduler pods
  nodeSelector: {}
  # -- Tolerations for query-scheduler pods
  tolerations: []

# Configuration for the table-manager
tableManager:
  # -- Specifies whether the table-manager should be enabled
  enabled: false
  image:
    # -- The Docker registry for the table-manager image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the table-manager image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the table-manager image. Overrides `loki.image.tag`
    tag: null
  # -- Command to execute instead of defined in Docker image
  command: null
  # -- The name of the PriorityClass for table-manager pods
  priorityClassName: null
  # -- Labels for table-manager pods
  podLabels: {}
  # -- Annotations for table-manager pods
  podAnnotations: {}
  # -- Labels for table-manager service
  serviceLabels: {}
  # -- Additional CLI args for the table-manager
  extraArgs: []
  # -- Environment variables to add to the table-manager pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the table-manager pods
  extraEnvFrom: []
  # -- Volume mounts to add to the table-manager pods
  extraVolumeMounts: []
  # -- Volumes to add to the table-manager pods
  extraVolumes: []
  # -- Resource requests and limits for the table-manager
  resources: {}
  # -- Containers to add to the table-manager pods
  extraContainers: []
  # -- Grace period to allow the table-manager to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for table-manager pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.tableManagerSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.tableManagerSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Node selector for table-manager pods
  nodeSelector: {}
  # -- Tolerations for table-manager pods
  tolerations: []

# Use either this ingress or the gateway, but not both at once.
# If you enable this, make sure to disable the gateway.
# You'll need to supply authn configuration for your ingress controller.
ingress:
  enabled: false
#  ingressClassName: nginx
  annotations: {}
#    nginx.ingress.kubernetes.io/auth-type: basic
#    nginx.ingress.kubernetes.io/auth-secret: loki-distributed-basic-auth
#    nginx.ingress.kubernetes.io/auth-secret-type: auth-map
#    nginx.ingress.kubernetes.io/configuration-snippet: |
#      proxy_set_header X-Scope-OrgID $remote_user;
  paths:
    distributor:
      - /api/prom/push
      - /loki/api/v1/push
    querier:
      - /api/prom/tail
      - /loki/api/v1/tail
    query-frontend:
      - /loki/api
    ruler:
      - /api/prom/rules
      - /loki/api/v1/rules
      - /prometheus/api/v1/rules
      - /prometheus/api/v1/alerts
  hosts:
    - loki.example.com
#  tls:
#    - hosts:
#       - loki.example.com
#      secretName: loki-distributed-tls

# Configuration for the gateway
gateway:
  # -- Specifies whether the gateway should be enabled
  enabled: true
  # -- Number of replicas for the gateway
  replicas: 1
  # -- Enable logging of 2xx and 3xx HTTP requests
  verboseLogging: true
  autoscaling:
    # -- Enable autoscaling for the gateway
    enabled: false
    # -- Minimum autoscaling replicas for the gateway
    minReplicas: 1
    # -- Maximum autoscaling replicas for the gateway
    maxReplicas: 3
    # -- Target CPU utilisation percentage for the gateway
    targetCPUUtilizationPercentage: 60
    # -- Target memory utilisation percentage for the gateway
    targetMemoryUtilizationPercentage:
  # -- See `kubectl explain deployment.spec.strategy` for more,
  # ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
  deploymentStrategy:
    type: RollingUpdate
  image:
    # -- The Docker registry for the gateway image
    registry: docker.io
    # -- The gateway image repository
    repository: nginxinc/nginx-unprivileged
    # -- The gateway image tag
    tag: 1.19-alpine
    # -- The gateway image pull policy
    pullPolicy: IfNotPresent
  # -- The name of the PriorityClass for gateway pods
  priorityClassName: null
  # -- Labels for gateway pods
  podLabels: {}
  # -- Annotations for gateway pods
  podAnnotations: {}
  # -- Additional CLI args for the gateway
  extraArgs: []
  # -- Environment variables to add to the gateway pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the gateway pods
  extraEnvFrom: []
  # -- Volumes to add to the gateway pods
  extraVolumes: []
  # -- Volume mounts to add to the gateway pods
  extraVolumeMounts: []
  # -- The SecurityContext for gateway containers
  podSecurityContext:
    fsGroup: 101
    runAsGroup: 101
    runAsNonRoot: true
    runAsUser: 101
  # -- The SecurityContext for gateway containers
  containerSecurityContext:
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
  # -- Resource requests and limits for the gateway
  resources: {}
  # -- Containers to add to the gateway pods
  extraContainers: []
  # -- Grace period to allow the gateway to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for gateway pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.gatewaySelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.gatewaySelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for gateway pods
  nodeSelector: {}
  # -- Tolerations for gateway pods
  tolerations: []
  # -- DNSConfig for gateway pods
  dnsConfig: {}
  # Gateway service configuration
  service:
    # -- Port of the gateway service
    port: 80
    # -- Type of the gateway service
    type: ClusterIP
    # -- ClusterIP of the gateway service
    clusterIP: null
    # -- Node port if service type is NodePort
    nodePort: null
    # -- Load balancer IPO address if service type is LoadBalancer
    loadBalancerIP: null
    # -- Load balancer allow traffic from CIDR list if service type is LoadBalancer
    loadBalancerSourceRanges: []
    # -- Annotations for the gateway service
    annotations: {}
    # -- Labels for gateway service
    labels: {}
  # Gateway ingress configuration
  ingress:
    # -- Specifies whether an ingress for the gateway should be created
    enabled: false
    # -- Ingress Class Name. MAY be required for Kubernetes versions >= 1.18
    # For example: `ingressClassName: nginx`
    ingressClassName: ''

    # -- Annotations for the gateway ingress
    annotations: {}
    # -- Hosts configuration for the gateway ingress
    hosts:
      - host: gateway.loki.example.com
        paths:
          - path: /
            # -- pathType (e.g. ImplementationSpecific, Prefix, .. etc.) might also be required by some Ingress Controllers
            # pathType: Prefix
    # -- TLS configuration for the gateway ingress
    tls:
      - secretName: loki-gateway-tls
        hosts:
          - gateway.loki.example.com
  # Basic auth configuration
  basicAuth:
    # -- Enables basic authentication for the gateway
    enabled: false
    # -- The basic auth username for the gateway
    username: null
    # -- The basic auth password for the gateway
    password: null
    # -- Uses the specified username and password to compute a htpasswd using Sprig's `htpasswd` function.
    # The value is templated using `tpl`. Override this to use a custom htpasswd, e.g. in case the default causes
    # high CPU load.
    # @default -- See values.yaml
    htpasswd: >-
      {{ htpasswd (required "'gateway.basicAuth.username' is required" .Values.gateway.basicAuth.username) (required "'gateway.basicAuth.password' is required" .Values.gateway.basicAuth.password) }}
    # -- Existing basic auth secret to use. Must contain '.htpasswd'
    existingSecret: null
  # Configures the readiness probe for the gateway
  readinessProbe:
    httpGet:
      path: /
      port: http
    initialDelaySeconds: 15
    timeoutSeconds: 1
  livenessProbe:
    httpGet:
      path: /
      port: http
    initialDelaySeconds: 30
  nginxConfig:
    # -- NGINX log format
    # @default -- See values.yaml
    logFormat: |-
      main '$remote_addr - $remote_user [$time_local]  $status '
              '"$request" $body_bytes_sent "$http_referer" '
              '"$http_user_agent" "$http_x_forwarded_for"';
    # -- Allows appending custom configuration to the server block
    serverSnippet: ""
    # -- Allows appending custom configuration to the http block
    httpSnippet: ""
    # -- Allows overriding the DNS resolver address nginx will use.
    resolver: ""
    # -- Config file contents for Nginx. Passed through the `tpl` function to allow templating
    # @default -- See values.yaml
    file: |
      worker_processes  5;  ## Default: 1
      error_log  /dev/stderr;
      pid        /tmp/nginx.pid;
      worker_rlimit_nofile 8192;

      events {
        worker_connections  4096;  ## Default: 1024
      }

      http {
        client_body_temp_path /tmp/client_temp;
        proxy_temp_path       /tmp/proxy_temp_path;
        fastcgi_temp_path     /tmp/fastcgi_temp;
        uwsgi_temp_path       /tmp/uwsgi_temp;
        scgi_temp_path        /tmp/scgi_temp;

        proxy_http_version    1.1;

        default_type application/octet-stream;
        log_format   {{ .Values.gateway.nginxConfig.logFormat }}

        {{- if .Values.gateway.verboseLogging }}
        access_log   /dev/stderr  main;
        {{- else }}

        map $status $loggable {
          ~^[23]  0;
          default 1;
        }
        access_log   /dev/stderr  main  if=$loggable;
        {{- end }}

        sendfile     on;
        tcp_nopush   on;
        {{- if .Values.gateway.nginxConfig.resolver }}
        resolver {{ .Values.gateway.nginxConfig.resolver }};
        {{- else }}
        resolver {{ .Values.global.dnsService }}.{{ .Values.global.dnsNamespace }}.svc.{{ .Values.global.clusterDomain }};
        {{- end }}

        {{- with .Values.gateway.nginxConfig.httpSnippet }}
        {{ . | nindent 2 }}
        {{- end }}

        server {
          listen             8080;

          {{- if .Values.gateway.basicAuth.enabled }}
          auth_basic           "Loki";
          auth_basic_user_file /etc/nginx/secrets/.htpasswd;
          {{- end }}

          location = / {
            return 200 'OK';
            auth_basic off;
          }

          location = /api/prom/push {
            set $api_prom_push_backend http://{{ include "loki.distributorFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }};
            proxy_pass       $api_prom_push_backend:3100$request_uri;
            proxy_http_version 1.1;
          }

          location = /api/prom/tail {
            set $api_prom_tail_backend http://{{ include "loki.querierFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }};
            proxy_pass       $api_prom_tail_backend:3100$request_uri;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_http_version 1.1;
            proxy_read_timeout 300;
            proxy_connect_timeout 300;
            proxy_send_timeout 300;
          }

          # Ruler
          location ~ /prometheus/api/v1/alerts.* {
            proxy_pass       http://{{ include "loki.rulerFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }}:3100$request_uri;
          }
          location ~ /prometheus/api/v1/rules.* {
            proxy_pass       http://{{ include "loki.rulerFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }}:3100$request_uri;
          }
          location ~ /api/prom/rules.* {
            proxy_pass       http://{{ include "loki.rulerFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }}:3100$request_uri;
          }
          location ~ /api/prom/alerts.* {
            proxy_pass       http://{{ include "loki.rulerFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }}:3100$request_uri;
          }

          location ~ /api/prom/.* {
            set $api_prom_backend http://{{ include "loki.queryFrontendFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }};
            proxy_pass       $api_prom_backend:3100$request_uri;
            proxy_http_version 1.1;
            proxy_read_timeout 300;
            proxy_connect_timeout 300;
            proxy_send_timeout 300;
          }

          location = /loki/api/v1/push {
            set $loki_api_v1_push_backend http://{{ include "loki.distributorFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }};
            proxy_pass       $loki_api_v1_push_backend:3100$request_uri;
            proxy_http_version 1.1;
          }

          location = /loki/api/v1/tail {
            set $loki_api_v1_tail_backend http://{{ include "loki.querierFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }};
            proxy_pass       $loki_api_v1_tail_backend:3100$request_uri;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_http_version 1.1;
            proxy_read_timeout 300;
            proxy_connect_timeout 300;
            proxy_send_timeout 300;
          }

          location ~ /loki/api/.* {
            set $loki_api_backend http://{{ include "loki.queryFrontendFullname" . }}.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }};
            proxy_pass       $loki_api_backend:3100$request_uri;
            proxy_http_version 1.1;
            proxy_read_timeout 300;
            proxy_connect_timeout 300;
            proxy_send_timeout 300;
          }

          {{- with .Values.gateway.nginxConfig.serverSnippet }}
          {{ . | nindent 4 }}
          {{- end }}
        }
      }

# Configuration for the compactor
compactor:
  # -- Specifies whether compactor should be enabled
  enabled: false
  image:
    # -- The Docker registry for the compactor image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the compactor image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the compactor image. Overrides `loki.image.tag`
    tag: null
  # -- Command to execute instead of defined in Docker image
  command: null
  # -- The name of the PriorityClass for compactor pods
  priorityClassName: null
  # -- Labels for compactor pods
  podLabels: {}
  # -- Annotations for compactor pods
  podAnnotations: {}
  # -- Specify the compactor affinity
  affinity: {}
  # -- Labels for compactor service
  serviceLabels: {}
  # -- Additional CLI args for the compactor
  extraArgs: []
  # -- Environment variables to add to the compactor pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the compactor pods
  extraEnvFrom: []
  # -- Volume mounts to add to the compactor pods
  extraVolumeMounts: []
  # -- Volumes to add to the compactor pods
  extraVolumes: []
  # -- Resource requests and limits for the compactor
  resources: {}
  # -- Containers to add to the compactor pods
  extraContainers: []
  # -- Grace period to allow the compactor to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Node selector for compactor pods
  nodeSelector: {}
  # -- Tolerations for compactor pods
  tolerations: []
  persistence:
    # -- Enable creating PVCs for the compactor
    enabled: false
    # -- Size of persistent disk
    size: 10Gi
    # -- Storage class to be used.
    # If defined, storageClassName: <storageClass>.
    # If set to "-", storageClassName: "", which disables dynamic provisioning.
    # If empty or set to null, no storageClassName spec is
    # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
    storageClass: null
  serviceAccount:
    create: false
    # -- The name of the ServiceAccount to use for the compactor.
    # If not set and create is true, a name is generated by appending
    # "-compactor" to the common ServiceAccount.
    name: null
    # -- Image pull secrets for the compactor service account
    imagePullSecrets: []
    # -- Annotations for the compactor service account
    annotations: {}
    # -- Set this toggle to false to opt out of automounting API credentials for the service account
    automountServiceAccountToken: true

# Configuration for the ruler
ruler:
  # -- Specifies whether the ruler should be enabled
  enabled: true
  # -- Kind of deployment [StatefulSet/Deployment]
  kind: Deployment
  # -- Number of replicas for the ruler
  replicas: 1
  image:
    # -- The Docker registry for the ruler image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the ruler image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the ruler image. Overrides `loki.image.tag`
    tag: null
  # -- Command to execute instead of defined in Docker image
  command: null
  # -- The name of the PriorityClass for ruler pods
  priorityClassName: null
  # -- Labels for compactor pods
  podLabels: {}
  # -- Annotations for ruler pods
  podAnnotations: {}
  # -- Labels for ruler service
  serviceLabels: {}
  # -- Additional CLI args for the ruler
  extraArgs: []
  # -- Environment variables to add to the ruler pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the ruler pods
  extraEnvFrom: []
  # -- Volume mounts to add to the ruler pods
  extraVolumeMounts: []
  # -- Volumes to add to the ruler pods
  extraVolumes: []
  # -- Resource requests and limits for the ruler
  resources: {}
  # -- Containers to add to the ruler pods
  extraContainers: []
  # -- Grace period to allow the ruler to shutdown before it is killed
  terminationGracePeriodSeconds: 300
  # -- Affinity for ruler pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.rulerSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.rulerSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for ruler pods
  nodeSelector: {}
  # -- Tolerations for ruler pods
  tolerations: []
  # -- DNSConfig for ruler pods
  dnsConfig: {}
  persistence:
    # -- Enable creating PVCs which is required when using recording rules
    enabled: false
    # -- Size of persistent disk
    size: 10Gi
    # -- Storage class to be used.
    # If defined, storageClassName: <storageClass>.
    # If set to "-", storageClassName: "", which disables dynamic provisioning.
    # If empty or set to null, no storageClassName spec is
    # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
    storageClass: null
  # -- Directories containing rules files
  directories:
    cosee:
      log.txt: |
        groups:
          - name: test-undefined
            rules:
              - alert: CoseeSomeError
                expr: |
                  count_over_time ({container!="cos-bus",namespace="cosee"} |= "log" [5m] ) < 0
                for: 1m
                labels:
                  severity: critical
                annotations:
                  summary: Cosee some Error
                  description:  Cosee some error test 
    # tenant_foo:
    #   rules1.txt: |
    #     groups:
    #       - name: should_fire
    #         rules:
    #           - alert: HighPercentageError
    #             expr: |
    #               sum(rate({app="foo", env="production"} |= "error" [5m])) by (job)
    #                 /
    #               sum(rate({app="foo", env="production"}[5m])) by (job)
    #                 > 0.05
    #             for: 10m
    #             labels:
    #               severity: warning
    #             annotations:
    #               summary: High error rate
    #       - name: credentials_leak
    #         rules:
    #           - alert: http-credentials-leaked
    #             annotations:
    #               message: "{{ $labels.job }} is leaking http basic auth credentials."
    #             expr: 'sum by (cluster, job, pod) (count_over_time({namespace="prod"} |~ "http(s?)://(\\w+):(\\w+)@" [5m]) > 0)'
    #             for: 10m
    #             labels:
    #               severity: critical
    #   rules2.txt: |
    #     groups:
    #       - name: example
    #         rules:
    #         - alert: HighThroughputLogStreams
    #           expr: sum by(container) (rate({job=~"loki-dev/.*"}[1m])) > 1000
    #           for: 2m
    # tenant_bar:
    #   rules1.txt: |
    #     groups:
    #       - name: should_fire
    #         rules:
    #           - alert: HighPercentageError
    #             expr: |
    #               sum(rate({app="foo", env="production"} |= "error" [5m])) by (job)
    #                 /
    #               sum(rate({app="foo", env="production"}[5m])) by (job)
    #                 > 0.05
    #             for: 10m
    #             labels:
    #               severity: warning
    #             annotations:
    #               summary: High error rate
    #       - name: credentials_leak
    #         rules:
    #           - alert: http-credentials-leaked
    #             annotations:
    #               message: "{{ $labels.job }} is leaking http basic auth credentials."
    #             expr: 'sum by (cluster, job, pod) (count_over_time({namespace="prod"} |~ "http(s?)://(\\w+):(\\w+)@" [5m]) > 0)'
    #             for: 10m
    #             labels:
    #               severity: critical
    #   rules2.txt: |
    #     groups:
    #       - name: example
    #         rules:
    #         - alert: HighThroughputLogStreams
    #           expr: sum by(container) (rate({job=~"loki-dev/.*"}[1m])) > 1000
    #           for: 2m

# Configuration for the index-gateway
indexGateway:
  # -- Specifies whether the index-gateway should be enabled
  enabled: false
  # -- Number of replicas for the index-gateway
  replicas: 1
  image:
    # -- The Docker registry for the index-gateway image. Overrides `loki.image.registry`
    registry: null
    # -- Docker image repository for the index-gateway image. Overrides `loki.image.repository`
    repository: null
    # -- Docker image tag for the index-gateway image. Overrides `loki.image.tag`
    tag: null
  # -- The name of the PriorityClass for index-gateway pods
  priorityClassName: null
  # -- Labels for index-gateway pods
  podLabels: {}
  # -- Annotations for index-gateway pods
  podAnnotations: {}
  # -- Labels for index-gateway service
  serviceLabels: {}
  # -- Additional CLI args for the index-gateway
  extraArgs: []
  # -- Environment variables to add to the index-gateway pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to the index-gateway pods
  extraEnvFrom: []
  # -- Volume mounts to add to the index-gateway pods
  extraVolumeMounts: []
  # -- Volumes to add to the index-gateway pods
  extraVolumes: []
  # -- Resource requests and limits for the index-gateway
  resources: {}
  # -- Containers to add to the index-gateway pods
  extraContainers: []
  # -- Grace period to allow the index-gateway to shutdown before it is killed.
  terminationGracePeriodSeconds: 300
  # -- Affinity for index-gateway pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.indexGatewaySelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.indexGatewaySelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for index-gateway pods
  nodeSelector: {}
  # -- Tolerations for index-gateway pods
  tolerations: []
  persistence:
    # -- Enable creating PVCs which is required when using boltdb-shipper
    enabled: false
    # -- Use emptyDir with ramdisk for storage. **Please note that all data in indexGateway will be lost on pod restart**
    inMemory: false
    # -- Size of persistent or memory disk
    size: 10Gi
    # -- Storage class to be used.
    # If defined, storageClassName: <storageClass>.
    # If set to "-", storageClassName: "", which disables dynamic provisioning.
    # If empty or set to null, no storageClassName spec is
    # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
    storageClass: null

memcached:
  readinessProbe:
    tcpSocket:
      port: http
    initialDelaySeconds: 5
    timeoutSeconds: 1
  livenessProbe:
    tcpSocket:
      port: http
    initialDelaySeconds: 10
  image:
    # -- The Docker registry for the memcached
    registry: docker.io
    # -- Memcached Docker image repository
    repository: memcached
    # -- Memcached Docker image tag
    tag: 1.6.7-alpine
    # -- Memcached Docker image pull policy
    pullPolicy: IfNotPresent
  # -- Labels for memcached pods
  podLabels: {}
  # -- The SecurityContext for memcached pods
  podSecurityContext:
    fsGroup: 11211
    runAsGroup: 11211
    runAsNonRoot: true
    runAsUser: 11211
  # -- The SecurityContext for memcached containers
  containerSecurityContext:
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false
  # -- Common annotations for all memcached services
  serviceAnnotations: {}
  # -- Adds the appProtocol field to the memcached services. This allows memcached to work with istio protocol selection. Ex: "http" or "tcp"
  appProtocol: ""

memcachedExporter:
  # -- Specifies whether the Memcached Exporter should be enabled
  enabled: false
  image:
    # -- The Docker registry for the Memcached Exporter
    registry: docker.io
    # -- Memcached Exporter Docker image repository
    repository: prom/memcached-exporter
    # -- Memcached Exporter Docker image tag
    tag: v0.6.0
    # -- Memcached Exporter Docker image pull policy
    pullPolicy: IfNotPresent
  # -- Labels for memcached-exporter pods
  podLabels: {}
  # -- Memcached Exporter resource requests and limits
  resources: {}
  # -- The SecurityContext for memcachedExporter containers
  containerSecurityContext:
    readOnlyRootFilesystem: true
    capabilities:
      drop:
        - ALL
    allowPrivilegeEscalation: false

memcachedChunks:
  # -- Specifies whether the Memcached chunks cache should be enabled
  enabled: false
  # -- Number of replicas for memcached-chunks
  replicas: 1
  # -- The name of the PriorityClass for memcached-chunks pods
  priorityClassName: null
  # -- Labels for memcached-chunks pods
  podLabels: {}
  # -- Annotations for memcached-chunks pods
  podAnnotations: {}
  # -- Labels for memcached-chunks service
  serviceLabels: {}
  # -- Additional CLI args for memcached-chunks
  extraArgs:
    - -I 32m
  # -- Environment variables to add to memcached-chunks pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to memcached-chunks pods
  extraEnvFrom: []
  # -- Resource requests and limits for memcached-chunks
  resources: {}
  # -- Containers to add to the memcached-chunks pods
  extraContainers: []
  # -- Grace period to allow memcached-chunks to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for memcached-chunks pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.memcachedChunksSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.memcachedChunksSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for memcached-chunks pods
  nodeSelector: {}
  # -- Tolerations for memcached-chunks pods
  tolerations: []

memcachedFrontend:
  # -- Specifies whether the Memcached frontend cache should be enabled
  enabled: false
  # -- Number of replicas for memcached-frontend
  replicas: 1
  # -- The name of the PriorityClass for memcached-frontend pods
  priorityClassName: null
  # -- Labels for memcached-frontend pods
  podLabels: {}
  # -- Annotations for memcached-frontend pods
  podAnnotations: {}
  # -- Labels for memcached-frontend service
  serviceLabels: {}
  # -- Additional CLI args for memcached-frontend
  extraArgs:
    - -I 32m
  # -- Environment variables to add to memcached-frontend pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to memcached-frontend pods
  extraEnvFrom: []
  # -- Resource requests and limits for memcached-frontend
  resources: {}
  # -- Containers to add to the memcached-frontend pods
  extraContainers: []
  # -- Grace period to allow memcached-frontend to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for memcached-frontend pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.memcachedFrontendSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.memcachedFrontendSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Node selector for memcached-frontend pods
  nodeSelector: {}
  # -- Tolerations for memcached-frontend pods
  tolerations: []

memcachedIndexQueries:
  # -- Specifies whether the Memcached index queries cache should be enabled
  enabled: false
  # -- Number of replicas for memcached-index-queries
  replicas: 1
  # -- The name of the PriorityClass for memcached-index-queries pods
  priorityClassName: null
  # -- Labels for memcached-index-queries pods
  podLabels: {}
  # -- Annotations for memcached-index-queries pods
  podAnnotations: {}
  # -- Labels for memcached-index-queries service
  serviceLabels: {}
  # -- Additional CLI args for memcached-index-queries
  extraArgs:
    - -I 32m
  # -- Environment variables to add to memcached-index-queries pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to memcached-index-queries pods
  extraEnvFrom: []
  # -- Resource requests and limits for memcached-index-queries
  resources: {}
  # -- Containers to add to the memcached-index-queries pods
  extraContainers: []
  # -- Grace period to allow memcached-index-queries to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for memcached-index-queries pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.memcachedIndexQueriesSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.memcachedIndexQueriesSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for memcached-index-queries pods
  nodeSelector: {}
  # -- Tolerations for memcached-index-queries pods
  tolerations: []

memcachedIndexWrites:
  # -- Specifies whether the Memcached index writes cache should be enabled
  enabled: false
  # -- Number of replicas for memcached-index-writes
  replicas: 1
  # -- The name of the PriorityClass for memcached-index-writes pods
  priorityClassName: null
  # -- Labels for memcached-index-writes pods
  podLabels: {}
  # -- Annotations for memcached-index-writes pods
  podAnnotations: {}
  # -- Labels for memcached-index-writes service
  serviceLabels: {}
  # -- Additional CLI args for memcached-index-writes
  extraArgs:
    - -I 32m
  # -- Environment variables to add to memcached-index-writes pods
  extraEnv: []
  # -- Environment variables from secrets or configmaps to add to memcached-index-writes pods
  extraEnvFrom: []
  # -- Resource requests and limits for memcached-index-writes
  resources: {}
  # -- Containers to add to the memcached-index-writes pods
  extraContainers: []
  # -- Grace period to allow memcached-index-writes to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  # -- Affinity for memcached-index-writes pods. Passed through `tpl` and, thus, to be configured as string
  # @default -- Hard node and soft zone anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              {{- include "loki.memcachedIndexWritesSelectorLabels" . | nindent 10 }}
          topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                {{- include "loki.memcachedIndexWritesSelectorLabels" . | nindent 12 }}
            topologyKey: failure-domain.beta.kubernetes.io/zone
  # -- Pod Disruption Budget maxUnavailable
  maxUnavailable: null
  # -- Node selector for memcached-index-writes pods
  nodeSelector: {}
  # -- Tolerations for memcached-index-writes pods
  tolerations: []

networkPolicy:
  # -- Specifies whether Network Policies should be created
  enabled: false
  metrics:
    # -- Specifies the Pods which are allowed to access the metrics port.
    # As this is cross-namespace communication, you also need the namespaceSelector.
    podSelector: {}
    # -- Specifies the namespaces which are allowed to access the metrics port
    namespaceSelector: {}
    # -- Specifies specific network CIDRs which are allowed to access the metrics port.
    # In case you use namespaceSelector, you also have to specify your kubelet networks here.
    # The metrics ports are also used for probes.
    cidrs: []
  ingress:
    # -- Specifies the Pods which are allowed to access the http port.
    # As this is cross-namespace communication, you also need the namespaceSelector.
    podSelector: {}
    # -- Specifies the namespaces which are allowed to access the http port
    namespaceSelector: {}
  alertmanager:
    # -- Specify the alertmanager port used for alerting
    port: 9093
    # -- Specifies the alertmanager Pods.
    # As this is cross-namespace communication, you also need the namespaceSelector.
    podSelector: {}
    # -- Specifies the namespace the alertmanager is running in
    namespaceSelector: {}
  externalStorage:
    # -- Specify the port used for external storage, e.g. AWS S3
    ports: []
    # -- Specifies specific network CIDRs you want to limit access to
    cidrs: []
  discovery:
    # -- Specify the port used for discovery
    port: null
    # -- Specifies the Pods labels used for discovery.
    # As this is cross-namespace communication, you also need the namespaceSelector.
    podSelector: {}
    # -- Specifies the namespace the discovery Pods are running in
    namespaceSelector: {}
supercodershot commented 2 years ago

I checked loki stack pods and minio's time zone ,they are all the same : Mon, 31 Oct 2022 01:48:16 +0000

broferek commented 1 year ago

I have the same problem. Every day at midnight, it crashes. I run Loki 2.6.1 in a Podman container.

Both Loki and Minio containers are set up in UTC time zone as well. However, the server on which the containers run was set to CET. I have just tried to change the timezone of the server to UTC.

I'll let you know in the coming days if it improves the situation.

DylanGuedes commented 1 year ago

check compactor logs too

supercodershot commented 1 year ago

@DylanGuedes just these pods ,which logs should check? 未命名1668071537

DylanGuedes commented 1 year ago

ohh you're not running a compactor (compactor: enabled: false on the helmchart), so no worries. It could only be the culprit if you were running one.

supercodershot commented 1 year ago

@DylanGuedes Thanks , should I enable compactor? or plus replicas to 2 or more ?

supercodershot commented 1 year ago

@broferek How is it goging ?

broferek commented 1 year ago

@supercodershot it didn't crash this night. I'll review it again on Monday and let you know.

broferek commented 1 year ago

It crashed again this night so unfortunately it is not the solution to our problem.

DylanGuedes commented 1 year ago

It crashed again this night so unfortunately it is not the solution to our problem.

didn't you find anything useful in the logs? even running with log-level=debug? does all queriers crash or just a few of them? did they crash for running OOM or for a panic? do they crash at midnight UTC or in your local timezone?

dannykopping commented 1 year ago

Closing this issue because it's been over 2 months with no response. We can reopen if this is still an issue for you @broferek