open-telemetry / opentelemetry-operator

Kubernetes Operator for OpenTelemetry Collector
Apache License 2.0

Unable to configure servicemonitorselector under prometheusCR under targetallocator section #1907

Closed: krimeshshah closed this issue 7 months ago

krimeshshah commented 1 year ago

Component(s)

receiver/prometheus

Describe the issue you're reporting

Since scraping all Prometheus CR ServiceMonitor endpoints without any selection consumes too much memory and drops data, we decided to scrape only selected ServiceMonitors using the serviceMonitorSelector option provided by the OpenTelemetry CRD from the OpenTelemetry Operator, which we are using here in kube-otel-stack: https://github.com/lightstep/otel-collector-charts/blob/main/charts/kube-otel-stack/values.yaml#L146

I read that we can limit the number of ServiceMonitors we scrape for metrics using "serviceMonitorSelector" (https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-operator/crds/crd-opentelemetrycollector.yaml#L2039), which I believe can reduce memory consumption since it reduces the number of scrape endpoints. As described in the OpenTelemetryCollector CRD, this parameter is of type string, but at the same time it is documented as a key/value pair. According to kube-prometheus-stack I should configure it like this: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml#L2975. But when I try to configure multiple ServiceMonitors the same way, with matchLabels and their key/value pairs, the kube-otel-stack deployment fails. Hence I tried to configure it as below:

    ## Default collector for metrics (includes infrastructure metrics)
    metricsCollector:
      name: metrics
      clusterName: dev
      image: otel/opentelemetry-collector-contrib:0.73.0
      enabled: true
      mode: statefulset
      replicas: 6
      targetallocator:
        enabled: true
        allocationStrategy: "consistent-hashing"
        filterStrategy: relabel-config
        replicas: 2
        image: ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:0.73.0
        prometheusCR:
          enabled: true
          serviceMonitorSelector:
            matchLabels:
              "lightstep-metrics-collector"

But the thing is, I can only specify one ServiceMonitor this way. How can I configure multiple ServiceMonitors?

github-actions[bot] commented 1 year ago

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

swiatekm commented 1 year ago

I believe this issue belongs in https://github.com/open-telemetry/opentelemetry-operator. The prometheus receiver doesn't interact with ServiceMonitors in any way - in the otel ecosystem, the Target Allocator does this.

krimeshshah commented 1 year ago

@swiatekm-sumo Technically yes, the target allocator is deployed when the OpenTelemetryCollector CR is created and targetallocator is enabled. In this case we are setting TA enabled: true in kube-otel-stack, and prometheusCR is part of the TA section in kube-otel-stack: https://github.com/lightstep/otel-collector-charts/blob/main/charts/kube-otel-stack/values.yaml#L141

Also, here is my values.yaml file:


```yaml
namespaceOverride: ""
nameOverride: ""
operatorNamespaceOverride: ""

## Auto-Instrumentation resource to be installed in the cluster
## Can be used by setting the following:
##  Java: instrumentation.opentelemetry.io/inject-java: "true"
##  NodeJS: instrumentation.opentelemetry.io/inject-nodejs: "true"
##  Python: instrumentation.opentelemetry.io/inject-python: "true"
##  DotNet: instrumentation.opentelemetry.io/inject-dotnet: "true"
##  OpenTelemetry SDK environment variables only: instrumentation.opentelemetry.io/inject-sdk: "true"
autoinstrumentation:
  enabled: false
  ## The collector name to send traces to
  collectorTarget: traces
  propagators:
    - tracecontext
    - baggage
    - b3

  ## Sampler defines the OTEL sampler behavior to be used. Example:
  ##
  ## sampler:
  ##   type: parentbased_traceidratio
  ##   argument: "0.25"
  ##
  sampler:
    ## The value can be for instance parentbased_always_on, parentbased_always_off, parentbased_traceidratio...
    type: parentbased_traceidratio
    ## The value depends on the sampler type.
    ## For instance for parentbased_traceidratio sampler type it is a number in range [0..1] e.g. 0.25.
    argument: "0.25"

  ## A list of corev1.EnvVars
  env: []

  ## https://github.com/open-telemetry/opentelemetry-specification/blob/v1.8.0/specification/overview.md#resources
  resource: {}

## Custom collectors to be installed to the cluster
## Matches the structure of tracesCollector and metricsCollector
collectors: []

## Default collector for tracing
tracesCollector:
  enabled: false
  name: traces
  clusterName: ""
  image: otel/opentelemetry-collector-contrib:0.73.0
  mode: deployment
  replicas: 1
  resources:
    limits:
      cpu: 250m
      memory: 250Mi
    requests:
      cpu: 250m
      memory: 250Mi
  env:
    - name: LS_TOKEN
      valueFrom:
        secretKeyRef:
          key: LS_TOKEN
          name: otel-collector-secret
  config:
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: "0.0.0.0:4317"
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 30
      resourcedetection/env:
        detectors: [env]
        timeout: 2s
        override: false
      batch:
        send_batch_size: 1000
        timeout: 1s
        send_batch_max_size: 1500
      k8sattributes:
        passthrough: false
        pod_association:
          - sources:
              - from: resource_attribute
                name: k8s.pod.name
        extract:
          metadata:
            - k8s.namespace.name
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.node.name
            - k8s.pod.start_time
            - k8s.deployment.name
            - k8s.replicaset.name
            - k8s.replicaset.uid
            - k8s.daemonset.name
            - k8s.daemonset.uid
            - k8s.job.name
            - k8s.job.uid
            - k8s.cronjob.name
            - k8s.statefulset.name
            - k8s.statefulset.uid
            - container.image.tag
            - container.image.name
      resource:
        attributes:
        - key: lightstep.helm_chart
          value: kube-otel-stack
          action: insert
        - key: collector.name
          value: "${KUBE_POD_NAME}"
          action: insert

    exporters:
      otlp:
        endpoint: ingest.lightstep.com:443
        headers:
          "lightstep-access-token": "${LS_TOKEN}"

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, resource, resourcedetection/env, k8sattributes, batch]
          exporters: [otlp]

## Default collector for metrics (includes infrastructure metrics)
metricsCollector:
  name: metrics
  clusterName: dev
  image: otel/opentelemetry-collector-contrib:0.73.0
  enabled: true
  mode: statefulset
  replicas: 6
  targetallocator:
    enabled: true
    allocationStrategy: "consistent-hashing"
    filterStrategy: relabel-config
    replicas: 2
    image: ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:0.73.0
    prometheusCR:
      enabled: true
      serviceMonitorSelector:
        matchLabels:
          "lightstep-metrics-collector"
         #"translation-service-monitor"
          # "document-translation"
          # "nmt-engines-monitor"
          # "document-translation-servicemonitor"
          # "lightstep-operator"
          # "lightstep-metrics-collector"
      #   matchLabels:
      #     "app": "kube-prometheus-stack-prometheus"
      # #     "release: monitoring-dev-lp"
      # #     "k8s-app: machine-translation"
      # #     "name: translation-service-monitor"
      #     "app: document-translation"
      #     "app.kubernetes.io/component: document-translation"
      #     "app.kubernetes.io/name: document-translation"
      #     "component: engine"
      #     "k8s-app: machine-translation"

  # No need for a scrape config when using prometheusCRs
  scrape_configs_file: "scrape_configs.yaml"
  resources:
    limits:
      cpu: 250m
      memory: 500Mi
    requests:
      cpu: 250m
      memory: 500Mi
  env:
    - name: LS_TOKEN
      valueFrom:
        secretKeyRef:
          key: LS_TOKEN
          name: otel-collector-secret
  config:
    extensions:
      health_check:
        endpoint: "0.0.0.0:13133"
        path: "/"
        check_collector_pipeline:
          enabled: false
          interval: "5m"
          exporter_failure_threshold: 5
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: "0.0.0.0:4317"
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 30
      metricstransform/k8sservicename:
        transforms:
          - include: kube_service_info
            match_type: strict
            action: update
            operations:
              - action: update_label
                label: service
                new_label: k8s.service.name
      resourcedetection/env:
        detectors: [env]
        timeout: 2s
        override: false
      k8sattributes:
        passthrough: false
        pod_association:
          - sources:
              - from: resource_attribute
                name: k8s.pod.name
        extract:
          metadata:
            - k8s.namespace.name
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.node.name
            - k8s.pod.start_time
            - k8s.deployment.name
            - k8s.replicaset.name
            - k8s.replicaset.uid
            - k8s.daemonset.name
            - k8s.daemonset.uid
            - k8s.job.name
            - k8s.job.uid
            - k8s.cronjob.name
            - k8s.statefulset.name
            - k8s.statefulset.uid
            - container.image.tag
            - container.image.name
      batch:
        send_batch_size: 1000
        timeout: 1s
        send_batch_max_size: 1500
      resource:
        attributes:
        - key: lightstep.helm_chart
          value: kube-otel-stack
          action: insert
        - key: collector.name
          value: "${KUBE_POD_NAME}"
          action: insert
        - key: job
          from_attribute: service.name
          action: insert

    exporters:
      otlp:
        # endpoint: ingest.lightstep.com:443
          headers:
          #"lightstep-access-token": "${LS_TOKEN}"
            # headers:
        #   "lightstep-access-token": "${LS_TOKEN}"

    service:
      extensions:
        - health_check
      pipelines:
        metrics:
          receivers: [prometheus, otlp]
          processors: [memory_limiter, resource, resourcedetection/env, k8sattributes, metricstransform/k8sservicename, batch]
          exporters: [otlp]

## Component scraping the kube api server
##
kubeApiServer:
  enabled: true
  tlsConfig:
    serverName: kubernetes
    insecureSkipVerify: false
  serviceMonitor:
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""
    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""

    jobLabel: component
    selector:
      matchLabels:
        component: apiserver
        provider: kubernetes

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings:
      # Drop excessively noisy apiserver buckets.
      - action: drop
        regex: apiserver_request_duration_seconds_bucket;(0.15|0.2|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2|3|3.5|4|4.5|6|7|8|9|15|25|40|50)
        sourceLabels:
          - __name__
          - le
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    relabelings: []
    # - sourceLabels:
    #     - __meta_kubernetes_namespace
    #     - __meta_kubernetes_service_name
    #     - __meta_kubernetes_endpoint_port_name
    #   action: keep
    #   regex: default;kubernetes;https
    # - targetLabel: __address__
    #   replacement: kubernetes.default.svc:443

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar

## Component scraping the kubelet and kubelet-hosted cAdvisor
##
kubelet:
  enabled: false
  namespace: kube-system

  serviceMonitor:
    enable: false
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""

    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""

    ## Enable scraping the kubelet over https. For requirements to enable this see
    ## https://github.com/prometheus-operator/prometheus-operator/issues/926
    ##
    https: true

    ## Enable scraping /metrics/cadvisor from kubelet's service
    ##
    cAdvisor: true

    ## Enable scraping /metrics/probes from kubelet's service
    ##
    probes: true

    ## Enable scraping /metrics/resource from kubelet's service
    ## This is disabled by default because container metrics are already exposed by cAdvisor
    ##
    resource: false
    # From kubernetes 1.18, /metrics/resource/v1alpha1 renamed to /metrics/resource
    resourcePath: "/metrics/resource/v1alpha1"

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    cAdvisorMetricRelabelings:
      # Drop less useful container CPU metrics.
      - sourceLabels: [__name__]
        action: drop
        regex: 'container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)'
      # Drop less useful container / always zero filesystem metrics.
      - sourceLabels: [__name__]
        action: drop
        regex: 'container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)'
      # Drop less useful / always zero container memory metrics.
      - sourceLabels: [__name__]
        action: drop
        regex: 'container_memory_(mapped_file|swap)'
      # Drop less useful container process metrics.
      - sourceLabels: [__name__]
        action: drop
        regex: 'container_(file_descriptors|tasks_state|threads_max)'
      # Drop container spec metrics that overlap with kube-state-metrics.
      - sourceLabels: [__name__]
        action: drop
        regex: 'container_spec.*'
      # Drop cgroup metrics with no pod.
      - sourceLabels: [id, pod]
        action: drop
        regex: '.+;'
    # - sourceLabels: [__name__, image]
    #   separator: ;
    #   regex: container_([a-z_]+);
    #   replacement: $1
    #   action: drop
    # - sourceLabels: [__name__]
    #   separator: ;
    #   regex: container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)
    #   replacement: $1
    #   action: drop

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    probesMetricRelabelings: []
    # - sourceLabels: [__name__, image]
    #   separator: ;
    #   regex: container_([a-z_]+);
    #   replacement: $1
    #   action: drop
    # - sourceLabels: [__name__]
    #   separator: ;
    #   regex: container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)
    #   replacement: $1
    #   action: drop

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    ## metrics_path is required to match upstream rules and charts
    cAdvisorRelabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    probesRelabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    resourceRelabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings: []
    # - sourceLabels: [__name__, image]
    #   separator: ;
    #   regex: container_([a-z_]+);
    #   replacement: $1
    #   action: drop
    # - sourceLabels: [__name__]
    #   separator: ;
    #   regex: container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)
    #   replacement: $1
    #   action: drop

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    ## metrics_path is required to match upstream rules and charts
    relabelings:
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar

## Component scraping the kube controller manager
##
kubeControllerManager:
  enabled: false

  ## If your kube controller manager is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints: []
  # - 10.141.4.22
  # - 10.141.4.23
  # - 10.141.4.24

  ## If using kubeControllerManager.endpoints only the port and targetPort are used
  ##
  service:
    enabled: true
    ## If null or unset, the value is determined dynamically based on target Kubernetes version due to change
    ## of default port in Kubernetes 1.22.
    ##
    port: null
    targetPort: null
    # selector:
    #   component: kube-controller-manager

  serviceMonitor:
    enabled: false
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""

    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""

    ## Enable scraping kube-controller-manager over https.
    ## Requires proper certs (not self-signed) and delegated authentication/authorization checks.
    ## If null or unset, the value is determined dynamically based on target Kubernetes version.
    ##
    https: null

    # Skip TLS certificate validation when scraping
    insecureSkipVerify: null

    # Name of the server to use when validating TLS certificate
    serverName: null

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    relabelings: []
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar
## Component scraping coreDns. Use either this or kubeDns
##
coreDns:
  enabled: false
  service:
    port: 9153
    targetPort: 9153
    # selector:
    #   k8s-app: kube-dns
  serviceMonitor:
    enabled: true
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""

    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    relabelings: []
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar

## Component scraping kubeDns. Use either this or coreDns
##
kubeDns:
  enabled: false
  service:
    dnsmasq:
      port: 10054
      targetPort: 10054
    skydns:
      port: 10055
      targetPort: 10055
    # selector:
    #   k8s-app: kube-dns
  serviceMonitor:
    enabled: false
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""

    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    relabelings: []
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    dnsmasqMetricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    dnsmasqRelabelings: []
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar

## Component scraping etcd
##
kubeEtcd:
  enabled: false

  ## If your etcd is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints: []
  # - 10.141.4.22
  # - 10.141.4.23
  # - 10.141.4.24

  ## Etcd service. If using kubeEtcd.endpoints only the port and targetPort are used
  ##
  service:
    enabled: true
    port: 2381
    targetPort: 2381
    # selector:
    #   component: etcd

  ## Configure secure access to the etcd cluster by loading a secret into prometheus and
  ## specifying security configuration below. For example, with a secret named etcd-client-cert
  ##
  ## serviceMonitor:
  ##   scheme: https
  ##   insecureSkipVerify: false
  ##   serverName: localhost
  ##   caFile: /etc/prometheus/secrets/etcd-client-cert/etcd-ca
  ##   certFile: /etc/prometheus/secrets/etcd-client-cert/etcd-client
  ##   keyFile: /etc/prometheus/secrets/etcd-client-cert/etcd-client-key
  ##
  serviceMonitor:
    enabled: false
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""
    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""
    scheme: http
    insecureSkipVerify: false
    serverName: ""
    caFile: ""
    certFile: ""
    keyFile: ""

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    relabelings: []
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar

## Component scraping kube scheduler
##
kubeScheduler:
  enabled: false

  ## If your kube scheduler is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints: []
  # - 10.141.4.22
  # - 10.141.4.23
  # - 10.141.4.24

  ## If using kubeScheduler.endpoints only the port and targetPort are used
  ##
  service:
    enabled: false
    ## If null or unset, the value is determined dynamically based on target Kubernetes version due to change
    ## of default port in Kubernetes 1.23.
    ##
    port: null
    targetPort: null
    # selector:
    #   component: kube-scheduler

  serviceMonitor:
    enabled: false
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""
    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""
    ## Enable scraping kube-scheduler over https.
    ## Requires proper certs (not self-signed) and delegated authentication/authorization checks.
    ## If null or unset, the value is determined dynamically based on target Kubernetes version.
    ##
    https: null

    ## Skip TLS certificate validation when scraping
    insecureSkipVerify: null

    ## Name of the server to use when validating TLS certificate
    serverName: null

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    relabelings: []
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar

## Component scraping kube proxy
##
kubeProxy:
  enabled: false

  ## If your kube proxy is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints: []
  # - 10.141.4.22
  # - 10.141.4.23
  # - 10.141.4.24

  service:
    enabled: true
    port: 10249
    targetPort: 10249
    # selector:
    #   k8s-app: kube-proxy

  serviceMonitor:
    enabled: false
    ## Scrape interval. If not set, the Prometheus default scrape interval is used.
    ##
    interval: ""

    ## proxyUrl: URL of a proxy that should be used for scraping.
    ##
    proxyUrl: ""

    ## Enable scraping kube-proxy over https.
    ## Requires proper certs (not self-signed) and delegated authentication/authorization checks
    ##
    https: false

    ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    metricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## RelabelConfigs to apply to samples before scraping
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    ##
    relabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]

    ## Additional labels
    ##
    additionalLabels: {}
    #  foo: bar

## Component scraping kube state metrics
##
kubeStateMetrics:
  enabled: false
  ## This option is to create a service monitor for an existing installation
  serviceMonitor:
    enabled: false
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics

## Configuration for kube-state-metrics subchart
##
kube-state-metrics:
  enabled: false
  namespaceOverride: ""
  rbac:
    create: true
  releaseLabel: true
  prometheus:
    monitor:
      enabled: true

      ## Scrape interval. If not set, the Prometheus default scrape interval is used.
      ##
      interval: ""

      ## Scrape Timeout. If not set, the Prometheus default scrape timeout is used.
      ##
      scrapeTimeout: ""

      ## proxyUrl: URL of a proxy that should be used for scraping.
      ##
      proxyUrl: ""

      # Keep labels from scraped data, overriding server-side labels
      ##
      honorLabels: true

      ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
      ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
      ##
      metricRelabelings: []
      # - action: keep
      #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
      #   sourceLabels: [__name__]

      ## RelabelConfigs to apply to samples before scraping
      ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
      ##
      relabelings: []
      # - sourceLabels: [__meta_kubernetes_pod_node_name]
      #   separator: ;
      #   regex: ^(.*)$
      #   targetLabel: nodename
      #   replacement: $1
      #   action: replace

  selfMonitor:
    enabled: false

## Deploy node exporter as a daemonset to all nodes
##
nodeExporter:
  enabled: false
  serviceMonitor:
    ## This option is to create a service monitor for an existing installation
    enabled: false
    matchLabels:
      app.kubernetes.io/name: prometheus-node-exporter

## Configuration for prometheus-node-exporter subchart
##
prometheus-node-exporter:
  namespaceOverride: ""
  podLabels:
    ## Add the 'node-exporter' label to be used by serviceMonitor to match standard common usage in rules and grafana dashboards
    ##
    jobLabel: node-exporter
  releaseLabel: true
  extraArgs:
    - --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
    - --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
  service:
    portName: http-metrics
  prometheus:
    monitor:
      enabled: true

      jobLabel: jobLabel

      ## Scrape interval. If not set, the Prometheus default scrape interval is used.
      ##
      interval: ""

      ## How long until a scrape request times out. If not set, the Prometheus default scrape timeout is used.
      ##
      scrapeTimeout: ""

      ## proxyUrl: URL of a proxy that should be used for scraping.
      ##
      proxyUrl: ""

      ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
      ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
      ##
      metricRelabelings: []
      # - sourceLabels: [__name__]
      #   separator: ;
      #   regex: ^node_mountstats_nfs_(event|operations|transport)_.+
      #   replacement: $1
      #   action: drop

      ## RelabelConfigs to apply to samples before scraping
      ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
      ##
      relabelings: []
      # - sourceLabels: [__meta_kubernetes_pod_node_name]
      #   separator: ;
      #   regex: ^(.*)$
      #   targetLabel: nodename
      #   replacement: $1
      #   action: replace
  rbac:
    ## If true, create PSPs for node-exporter
    ##
    pspEnabled: false
```

swiatekm commented 1 year ago

As per the documentation, serviceMonitorSelector is a simple map of label name to label value, so what you did here:

      serviceMonitorSelector:
        matchLabels:
          "lightstep-metrics-collector"
         #"translation-service-monitor"
          # "document-translation"
          # "nmt-engines-monitor"
          # "document-translation-servicemonitor"
          # "lightstep-operator"
          # "lightstep-metrics-collector"

won't work. I can't exactly tell what you want to do here, but the following will work:

      serviceMonitorSelector:
        app: document-translation

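Because this is a plain map, every key/value pair has to be present on a ServiceMonitor for it to match, so to pick up several ServiceMonitors today you would give all of them a shared label and select on that. A minimal sketch, assuming an illustrative label key/value (scrape-target: lightstep-metrics is not something the chart or operator defines):

    # 1. Put the same label on every ServiceMonitor you want scraped
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: document-translation
      labels:
        scrape-target: lightstep-metrics   # shared, illustrative label
    spec:
      selector:
        matchLabels:
          app: document-translation
      endpoints:
        - port: metrics
    ---
    # 2. Select on that one shared label in the collector values
    targetallocator:
      prometheusCR:
        enabled: true
        serviceMonitorSelector:
          scrape-target: lightstep-metrics
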
I do agree that this is limiting, and we should consider using the standard selector mechanism just as prometheus-operator does. WDYT @jaronoff97?
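
For reference, prometheus-operator exposes serviceMonitorSelector as a standard Kubernetes label selector, which supports matchExpressions in addition to matchLabels; a sketch of that shape (the label keys and values are illustrative):

    serviceMonitorSelector:
      matchLabels:
        team: machine-translation
      matchExpressions:
        - key: app
          operator: In
          values:
            - document-translation
            - nmt-engines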

jaronoff97 commented 1 year ago

Yep, this is an oversight on my part; my initial implementation was a stopgap solution for someone at the time, and I'm not sure why I didn't go with what Prometheus does. This is a good first issue for anyone who wants to take it... We can support both implementations, deprecating the old one and eventually removing it.

jaronoff97 commented 11 months ago

This will result in a breaking change, so it is included in our v2 milestone.

Toaddyan commented 10 months ago

Has this been completed in the v2 version? Is there anywhere I can pick up from?

jaronoff97 commented 10 months ago

@Toaddyan not yet, but I think @yuriolisa is working on this?

Toaddyan commented 10 months ago

sounds good. lmk!

yuriolisa commented 9 months ago

@Toaddyan, this issue has actually been resolved by @swiatekm-sumo in #2564.

swiatekm commented 9 months ago

@yuriolisa that change is only internal for now; this issue will be resolved in the v1alpha2 CRDs.
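
Assuming the v1alpha2 CRDs adopt the standard label-selector shape discussed above, the target allocator section might end up looking roughly like this (field placement and names are an assumption based on this thread, not the final API):

    targetAllocator:
      prometheusCR:
        enabled: true
        serviceMonitorSelector:
          matchLabels:
            lightstep: metrics-collector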