newrelic / nri-prometheus

Fetch metrics in the Prometheus metrics format, inside or outside Kubernetes, and send them to the New Relic Metrics platform.
Apache License 2.0

Not getting custom metrics from Prometheus endpoint #44

Closed: BartoszZawadzki closed this issue 2 years ago

BartoszZawadzki commented 4 years ago

I have an app that runs inside a single container (and a single pod). There's also a Prometheus instance running inside that container that picks up metrics from various app processes.

I've deployed New Relic with nri-prometheus, and in general I get all the infrastructure metrics, events, logs and so on. However, I do not get any custom metrics from my Prometheus instance.

Logs:

nri-prometheus-75b664b985-vfptw nri-prometheus time="2020-04-29T11:53:24Z" level=debug msg="fetching URL: {http 100.96.2.226:21090 /federate false }" component=Fetcher target=pzu-fwserver-sts-0

Pod annotations:

prometheus.io/path: /federate
prometheus.io/port: "21090"
prometheus.io/scrape: "true"

Config:

---
apiVersion: v1
data:
  config.yaml: |
    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 30s
        honor_labels: true
        metrics_path: '/federate'

        params:
          'match[]':
            - '{job="prometheus"}'
            - '{__name__=~"job:.*"}'

    # The name of your cluster. It's important to match other New Relic products to relate the data.
    cluster_name: "k8s-dev.XXX.XXX"

    # How often the integration should run. Defaults to 30s.
    # scrape_duration: "30s"

    # The HTTP client timeout when fetching data from endpoints. Defaults to 5s.
    # scrape_timeout: "5s"

    # Whether the integration should run in verbose mode or not. Defaults to false.
    verbose: true

    # Whether the integration should skip TLS verification or not. Defaults to false.
    insecure_skip_verify: true

    # The label used to identify scrapable targets. Defaults to "prometheus.io/scrape".
    scrape_enabled_label: "prometheus.io/scrape"

    # Whether k8s nodes need to be labelled to be scraped or not. Defaults to true.
    require_scrape_enabled_label_for_nodes: true

    # targets:
    #   - description: Secure etcd example
    #     urls: ["https://192.168.3.1:2379", "https://192.168.3.2:2379", "https://192.168.3.3:2379"]
    #     tls_config:
    #       ca_file_path: "/etc/etcd/etcd-client-ca.crt"
    #       cert_file_path: "/etc/etcd/etcd-client.crt"
    #       key_file_path: "/etc/etcd/etcd-client.key"

    # Proxy to be used by the emitters when submitting metrics. It should be
    # in the format [scheme]://[domain]:[port].
    # The emitter is the component in charge of sending the scraped metrics.
    # This proxy won't be used when scraping metrics from the targets.
    # By default it's empty, meaning that no proxy will be used.
    # emitter_proxy: "http://localhost:8888"

    # Certificate to add to the root CA that the emitter will use when
    # verifying server certificates.
    # If left empty, TLS uses the host's root CA set.
    # emitter_ca_file: "/path/to/cert/server.pem"

    # Whether the emitter should skip TLS verification when submitting data.
    # Defaults to false.
    # emitter_insecure_skip_verify: false

    # Histogram support is based on New Relic's guidelines for higher
    # level metrics abstractions https://github.com/newrelic/newrelic-exporter-specs/blob/master/Guidelines.md.
    # To better support visualization of this data, percentiles are calculated
    # based on the histogram metrics and sent to New Relic.
    # By default, the following percentiles are calculated: 50, 95 and 99.
    #
    # percentiles:
    #   - 50
    #   - 95
    #   - 99

    # transformations:
    #   - description: "General processing rules"
    #     rename_attributes:
    #       - metric_prefix: ""
    #         attributes:
    #           container_name: "containerName"
    #           pod_name: "podName"
    #           namespace: "namespaceName"
    #           node: "nodeName"
    #           container: "containerName"
    #           pod: "podName"
    #           deployment: "deploymentName"
    #     ignore_metrics:
    #       # Ignore all the metrics except the ones listed below.
    #       # This is a list that complements the data retrieved by the New
    #       # Relic Kubernetes Integration, that's why Pods and containers are
    #       # not included, because they are already collected by the
    #       # Kubernetes Integration.
    #       - except:
    #         - kube_hpa_
    #         - kube_daemonset_
    #         - kube_statefulset_
    #         - kube_endpoint_
    #         - kube_service_
    #         - kube_limitrange
    #         - kube_node_
    #         - kube_poddisruptionbudget_
    #         - kube_resourcequota
    #         - nr_stats
    #     copy_attributes:
    #       # Copy all the labels from the timeseries with metric name
    #       # `kube_hpa_labels` into every timeseries with a metric name that
    #       # starts with `kube_hpa_` only if they share the same `namespace`
    #       # and `hpa` labels.
    #       - from_metric: "kube_hpa_labels"
    #         to_metrics: "kube_hpa_"
    #         match_by:
    #           - namespace
    #           - hpa
    #       - from_metric: "kube_daemonset_labels"
    #         to_metrics: "kube_daemonset_"
    #         match_by:
    #           - namespace
    #           - daemonset
    #       - from_metric: "kube_statefulset_labels"
    #         to_metrics: "kube_statefulset_"
    #         match_by:
    #           - namespace
    #           - statefulset
    #       - from_metric: "kube_endpoint_labels"
    #         to_metrics: "kube_endpoint_"
    #         match_by:
    #           - namespace
    #           - endpoint
    #       - from_metric: "kube_service_labels"
    #         to_metrics: "kube_service_"
    #         match_by:
    #           - namespace
    #           - service
    #       - from_metric: "kube_node_labels"
    #         to_metrics: "kube_node_"
    #         match_by:
    #           - namespace
    #           - node
kind: ConfigMap
metadata:
  name: nri-prometheus-cfg
  namespace: monitoring

Deployment:


apiVersion: apps/v1
kind: Deployment
metadata:
  name: nri-prometheus
  namespace: monitoring
  labels:
    app: nri-prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nri-prometheus
  template:
    metadata:
      labels:
        app: nri-prometheus
        prometheus.io/scrape: "true"
    spec:
      serviceAccountName: nri-prometheus
      containers:
        - name: nri-prometheus
          image: newrelic/nri-prometheus:1.3.0
          args:
            - "--configfile=/etc/nri-prometheus/config.yaml"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: config-volume
              mountPath: /etc/nri-prometheus/
          env:
            - name: "LICENSE_KEY"
              value: "XXX"
            - name: "BEARER_TOKEN_FILE"
              value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
            - name: "CA_FILE"
              value: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
      volumes:
        - name: config-volume
          configMap:
            name: nri-prometheus-cfg
j0sh3rs commented 4 years ago

+1, I'm also seeing this issue: query params are not properly passed or used from the config. In my case, I'm trying to pull in the Prometheus metrics from Vault (https://www.vaultproject.io/docs/configuration/telemetry#prometheus).

nri-prometheus does not respect the Prometheus params block in config.yaml. It also does not correctly parse query params attached to the prometheus.io/path annotation, because it URL-encodes the path (which may be a Prometheus problem).
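
To illustrate the failure mode described here, the following is a minimal Python sketch of what happens when a path that carries a query string is percent-encoded as a whole. It is not nri-prometheus code; the annotation value is a hypothetical example, and the sketch only shows the general effect using the standard library.

```python
from urllib.parse import quote, urlsplit

# Hypothetical annotation value: a path with a federation selector attached.
annotation_path = '/federate?match[]={job="prometheus"}'

# Encoding the whole value as a path also percent-encodes the "?",
# so the server sees one long path and no query string at all.
encoded_as_path = quote(annotation_path)

# Splitting first keeps the path and the query string separate.
parts = urlsplit(annotation_path)

print(encoded_as_path)
print(parts.path)
print(parts.query)
```

In the first printed line the "?" has become "%3F", which is why a server would no longer treat anything after it as parameters.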

alejandrodnm commented 4 years ago

Hi @BartoszZawadzki, sorry for the late response.

To make this easier to debug, could you please share an example of the metrics returned by:

http 100.96.2.226:21090 /federate

Please also point out which metrics are missing, so that we can try to reproduce the issue.
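
One way to gather that information is to list the metric names the endpoint actually exposes and compare them with what shows up in New Relic. The following is a rough Python sketch under stated assumptions: `metric_names` and `fetch` are hypothetical helpers written for this thread, and the sample text is made up rather than taken from the reporter's endpoint.

```python
import urllib.request


def metric_names(exposition_text):
    """Return the sorted metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return sorted(names)


def fetch(url, timeout=5):
    """Fetch the raw text served by a metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Made-up sample, not real output from the endpoint in this issue.
sample = """\
# HELP app_requests_total Hypothetical custom counter.
# TYPE app_requests_total counter
app_requests_total{process="worker"} 12
go_goroutines 42
"""
print(metric_names(sample))
```

From a pod that can reach the target, `metric_names(fetch(url))` with the endpoint's URL would give the list to compare against.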

Also, we just released a version that fixes some issues with missing counter metrics; it might be worth trying it first.

One last thing: in your config you have

    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 30s
        honor_labels: true
        metrics_path: '/federate'

        params:
          'match[]':
            - '{job="prometheus"}'
            - '{__name__=~"job:.*"}'

That's not something the integration supports.
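
For context, in Prometheus itself a `params` block like the one quoted is sent as URL query parameters, and the Prometheus federation documentation states that at least one `match[]` parameter must be given. The Python sketch below builds the equivalent request URL so that the endpoint can be checked by hand. The address is the one from the debug log earlier in this issue; whether nri-prometheus accepts a query string in a target URL is not confirmed anywhere in this thread, so treat this as a debugging aid only.

```python
from urllib.parse import urlencode

# Address taken from the debug log line earlier in this issue.
base = "http://100.96.2.226:21090/federate"

# Selectors copied from the params block in the quoted config.
selectors = ['{job="prometheus"}', '{__name__=~"job:.*"}']

# doseq=True repeats the key once per selector: match[]=...&match[]=...
query = urlencode({"match[]": selectors}, doseq=True)
federate_url = f"{base}?{query}"

print(federate_url)
```

Requesting the printed URL (for example with curl from inside the cluster) shows what the federation endpoint returns when the selectors are present.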

@j0sh3rs, we released a new version that fixes the problem with the encoding of the prometheus.io/path annotation.