Not getting custom metrics from Prometheus endpoint

BartoszZawadzki commented 4 years ago

I have an app that runs inside a single container (and a single pod). There's also a Prometheus instance running inside that container that picks up metrics from various app processes.

I've deployed New Relic with nri-prometheus, and I get all the infrastructure metrics, events, logs and so on. However, I do not get any custom metrics from my Prometheus instance.

Logs:

nri-prometheus-75b664b985-vfptw nri-prometheus time="2020-04-29T11:53:24Z" level=debug msg="fetching URL: {http 100.96.2.226:21090 /federate false }" component=Fetcher target=pzu-fwserver-sts-0

Pod annotations:

prometheus.io/path: /federate
prometheus.io/port: "21090"
prometheus.io/scrape: "true"

Config:

---
apiVersion: v1
data:
  config.yaml: |
    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 30s
        honor_labels: true
        metrics_path: '/federate'

        params:
          'match[]':
            - '{job="prometheus"}'
            - '{__name__=~"job:.*"}'

    # The name of your cluster. It's important to match other New Relic products to relate the data.
    cluster_name: "k8s-dev.XXX.XXX"

    # How often the integration should run. Defaults to 30s.
    # scrape_duration: "30s"

    # The HTTP client timeout when fetching data from endpoints. Defaults to 5s.
    # scrape_timeout: "5s"

    # Whether the integration should run in verbose mode or not. Defaults to false.
    verbose: true

    # Whether the integration should skip TLS verification or not. Defaults to false.
    insecure_skip_verify: true

    # The label used to identify scrapable targets. Defaults to "prometheus.io/scrape".
    scrape_enabled_label: "prometheus.io/scrape"

    # Whether k8s nodes need to be labelled to be scraped or not. Defaults to true.
    require_scrape_enabled_label_for_nodes: true

    # targets:
    #   - description: Secure etcd example
    #     urls: ["https://192.168.3.1:2379", "https://192.168.3.2:2379", "https://192.168.3.3:2379"]
    #     tls_config:
    #       ca_file_path: "/etc/etcd/etcd-client-ca.crt"
    #       cert_file_path: "/etc/etcd/etcd-client.crt"
    #       key_file_path: "/etc/etcd/etcd-client.key"

    # Proxy to be used by the emitters when submitting metrics. It should be
    # in the format [scheme]://[domain]:[port].
    # The emitter is the component in charge of sending the scraped metrics.
    # This proxy won't be used when scraping metrics from the targets.
    # By default it's empty, meaning that no proxy will be used.
    # emitter_proxy: "http://localhost:8888"

    # Certificate to add to the root CA that the emitter will use when
    # verifying server certificates.
    # If left empty, TLS uses the host's root CA set.
    # emitter_ca_file: "/path/to/cert/server.pem"

    # Whether the emitter should skip TLS verification when submitting data.
    # Defaults to false.
    # emitter_insecure_skip_verify: false

    # Histogram support is based on New Relic's guidelines for higher
    # level metrics abstractions https://github.com/newrelic/newrelic-exporter-specs/blob/master/Guidelines.md.
    # To better support visualization of this data, percentiles are calculated
    # based on the histogram metrics and sent to New Relic.
    # By default, the following percentiles are calculated: 50, 95 and 99.
    #
    # percentiles:
    #   - 50
    #   - 95
    #   - 99

    # transformations:
    #   - description: "General processing rules"
    #     rename_attributes:
    #       - metric_prefix: ""
    #         attributes:
    #           container_name: "containerName"
    #           pod_name: "podName"
    #           namespace: "namespaceName"
    #           node: "nodeName"
    #           container: "containerName"
    #           pod: "podName"
    #           deployment: "deploymentName"
    #     ignore_metrics:
    #       # Ignore all the metrics except the ones listed below.
    #       # This is a list that complements the data retrieved by the New
    #       # Relic Kubernetes Integration, that's why Pods and containers are
    #       # not included, because they are already collected by the
    #       # Kubernetes Integration.
    #       - except:
    #         - kube_hpa_
    #         - kube_daemonset_
    #         - kube_statefulset_
    #         - kube_endpoint_
    #         - kube_service_
    #         - kube_limitrange
    #         - kube_node_
    #         - kube_poddisruptionbudget_
    #         - kube_resourcequota
    #         - nr_stats
    #     copy_attributes:
    #       # Copy all the labels from the timeseries with metric name
    #       # `kube_hpa_labels` into every timeseries with a metric name that
    #       # starts with `kube_hpa_` only if they share the same `namespace`
    #       # and `hpa` labels.
    #       - from_metric: "kube_hpa_labels"
    #         to_metrics: "kube_hpa_"
    #         match_by:
    #           - namespace
    #           - hpa
    #       - from_metric: "kube_daemonset_labels"
    #         to_metrics: "kube_daemonset_"
    #         match_by:
    #           - namespace
    #           - daemonset
    #       - from_metric: "kube_statefulset_labels"
    #         to_metrics: "kube_statefulset_"
    #         match_by:
    #           - namespace
    #           - statefulset
    #       - from_metric: "kube_endpoint_labels"
    #         to_metrics: "kube_endpoint_"
    #         match_by:
    #           - namespace
    #           - endpoint
    #       - from_metric: "kube_service_labels"
    #         to_metrics: "kube_service_"
    #         match_by:
    #           - namespace
    #           - service
    #       - from_metric: "kube_node_labels"
    #         to_metrics: "kube_node_"
    #         match_by:
    #           - namespace
    #           - node
kind: ConfigMap
metadata:
  name: nri-prometheus-cfg
  namespace: monitoring

Deployment:


apiVersion: apps/v1
kind: Deployment
metadata:
  name: nri-prometheus
  namespace: monitoring
  labels:
    app: nri-prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nri-prometheus
  template:
    metadata:
      labels:
        app: nri-prometheus
        prometheus.io/scrape: "true"
    spec:
      serviceAccountName: nri-prometheus
      containers:
        - name: nri-prometheus
          image: newrelic/nri-prometheus:1.3.0
          args:
            - "--configfile=/etc/nri-prometheus/config.yaml"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: config-volume
              mountPath: /etc/nri-prometheus/
          env:
            - name: "LICENSE_KEY"
              value: "XXX"
            - name: "BEARER_TOKEN_FILE"
              value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
            - name: "CA_FILE"
              value: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
      volumes:
        - name: config-volume
          configMap:
            name: nri-prometheus-cfg

j0sh3rs commented 4 years ago

+1, I'm also seeing this issue: query params are not properly passed or used from the config. In my case, I'm trying to pull in the Prometheus metrics from Vault (https://www.vaultproject.io/docs/configuration/telemetry#prometheus).

nri-prometheus does not respect the Prometheus params block in config.yaml, and it also does not correctly handle a query string attached to the prometheus.io/path annotation, because it URL-encodes the path (which may be a Prometheus problem).
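As an illustration of the failure mode described above: when a path that carries a query string is percent-encoded as a whole, the `?` and `=` are escaped and the server receives one long path instead of a path plus parameters. The sketch below uses only Python's standard library. The Vault path `/v1/sys/metrics?format=prometheus` is an assumed annotation value, and nothing here is a claim about how nri-prometheus builds its URLs internally.

```python
from urllib.parse import quote, urlsplit

# Assumed annotation value (hypothetical for this thread): Vault's metrics
# endpoint with the query string that selects the Prometheus format.
annotation_path = "/v1/sys/metrics?format=prometheus"

# Percent-encoding the whole string as if it were only a path escapes the
# "?" and "=" characters, so the query string is folded into the path.
encoded_as_path = quote(annotation_path)

# Splitting first keeps the path and the query as separate components.
parts = urlsplit(annotation_path)

if __name__ == "__main__":
    print("encoded as a path:", encoded_as_path)
    print("path component:   ", parts.path)
    print("query component:  ", parts.query)
```

A scraper that wants to accept a query string in such an annotation would need to split the value before encoding, as the second half of the sketch does.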

alejandrodnm commented 4 years ago

Hi @BartoszZawadzki, sorry for the late response.

To make this easier to debug, could you please share an example of the metrics returned by:

http://100.96.2.226:21090/federate

Please also point out which metrics are missing, so that we can try to reproduce the issue.
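One way to capture that sample is to query the endpoint directly from a pod that can reach it. The following is a minimal sketch, assuming the host and port from the debug log and the `match[]` selectors from the posted config; the helper names `build_federate_url` and `fetch_sample` are made up for illustration. Prometheus's `/federate` endpoint expects at least one `match[]` selector, so the sketch adds them as query parameters.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_federate_url(host, port, selectors, path="/federate"):
    """Build a federation URL with one match[] parameter per selector."""
    query = urlencode([("match[]", s) for s in selectors])
    return f"http://{host}:{port}{path}?{query}"


def fetch_sample(url, max_lines=20, timeout=5):
    """Return the first few lines of the exposition text served at url."""
    with urlopen(url, timeout=timeout) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    return text.splitlines()[:max_lines]


if __name__ == "__main__":
    # Host, port, and selectors are taken from the log line and config
    # earlier in this thread; adjust them for your own target.
    url = build_federate_url(
        "100.96.2.226",
        21090,
        ['{job="prometheus"}', '{__name__=~"job:.*"}'],
    )
    print(url)
    for line in fetch_sample(url):
        print(line)
```

Comparing this output with a request to the bare `/federate` path (no parameters) may help show whether the missing metrics depend on the selectors being sent.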

Also, we just released a version that fixes some issues with missing counter metrics; it might be worth trying it first.

One last thing: in your config you have

    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 30s
        honor_labels: true
        metrics_path: '/federate'

        params:
          'match[]':
            - '{job="prometheus"}'
            - '{__name__=~"job:.*"}'

That's not something the integration supports.

@j0sh3rs we released a new version that fixes the problem with the encodings in the prometheus.io/path annotation.

newrelic / nri-prometheus

Not getting custom metrics from Prometheus endpoint #44