NetApp / harvest

Open-metrics endpoint for ONTAP and StorageGRID
https://netapp.github.io/harvest/latest
Apache License 2.0

24.08 update - missing ton of metrics #3109

Closed: db-wally007 closed this issue 4 weeks ago

db-wally007 commented 4 weeks ago

Problem

After the upgrade, a lot of our custom dashboards are missing data. After digging into it, I noticed all Snapshot, SVM-DR, S3 and Volume capacity metrics are missing.

No config was changed; we only rolled out a new 24.08 container to replace 24.05 (to fix the analytics log errors). I also imported the Harvest dashboards, and those are missing data as well.

Grafana import: bin/harvest grafana import -a https://grafana.ref:3000 -p netapp_

Container build:

FROM registry.access.redhat.com/ubi9/ubi

USER 0

ENV HARVEST_DOCKER=yes \
    SUMMARY="RHEL UBI9 platform for running Harvest by Netapp" \
    VAULT_ADDR=https://vault.ref:8200 \
    HTTP_PROXY=http://192.168.123.254:8080 \
    http_proxy=http://192.168.123.254:8080 \
    HTTPS_PROXY=http://192.168.123.254:8080 \
    https_proxy=http://192.168.123.254:8080 \
    NO_PROXY=127.0.0.1,localhost,.ref \
    no_proxy=127.0.0.1,localhost,.ref

LABEL summary="${SUMMARY}" \
      name="ubi9-netapp-harvest"

# List of packages to install
RUN dnf install -y jq hostname && dnf clean all

# Configuration of the container
COPY refinst_ca.crt /etc/pki/ca-trust/source/anchors/refinst_ca.crt
RUN update-ca-trust
# Install the latest release of NetApp Harvest
RUN mkdir -p /opt/harvest && \
    curl -L -o /opt/harvest.tar.gz $(curl -s https://api.github.com/repos/NetApp/harvest/releases/latest | jq -r ".assets[] | select(.name | test(\"tar.gz\")) | .browser_download_url") && \
    tar -xf /opt/harvest.tar.gz -C /opt/harvest --strip-components=1
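
The build above resolves `releases/latest` at build time, so the Harvest version inside the image depends on when it was built. The selection logic of the jq filter can be sketched in Python as follows. This is only an illustration: the `select_tarball_urls` helper, the tag, and the asset names and URLs in the sample are hypothetical, not real release data.

```python
import re


def select_tarball_urls(release: dict) -> list[str]:
    """Mirror of: .assets[] | select(.name | test("tar.gz")) | .browser_download_url"""
    # jq's test() is a regex search, so the unescaped dot matches any character.
    pattern = re.compile("tar.gz")
    return [
        asset["browser_download_url"]
        for asset in release["assets"]
        if pattern.search(asset["name"])
    ]


# Hypothetical release payload, shaped like the GitHub releases API response.
release = {
    "tag_name": "v0.0.0",
    "assets": [
        {
            "name": "harvest-example_linux_amd64.tar.gz",
            "browser_download_url": "https://example.invalid/harvest-example_linux_amd64.tar.gz",
        },
        {
            "name": "harvest-example.rpm",
            "browser_download_url": "https://example.invalid/harvest-example.rpm",
        },
    ],
}

urls = select_tarball_urls(release)
print(urls)
```

The sketch returns a list so that the case of several matching assets stays visible; in the Dockerfile, the command substitution would then hand more than one URL to `curl`.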

Harvest config

---

Exporters:
  troy:
    exporter: Prometheus
    local_http_addr: 0.0.0.0
    port: 12990
    global_prefix: netapp_
  agora:
    exporter: Prometheus
    local_http_addr: 0.0.0.0
    port: 12991
    global_prefix: netapp_

Pollers:
  troy:
    datacenter: EQX
    addr: troy-cluster.deutsche-boerse.de
    auth_style: basic_auth
    username: $__env{NETAPP_HARVEST_READONLY_USERNAME}
    password: $__env{NETAPP_HARVEST_READONLY_PASSWORD}
    use_insecure_tls: true
    exporters:
      - troy
    collectors:
      - Rest
      - RestPerf
      - Ems
  agora:
    datacenter: EQX
    addr: agora-cluster.deutsche-boerse.de
    auth_style: basic_auth
    username: $__env{NETAPP_HARVEST_READONLY_USERNAME}
    password: $__env{NETAPP_HARVEST_READONLY_PASSWORD}
    use_insecure_tls: true
    exporters:
      - agora
    collectors:
      - Rest
      - RestPerf
      - Ems

Example metric that is missing after the upgrade to 24.08:

[Image: harvest]

Configuration

No response

Poller

both

Version

24.08

Poller logs

root@podman1:~>podman logs netapp-harvest-exporter
  Datacenter | Poller | PID | PromPort | Status
-------------+--------+-----+----------+----------
  EQX        | troy   |   7 |    12990 | running
  EQX        | agora  |   8 |    12991 | running
root@podman1:~>

OS and platform

RHEL 9.4, podman

ONTAP or StorageGRID version

NetApp ONTAP 9.13.1P7

Additional Context

No response

References

No response

db-wally007 commented 4 weeks ago

Please ignore. The issue was Prometheus dropping metrics.
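
For anyone who hits the same symptom, one way to tell a Harvest problem from a Prometheus-side drop is to read the poller's own endpoint and count the metric families it exposes. A minimal sketch, assuming the pollers from the config above serve the Prometheus text format at `/metrics` on ports 12990 and 12991 with the `netapp_` prefix; the `count_families` and `fetch_metrics` helpers and the sample metric names are hypothetical and used only for illustration.

```python
import re
from collections import Counter
from urllib.request import urlopen

# Matches the metric name at the start of a sample line,
# e.g. netapp_volume_size_total{volume="vol1"} 1024
SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def count_families(exposition: str, prefix: str = "netapp_") -> Counter:
    """Count samples per object family (volume, snapmirror, ...) in exposition text."""
    counts = Counter()
    for line in exposition.splitlines():
        if not line or line.startswith("#"):  # skip blank, HELP and TYPE lines
            continue
        match = SAMPLE_NAME.match(line)
        if match is None or not match.group(1).startswith(prefix):
            continue
        # Family is the first token after the prefix: netapp_volume_size_total -> volume
        family = match.group(1)[len(prefix):].split("_", 1)[0]
        counts[family] += 1
    return counts


def fetch_metrics(url: str) -> str:
    """Read the exporter endpoint directly, bypassing Prometheus."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


# Hypothetical sample, not real poller output.
sample = """\
# HELP netapp_volume_size_total illustrative help text
netapp_volume_size_total{volume="vol1"} 1024
netapp_volume_size_used{volume="vol1"} 512
netapp_snapmirror_lag_time{relationship_id="abc"} 30
"""

family_counts = count_families(sample)
print(family_counts)
```

Against a live poller this would be called as `count_families(fetch_metrics("http://localhost:12990/metrics"))`. If the volume or snapshot families show up there but not in Grafana, the loss is downstream of Harvest, for example in the Prometheus scrape job (a `sample_limit` or a `metric_relabel_configs` rule are two places worth checking).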

cgrinds commented 4 weeks ago

Glad to hear you got it sorted out @db-wally007