GoogleCloudPlatform / prometheus-engine

Google Cloud Managed Service for Prometheus libraries and manifests.
https://g.co/cloud/managedprometheus
Apache License 2.0

fix: use clusterpodmonitoring for nvidia-dcgm #1002

Closed: pintohutch closed this 3 months ago

pintohutch commented 3 months ago

The NVIDIA dcgm-exporter already captures and records the "pod", "namespace", and "container" labels on its metrics. We should honor those labels in our relabeling.

Hence, we use a ClusterPodMonitoring with an empty .targetLabels.metadata (akin to what we do for kube-state-metrics, KSM) to preserve them.
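
For illustration, a minimal sketch of what such a resource might look like. The resource name, selector label, port name, and scrape interval below are assumptions made for the example, not values taken from this PR; only the kind and the empty `targetLabels.metadata` reflect the change described above.

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: ClusterPodMonitoring
metadata:
  # Hypothetical name, chosen for this example.
  name: nvidia-dcgm-exporter
spec:
  selector:
    matchLabels:
      # Assumed pod label; match whatever your dcgm-exporter pods actually carry.
      app.kubernetes.io/name: nvidia-dcgm-exporter
  endpoints:
  # Assumed port name and interval.
  - port: metrics
    interval: 30s
  targetLabels:
    # Explicitly empty so the exporter's own "pod", "namespace", and
    # "container" labels are kept rather than replaced by target metadata.
    metadata: []
```

The empty list is the important part: if the field were left unset, the default metadata labels would presumably be attached to each target and could collide with the labels the exporter reports.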

pintohutch commented 3 months ago

cc @huygaa11