pryorda / vmware_exporter

VMWare vCenter Exporter for Prometheus
BSD 3-Clause "New" or "Revised" License

Boot time metric doesn't show all running VMs #355

Open gprakosa opened 1 year ago

gprakosa commented 1 year ago

Hi, I have 73 VMs running, but only 47 of them have a boot time value. Has anyone else experienced the same issue?

Both the exporter and Prometheus run on k8s 1.25.4. The exporter uses image tag v0.18.4 with the following configuration:

data:
  ...
  VSPHERE_IGNORE_SSL: "True"
  VSPHERE_COLLECT_HOSTS: "True"
  VSPHERE_COLLECT_DATASTORES: "False"
  VSPHERE_COLLECT_VMS: "True"
  VSPHERE_COLLECT_VMGUESTS: "False"
  VSPHERE_COLLECT_SNAPSHOTS: "False"
  VSPHERE_FETCH_ALARMS: "True"


lethargosapatheia commented 1 year ago

I've just discovered I have the exact same problem. The problem occurs when I migrate virtual machines to other nodes. The metrics simply disappear.

karakumium commented 1 year ago

I've got the same issue

endresjo commented 1 year ago

Same problem here, but with the metric "vmware_vm_disk_usage_average". It also appeared after doing vMotion on VMs during maintenance.

vCenter version 7.0.3.01200.
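To narrow down which VMs are affected, it may help to compare the set of VMs that report one metric against the set that reports another. Below is a minimal sketch that does this on the raw text of the exporter's /metrics output, using only the Python standard library. The metric names `vmware_vm_power_state` and `vmware_vm_boot_timestamp_seconds`, the `vm_name` label, and the sample data are assumptions for illustration; adjust them to whatever your exporter actually exposes.

```python
import re

# Matches one sample line of the Prometheus text exposition format, e.g.
#   metric_name{label="value",...} 123.0
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)\{(.*)\}\s+\S+')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def vms_with_metric(metrics_text, metric, label="vm_name"):
    """Return the set of `label` values that have a sample for `metric`."""
    found = set()
    for line in metrics_text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        m = SAMPLE_RE.match(line)
        if not m or m.group(1) != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group(2)))
        if label in labels:
            found.add(labels[label])
    return found


def missing_vms(metrics_text, reference_metric, checked_metric, label="vm_name"):
    """VMs that report `reference_metric` but have no `checked_metric` sample."""
    reference = vms_with_metric(metrics_text, reference_metric, label)
    checked = vms_with_metric(metrics_text, checked_metric, label)
    return sorted(reference - checked)


# Hypothetical sample output; real data would come from the exporter endpoint.
SAMPLE = '''\
# HELP vmware_vm_power_state VMWare VM Power state (On / Off)
vmware_vm_power_state{vm_name="web-01",host_name="esx-a"} 1.0
vmware_vm_power_state{vm_name="web-02",host_name="esx-b"} 1.0
vmware_vm_boot_timestamp_seconds{vm_name="web-01",host_name="esx-a"} 1.6e+09
'''

if __name__ == "__main__":
    print(missing_vms(SAMPLE, "vmware_vm_power_state",
                      "vmware_vm_boot_timestamp_seconds"))
```

Note that this sketch does not look at sample values, so powered-off VMs would also show up as missing a boot time. Running it before and after a vMotion would show whether the migrated VMs are the ones that drop out.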

lethargosapatheia commented 1 year ago

The project seems to have become rather stale, unfortunately. At least that's my feeling. The problem is that there aren't many other solutions. I'll probably have to find a way to actually understand what's happening and make some changes. Hopefully it isn't overly complicated.

endresjo commented 1 year ago

I also tried reading vSphere metrics with a Telegraf Docker container, using the VMware vSphere Telegraf plugin and this output config to expose Prometheus values:

[[outputs.prometheus_client]]
listen = ":9126"
metric_version = 2
path = "/metrics"
string_as_label = true
export_timestamp = true

Having some more metrics with telegraf I also noticed:

Maybe this helps. I haven't looked into the code yet. For the moment I have switched to Telegraf as the metrics exporter for VMware/vSphere. Scrape performance also seems to be comparable (scrape durations: vmware_exporter 613.565 ms vs. Telegraf 476.650 ms).