canonical / prometheus-juju-exporter

GNU General Public License v3.0

Missing machines in the exported data #24

Closed by przemeklal 1 year ago

przemeklal commented 1 year ago

The exporter returned only 2 LXDs for a model containing 63 LXDs and 38 KVMs/metals.

$ curl http://10.35.101.176:5000/metrics
# HELP juju_machine_state Running status of juju machines
# TYPE juju_machine_state gauge
juju_machine_state{cloud_name="redacted",customer="redacted",hostname="juju-bionic-1",job="prometheus-juju-exporter",juju_model="controller",type="kvm"} 1.0
juju_machine_state{cloud_name="redacted",customer="redacted",hostname="juju-bionic-2",job="prometheus-juju-exporter",juju_model="controller",type="kvm"} 1.0
juju_machine_state{cloud_name="redacted",customer="redacted",hostname="juju-bionic-3",job="prometheus-juju-exporter",juju_model="controller",type="kvm"} 1.0
juju_machine_state{cloud_name="redacted",customer="redacted",hostname="juju-3fae13-23-lxd-1",job="prometheus-juju-exporter",juju_model="openstack",type="lxd"} 1.0
juju_machine_state{cloud_name="redacted",customer="redacted",hostname="juju-3fae13-46-lxd-3",job="prometheus-juju-exporter",juju_model="openstack",type="lxd"} 1.0

I tried re-deploying, changing juju user permissions, and using all 3 juju controller IPs but the result was always the same.
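One way to quantify the gap is to count the exported `juju_machine_state` series per model and type and compare the result with `juju status`. The following is a minimal sketch, not part of the exporter: the function name `count_series` is hypothetical, and the sample text is a trimmed copy of the output above with most labels dropped.

```python
import re
from collections import Counter

# Matches one juju_machine_state sample line and captures its label block.
SERIES = re.compile(r'^juju_machine_state\{(?P<labels>.*)\}\s')
# Matches a single label pair such as type="lxd".
LABEL = re.compile(r'(\w+)="([^"]*)"')


def count_series(metrics_text: str) -> Counter:
    """Count juju_machine_state series per (juju_model, type) pair."""
    counts: Counter = Counter()
    for line in metrics_text.splitlines():
        match = SERIES.match(line)
        if match is None:
            continue  # skips "# HELP", "# TYPE" and unrelated metrics
        labels = dict(LABEL.findall(match.group("labels")))
        counts[(labels.get("juju_model"), labels.get("type"))] += 1
    return counts


# Trimmed sample; labels other than hostname, juju_model and type are omitted.
SAMPLE = """\
# HELP juju_machine_state Running status of juju machines
# TYPE juju_machine_state gauge
juju_machine_state{hostname="juju-bionic-1",juju_model="controller",type="kvm"} 1.0
juju_machine_state{hostname="juju-3fae13-23-lxd-1",juju_model="openstack",type="lxd"} 1.0
juju_machine_state{hostname="juju-3fae13-46-lxd-3",juju_model="openstack",type="lxd"} 1.0
"""

sample_counts = count_series(SAMPLE)
for (model, machine_type), total in sorted(sample_counts.items()):
    print(f"{model} {machine_type}: {total}")
```

Feeding it the full response body from the `/metrics` endpoint instead of `SAMPLE` gives the per-model totals to compare against the expected machine counts.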

przemeklal commented 1 year ago

Shared juju status --format yaml with @agileshaw privately

przemeklal commented 1 year ago

Logs: https://private-fileshare.canonical.com/~przemeklal/p-j-e-incomplete-data.log

There are a lot of these lines:

2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,604 DEBUG - Checking machine status for None

in the affected model:

2023-02-24T08:46:05Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:05,585 DEBUG - List of machines in model 55ab2ebd-6e66-4d23-8549-aee9f03fae13: ['0', '1', '10', '11', '14', '17', '2', '22', '23', '25', '26', '27', '28', '29', '3', '30', '31', '34', '35', '37', '38', '4', '45', '46', '47', '49', '5', '53', '59', '6', '61', '66', '67', '7', '70', '71', '8', '9']
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,597 DEBUG - Finish getting machine status for model openstack
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,597 DEBUG - Checking machine status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,598 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,598 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,599 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,599 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,599 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,600 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,600 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,600 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,600 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,600 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,600 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,601 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,601 DEBUG - Checking machine status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,601 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,602 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,602 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,602 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,603 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,603 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,603 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,604 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,604 DEBUG - Checking machine status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,604 DEBUG - Checking machine status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,605 DEBUG - Checking machine status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,605 DEBUG - Checking machine status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,605 DEBUG - Checking container status for juju-3fae13-23-lxd-1
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,605 DEBUG - Adding container juju-3fae13-23-lxd-1 to cache
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,605 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,606 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,606 DEBUG - Checking container status for None
2023-02-24T08:46:06Z prometheus-juju-exporter.prometheus-juju-exporter[12693]: 2023-02-24 08:46:06,607 DEBUG - Checking container status for None

agileshaw commented 1 year ago

After further investigation, the issue is caused by python-libjuju providing malformed juju status output data. For this particular model, the hostname field is missing for some machines, and the value we normally see in hostname appears in the Instance_id field instead.

A quick fix is to add a fallback: when hostname is not present, prometheus-juju-exporter checks Instance_id instead.
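The fallback could look roughly like the sketch below. It is an illustration only, not the exporter's actual code: the helper name `resolve_hostname`, the plain-dict machine records, the lowercase key names `hostname` and `instance_id`, and the sample machine names are all assumptions.

```python
from typing import Any, Mapping, Optional


def resolve_hostname(machine: Mapping[str, Any]) -> Optional[str]:
    """Return the machine's hostname, falling back to its instance id.

    A missing key, None, and an empty string are all treated as "not present".
    Returns None when neither field carries a usable value.
    """
    return machine.get("hostname") or machine.get("instance_id") or None


# Hypothetical status records keyed by machine id.
machines = {
    "0": {"hostname": "example-host-0", "instance_id": "example-id-0"},
    "1": {"hostname": None, "instance_id": "example-host-1"},
    "2": {},
}

resolved = {machine_id: resolve_hostname(data) for machine_id, data in machines.items()}
for machine_id, name in resolved.items():
    if name is None:
        # Make the gap visible instead of silently dropping the machine.
        print(f"machine {machine_id}: no hostname or instance id")
    else:
        print(f"machine {machine_id}: {name}")
```

Reporting the machines that still resolve to `None` would make a gap like the one in this issue easier to spot than the bare `Checking machine status for None` debug lines.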