canonical / prometheus-juju-exporter

GNU General Public License v3.0

Handle "broken" Juju models #17

Closed przemeklal closed 1 year ago

przemeklal commented 1 year ago

Currently, if the exporter encounters a model in an error state (e.g. one for which juju status -m model-name throws errors), it simply crashes. Because the exporter by default tries to access every model in a Juju controller, a single broken model prevents it from exporting any data, even if all other models are fine.

Example traceback:

Jan 12 14:41:23 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:24 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unknown facade EnvironUpgrader
Jan 12 14:41:24 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:25 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unknown facade EnvironUpgrader
Jan 12 14:41:25 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:27 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unknown facade EnvironUpgrader
Jan 12 14:41:27 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: unexpected facade EnvironUpgrader found, unable to decipher version to use
Jan 12 14:41:36 juju-f5291e-3-lxd-28 prometheus-juju-exporter.prometheus-juju-exporter[21855]: 2023-01-12 14:41:36,090 ERROR - Collection job resulted in error: model cache: model "f8dc82ca-ef1b-461b-80d0-36a0f36bb910" did not appear >
Jan 12 14:41:36 juju-f5291e-3-lxd-28 systemd[1]: snap.prometheus-juju-exporter.prometheus-juju-exporter.service: Main process exited, code=exited, status=1/FAILURE
Jan 12 14:41:36 juju-f5291e-3-lxd-28 systemd[1]: snap.prometheus-juju-exporter.prometheus-juju-exporter.service: Failed with result 'exit-code'.

The same model causes issues for the juju client as well:

$ juju status -m openstack
ERROR model cache: model "f8dc82ca-ef1b-461b-80d0-36a0f36bb910" did not appear in cache timeout

A possible workaround is to create a dedicated Juju user with login access and grant it access only to the healthy models:

juju grant prometheus-juju-exporter admin controller
juju grant prometheus-juju-exporter admin model1
juju grant prometheus-juju-exporter admin model2
...

If the exporter runs as a user with the superuser access level, it will still try to access all models, resulting in the same crash.
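
For illustration only, the behaviour this issue asks for could look roughly like the sketch below: collect each model separately and skip the ones that raise, instead of letting one failure end the whole collection job. This is not the exporter's actual code. The names collect_all_models, fetch_status and fake_fetch_status are hypothetical, and fetch_status stands in for whatever call the exporter uses to query a single model.

```python
import logging
from typing import Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def collect_all_models(
    model_names: Iterable[str],
    fetch_status: Callable[[str], dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Collect status for each model, skipping models whose query fails.

    `fetch_status` is a hypothetical callable that returns the status of
    one model or raises if the model is broken.
    """
    collected: Dict[str, dict] = {}
    skipped: List[str] = []
    for name in model_names:
        try:
            collected[name] = fetch_status(name)
        except Exception as error:
            # Deliberately broad: one broken model must not stop the others.
            logger.error("Skipping model %s: %s", name, error)
            skipped.append(name)
    return collected, skipped


def fake_fetch_status(name: str) -> dict:
    """Stand-in for a real status query; fails for one model on purpose."""
    if name == "openstack":
        raise RuntimeError("simulated failure for a broken model")
    return {"model": name}


if __name__ == "__main__":
    ok, broken = collect_all_models(
        ["model1", "openstack", "model2"], fake_fetch_status
    )
    print("collected:", sorted(ok))
    print("skipped:", broken)
```

With this shape, the healthy models would still be exported, and the skipped list could be logged or exposed so that the broken model stays visible to operators.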