Nils98Ar opened 8 months ago
There is a pretty old version included at the moment. Will be better with the next OSISM release.
@berendt It's strange anyway that the issues suddenly started 7 days ago... but waiting for the next OSISM release would be okay for me.
Maybe the OpenStack Health Mon that we have been running for 12-13 days could lead to an increased metrics volume?
> Maybe the OpenStack Health Mon that we have been running for 12-13 days could lead to an increased metrics volume?
Yes, definitely. The OpenStack Exporter simply hits the API and thus the DB. OHM generates a lot of resources. Depending on the interval at which you scrape and how your control plane is equipped, this can generate a considerable load.
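To get a feel for what this means in numbers, here is a rough back-of-the-envelope sketch; the number of API calls per scrape is a made-up example value, not something measured from the exporter:

```shell
#!/bin/sh
# Rough load estimate: at a fixed scrape interval, the exporter walks
# every enabled OpenStack API on each scrape.
interval=60               # prometheus_scrape_interval in seconds (default)
api_calls_per_scrape=50   # assumption: illustrative number of API calls per scrape
echo "$interval $api_calls_per_scrape" |
  awk '{printf "~%d API calls/hour\n", 3600 / $1 * $2}'
```

Under those assumptions a single exporter already produces around 3000 control-plane API calls per hour, which is why the interval matters so much on a modestly sized control plane.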
Defaults were:
prometheus_scrape_interval: "60s"
prometheus_openstack_exporter_interval: "{{ prometheus_scrape_interval }}"
prometheus_openstack_exporter_timeout: "45s"
I've now configured this and we will see if it helps:
prometheus_openstack_exporter_interval: "195s"
prometheus_openstack_exporter_timeout: "180s"
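One thing worth keeping in mind with these overrides (just a sanity sketch, using the values from this thread): the exporter timeout has to stay below its scrape interval, otherwise a slow scrape can still be running when the next one starts.

```shell
#!/bin/sh
# Values taken from the override above.
interval=195   # prometheus_openstack_exporter_interval
timeout=180    # prometheus_openstack_exporter_timeout
if [ "$timeout" -lt "$interval" ]; then
  echo "ok: timeout ($timeout s) is below the interval ($interval s)"
else
  echo "warning: timeout must stay below the interval"
fi
```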
I'm afraid we can only document this for the time being, as it hasn't improved with the latest release of the exporter.
The GET request takes 2:20 minutes, which is longer than the scrape_timeout:
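For reference, here is one way to reproduce that measurement and compare it against the default timeout; the endpoint URL and port are assumptions, not taken from this deployment:

```shell
#!/bin/sh
# To measure a single scrape by hand (host/port are assumptions):
#   curl -s -o /dev/null -w '%{time_total}\n' http://127.0.0.1:9198/metrics
# The observed duration of ~2:20 (140 s) against the default 45 s timeout:
scrape_seconds=140
scrape_timeout=45
if [ "$scrape_seconds" -gt "$scrape_timeout" ]; then
  echo "scrape ($scrape_seconds s) exceeds scrape_timeout ($scrape_timeout s)"
fi
```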
These are the counts of "non info" log entries of the openstack_exporter container, with the entries for services that are not deployed (baremetal, container-infra, database, orchestration) excluded:
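The counting approach can be sketched like this; the sample log lines and their `level=` format are assumptions for illustration, not actual exporter output:

```shell
#!/bin/sh
# Drop "info" lines, drop services that are not deployed, then count by level.
grep -v 'level=info' <<'EOF' | grep -vE 'baremetal|container-infra|database|orchestration' | awk '{print $1}' | sort | uniq -c
level=info msg="collected compute metrics"
level=error msg="timeout collecting volume metrics"
level=error msg="timeout collecting network metrics"
level=warn msg="slow response from image API"
level=error msg="service baremetal not found"
EOF
```

With the sample input above this prints a count of 2 for `level=error` and 1 for `level=warn`; against the real container you would pipe `docker logs` output into the same filters instead of the here-document.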
At least the first one does not seem right; I am not sure about the last three.