ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

AWX Monitoring: Metrics endpoint retain old data & not updated #15333

Open bettyhey opened 1 month ago

bettyhey commented 1 month ago

Please confirm the following

Bug Summary

Is there a way to display only the currently active nodes in /api/v2/metrics? We found that the /api/v2/metrics endpoint retains old data (e.g. the same hostname with a different uuid) after a node is disconnected from AWX. The node disappears from /api/v2/instances and the AWX GUI but still shows up in /api/v2/metrics.

Is there a workaround for this issue? The goal is to monitor for nodes that lose connectivity to AWX. However, because the metrics endpoint retains old data (for at least a day or so), it is not well suited to that kind of monitoring.

AWX version

2.9.23

Select the relevant components

Installation method

docker development environment

Modifications

no

Ansible version

Irrelevant

Operating system

Irrelevant

Web browser

No response

Steps to reproduce

Disconnect a node from AWX (terminate it or shut it down), then observe the /api/v2/metrics endpoint.

Expected results

The node should be removed from the AWX metrics endpoint once it is disconnected and no longer appears in the AWX GUI or the /api/v2/instances endpoint.

Actual results

The node is not removed and old data is still shown (the same node appears with a different uuid, because it has since reconnected and was assigned a new uuid).
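The duplication described here can be checked mechanically. The following is a minimal sketch, not an AWX feature: it assumes the metrics body is in Prometheus text format and that instance-level samples carry hostname and instance_uuid labels (the label names are an assumption; adjust them to match your own output). The function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches label pairs such as hostname="node-a" inside a sample line.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def uuids_by_hostname(metrics_text):
    """Map each hostname to the set of instance UUIDs seen in the metrics text.

    Assumes samples look like:
        metric_name{hostname="...",instance_uuid="..."} value
    """
    seen = defaultdict(set)
    for line in metrics_text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        labels = dict(LABEL_RE.findall(line))
        host = labels.get("hostname")
        uuid = labels.get("instance_uuid")
        if host and uuid:
            seen[host].add(uuid)
    return seen


def stale_candidates(metrics_text):
    """Return hostnames reported under more than one UUID."""
    return {
        host: sorted(uuids)
        for host, uuids in uuids_by_hostname(metrics_text).items()
        if len(uuids) > 1
    }
```

A hostname that appears under more than one uuid is a candidate for stale data; which uuid is current still has to be confirmed against /api/v2/instances.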

Additional information

No response

thedoubl3j commented 1 month ago

@bettyhey we don't know of any workarounds for this, and it is a legitimate issue. Could you close this and open an enhancement request with similar information so that we can track it better? This is not a bug but a lack of coverage.

fosterseth commented 1 month ago

If you have a different way of detecting when nodes are down permanently, you might be able to log into a running task container and run the awx-manage deprovision_instance command to delete the node from the database; those entries should then go away.
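Where deprovisioning is not practical, one possible consumer-side approach is to drop metric samples whose uuid is no longer listed by /api/v2/instances. This is only a sketch under assumptions, not something confirmed by the maintainers: it assumes the instances response is a JSON object with a results list whose items have a uuid field, and that metric samples carry an instance_uuid label. The function names are hypothetical, and fetching the two endpoints (with authentication) is left out.

```python
import re

# Matches label pairs such as instance_uuid="..." inside a sample line.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def active_uuids(instances_payload):
    """Collect UUIDs from a parsed /api/v2/instances response (assumed shape)."""
    return {item["uuid"] for item in instances_payload.get("results", [])}


def filter_active(metrics_text, active):
    """Keep comment lines, unlabeled samples, and samples of active instances."""
    kept = []
    for line in metrics_text.splitlines():
        if not line.strip() or line.startswith("#"):
            kept.append(line)  # preserve HELP/TYPE lines and blank lines
            continue
        uuid = dict(LABEL_RE.findall(line)).get("instance_uuid")
        # Samples without an instance_uuid label are not per-node; keep them.
        if uuid is None or uuid in active:
            kept.append(line)
    return "\n".join(kept)
```

The filtered text can then be handed to whatever consumes the metrics. Note that this hides stale series rather than removing them from AWX.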