canonical / testflinger

https://testflinger.readthedocs.io/en/latest/
GNU General Public License v3.0

Add api and view for queue wait time percentiles #319

Closed by plars 3 months ago

plars commented 3 months ago

Description

Store wait times for queues, and add an API and a view on the queue_detail page that show percentiles for these wait times. The API can retrieve metrics for one or more specific queues, or for all queues at once. The API returns wait times in seconds; the view converts them to a more human-readable form (see screenshot).

[Screenshot: queue_detail page showing wait time percentiles]
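As a rough illustration of the seconds-to-readable conversion mentioned above, here is a minimal sketch. The function name `humanize_seconds` and the `1h 2m 5s` output style are assumptions made for this example; the format actually used in the view may differ.

```python
def humanize_seconds(seconds):
    """Render a duration in seconds as a compact string such as '1h 2m 5s'.

    Hypothetical helper: the name and output format are illustrative only.
    """
    total = int(round(seconds))
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)

    parts = []
    if days:
        parts.append(f"{days}d")
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    # Always show seconds when nothing else was emitted, so 0 renders as '0s'.
    if secs or not parts:
        parts.append(f"{secs}s")
    return " ".join(parts)


if __name__ == "__main__":
    print(humanize_seconds(3725))
```

Zero-valued units are skipped so that short waits stay short on the page, which is a design choice of this sketch rather than something stated in the PR.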

Resolved issues

CERTTF-333

Documentation

Added README section about the new API

Web service API changes

**[GET] /v1/queues/wait_times** - get wait time metrics, optionally for a given list of queues

- Parameters:

  - queue (array): list of queues to get wait time metrics for

- Returns:

  JSON mapping of queue names to wait time metrics

- Example:

  .. code-block:: console

    $ curl http://localhost:8000/v1/queues/wait_times?queue=foo\&queue=bar
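The same request can be made from Python. The sketch below uses only the standard library and repeats the `queue` parameter once per queue, as in the curl example. The helper names `build_wait_times_url` and `fetch_wait_times` are hypothetical, and the shape of each queue's metrics in the response is not assumed here beyond "a JSON mapping of queue names to wait time metrics".

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_wait_times_url(server, queues=()):
    """Build the wait_times URL, repeating 'queue' once per requested queue."""
    url = f"{server.rstrip('/')}/v1/queues/wait_times"
    if queues:
        # doseq=True expands the list into queue=foo&queue=bar
        url += "?" + urlencode({"queue": list(queues)}, doseq=True)
    return url


def fetch_wait_times(server, queues=()):
    """Return the decoded JSON mapping of queue names to wait time metrics."""
    with urlopen(build_wait_times_url(server, queues)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumes a server is listening locally, as in the curl example.
    print(fetch_wait_times("http://localhost:8000", ["foo", "bar"]))
```

Calling `fetch_wait_times` with no queues requests metrics for all queues.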

This adds a new database collection for queue_wait_times. For displaying a single queue, I expect this to be fast, with no noticeable effect on page load time for queue_detail. I tested the code that calculates the percentiles and it was quite fast on my system. It could be slower across thousands of queues if we frequently call the API with no queues specified (which returns all queues), but we could consider capping the number of data points per queue if this becomes a problem. The space consumed is small enough, and the performance good enough, that I've left it uncapped for now.
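To give a sense of the per-queue cost, here is a minimal sketch of a percentile calculation over stored wait times. It is not the code in this PR: the nearest-rank method, the function names, and the particular percentiles (5, 10, 50, 90, 95) are all assumptions chosen for illustration.

```python
def nearest_rank_percentile(sorted_values, pct):
    """Return the nearest-rank percentile of an already sorted list.

    pct is an integer from 0 to 100. Integer arithmetic is used for the
    ceiling so that floating-point rounding cannot shift the rank.
    """
    if not sorted_values:
        raise ValueError("no wait times recorded")
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def wait_time_percentiles(wait_times, percentiles=(5, 10, 50, 90, 95)):
    """Map each requested percentile to a wait time in seconds."""
    ordered = sorted(wait_times)  # O(n log n), the dominant cost per queue
    return {pct: nearest_rank_percentile(ordered, pct) for pct in percentiles}


if __name__ == "__main__":
    samples = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(wait_time_percentiles(samples))
```

With this approach the work per queue is dominated by one sort of that queue's data points, which is why capping the number of points per queue would bound the cost of an all-queues request.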

Tests

Tested locally and added unit tests. The wait time data won't exist until this is deployed on the server (no agent changes are needed), so data collection starts only then, and percentiles for a given queue won't display until data exists for that queue.