canonical / testflinger

https://testflinger.readthedocs.io/en/latest/
GNU General Public License v3.0

Send provision log data to the server to display on agent-details page #293

Open plars opened 1 week ago

plars commented 1 week ago

Description

This adds logging of previous provision attempts to the agent-details page. Whenever provisioning on an agent passes or fails, we log the result so that you can browse to the agent-details page and see whether the agent has typically been provisioning successfully or unsuccessfully lately. This can provide valuable insight if you are wondering whether a failure to provision a particular device is a fluke, or whether it has been failing on all of the recent attempts.

Resolved issues

CERTTF-326

Documentation

It's used internally through the agent, so nothing extra is needed from a user standpoint. However, the API documentation has been updated through the schema so that the swagger-docs are updated. The README.rst for the server was also updated with the new API.

Web service API changes

Yes, [POST] /v1/agents/provision_logs/<agent_name> was added for posting a json event log for the provision phase. I considered reusing the existing [POST] /v1/agents/data/<agent_name> API, but decided against it for two reasons:

  1. That one is for pushing general agent data such as the name, the queues it knows about, etc. This isn't likely to change and typically only gets updated when the agent comes online. Pushing provision logs will happen every time the agent attempts to provision the device, and these events do not coincide.
  2. With a NoSQL db, the typical pattern is to store all related data together, but in this case it made more sense to create a new collection for the provision_logs. This is mostly because, from an API side, we would have had a mixture of things like the name and list of queues that simply get replaced. With the provision_log, however, you will be adding to an array of log messages (capped at the 100 latest entries), or creating it if it doesn't exist. So if we used the previous API, we would have to handle the data differently depending on what type of data we received.
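
The append-or-create behaviour described in point 2 can be sketched in plain Python. This is only an illustration under assumptions, not the code in this PR: the `store` dict stands in for the new collection, and `append_provision_log` and the event fields are hypothetical names. With MongoDB, the same effect is commonly achieved with an upserting `$push` that uses `$each` and `$slice`.

```python
MAX_ENTRIES = 100  # cap mentioned in the description above


def append_provision_log(store, agent_name, event, cap=MAX_ENTRIES):
    """Append an event to an agent's provision log, keeping only the latest `cap` entries.

    `store` is a plain dict standing in for a database collection (assumption).
    """
    # Create the log list if this agent has never reported a provision event.
    log = store.setdefault(agent_name, [])
    log.append(event)
    # Drop the oldest entries once the log grows past the cap.
    if len(log) > cap:
        del log[: len(log) - cap]
    return log


# Hypothetical usage: 105 events for one agent, only the last 100 survive.
store = {}
for i in range(105):
    append_provision_log(store, "agent1", {"exit_code": i})
```

Unlike the general agent data, which is simply replaced on each update, every call here grows the list, which is why the two kinds of data would need different handling behind a single endpoint.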

I used POST for this, which is somewhat generic. We didn't get too picky with POST vs PATCH vs PUT in v1 of the API; perhaps that is something to re-examine in v2 someday. In this case, I think a good argument could be made for PATCH also, but I felt like POST was better since we're not really "patching" (or "updating") some subset of the values; instead we're adding new data to an array/list.

If we do ever choose to modify or deprecate this API, it should be pretty straightforward. It's only expected to be used by the agents, not the end user or cli.

Tests

Unit tests were added, and I also tested locally. I made some changes to the create_sample_data.py script so that it also injects some failed provision events. You can call this script against a local server setup. It will use the API on the server you point it at to create fake data and inject those provision_log events, so that you can see them show up in the db and UI.
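
For anyone poking at the new endpoint by hand, a request to it could be built roughly like this. This is a sketch under assumptions and is not taken from create_sample_data.py: the server address, the agent name, and the payload fields (`exit_code`, `detail`) are hypothetical, and the real schema is whatever the updated API documentation specifies. Only the path comes from the description above.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_provision_log_request(server, agent_name, event):
    """Build (but do not send) a POST to /v1/agents/provision_logs/<agent_name>."""
    url = f"{server.rstrip('/')}/v1/agents/provision_logs/{quote(agent_name, safe='')}"
    body = json.dumps(event).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical server address, agent name, and event fields.
req = build_provision_log_request(
    "http://localhost:5000",
    "agent-001",
    {"exit_code": 1, "detail": "provision_fail"},
)
```

Passing `req` to `urllib.request.urlopen` would send it to a locally running server.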