fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Leverage new `POST /fleet/automations/reset` in the UI policy modal to run automations for previously failing hosts #9054

Open lukeheath opened 1 year ago

lukeheath commented 1 year ago

Goal

As a Fleet user that just turned on automations for a policy, I expect all failing hosts to be included in a webhook payload. This way, I can send emails to all my hosts and notify end users to remediate.

Problem

Today, if a user turns on automations for a policy and several hosts are already failing, these hosts will not be included in a webhook request (or ticket).

An API endpoint is available to run automations for previously failing hosts at POST /fleet/automations/reset. However, we are unsure when this endpoint should be called, and with what parameters. Need clarity before moving forward with frontend implementation.


noahtalerman commented 1 year ago

Noah: After talking to users, the expected behavior is for all failing hosts to be included in a webhook when I turn automations on for a policy.

Noah: We haven't discussed expected behavior when I turn automations off and then back on. Are all failing hosts included in the webhook? I think yes.

RachelElysia commented 1 year ago

@noahtalerman This made it to my plate, but I'm unsure what I should be building. What is the user journey?

A user opens the policy automation modal to set a policy automation. On clicking "Save," the user expects all hosts, including already-failing hosts, to be part of the newly saved policy automation? Therefore, the frontend needs to send a POST request to /fleet/automations/reset on save?

...Is that correct?

noahtalerman commented 1 year ago

> What is the user journey?

User story: As a Fleet user that just turned on automations for a policy, I expect all failing hosts to be included in a webhook payload. This way, I can send emails to all my hosts and notify end users to remediate.

So, I think the user journey in the UI looks like this: Fleet user creates a policy => some hosts start to fail the policy => Fleet user turns on automations for the policy and clicks "Save" => all failing hosts are included in a webhook payload.

noahtalerman commented 1 year ago

@RachelElysia here's how I imagine we'd accomplish the user story in the above comment: When the user clicks "Save," the frontend sends a request to reset automations for any policies that went from automations "Off" to automations "On."

Are we able to do that? What do you think?
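The "Off" to "On" check described above could be sketched as a simple list difference. This is only an illustration under stated assumptions: `newlyEnabledPolicyIds` is a hypothetical helper, and it assumes the modal knows the list of policy IDs with automations enabled before and after the user clicks "Save."

```typescript
// Hypothetical helper, not part of Fleet's codebase.
// `before` and `after` are the policy IDs with automations enabled
// before and after the user clicks "Save".
function newlyEnabledPolicyIds(before: number[], after: number[]): number[] {
  // Keep only the IDs that are enabled now but were not enabled before.
  return after.filter((id) => before.indexOf(id) === -1);
}

// Policies 3 and 4 went from "Off" to "On"; policy 1 was turned off.
const toReset = newlyEnabledPolicyIds([1, 2], [2, 3, 4]);
console.log(toReset); // [3, 4]
```

Under this sketch, a policy that was turned off in one save and back on in a later save would appear in the difference again, which matches the "I think yes" answer earlier in the thread.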

RachelElysia commented 1 year ago

@lucasmrod

Context: I started working on a frontend fix for this as specced, but the conversation led to confirming that this can also be changed via fleetctl, so the fix should be on the backend.

Backend change:

  1. Global failing policies webhook change: When there is a POST request to fleet/config that changes config.webhooksettings.failing_policies_webhook.policy_ids, it should automatically send the newly added policy_ids (the difference) to whatever fleet/automations/reset is doing

  2. Team failing policies webhook change: When there is a POST request to team/${id} that changes team.webhooksettings.failing_policies_webhook.policy_ids, it should automatically send the newly added policy_ids (the difference), along with the team_id, to whatever fleet/automations/reset is doing

That way this is not a frontend fix that only works in the UI but not when a user is updating policy automations using fleetctl.
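The two cases above could be sketched roughly as follows. This is a minimal illustration of the logic only, not the actual backend implementation: `resetRequestForUpdate` and `ResetRequest` are hypothetical names, and the `policy_ids` / `team_ids` field names are an assumption about what the reset endpoint accepts that should be checked against the REST API docs.

```typescript
// Hypothetical request body for the reset endpoint; field names are assumed.
interface ResetRequest {
  policy_ids: number[];
  team_ids?: number[];
}

// Given the policy_ids stored before and after a config or team update,
// build the payload to hand to the reset logic, or null if nothing was added.
function resetRequestForUpdate(
  oldIds: number[],
  newIds: number[],
  teamId?: number
): ResetRequest | null {
  // "The difference": IDs present after the update but not before it.
  const added = newIds.filter((id) => oldIds.indexOf(id) === -1);
  if (added.length === 0) {
    return null; // No newly enabled policies, so there is nothing to reset.
  }
  const request: ResetRequest = { policy_ids: added };
  if (teamId !== undefined) {
    request.team_ids = [teamId]; // Team case: include the team_id.
  }
  return request;
}

// Global case: policies 2 and 3 were newly added.
console.log(JSON.stringify(resetRequestForUpdate([1], [1, 2, 3])));
// Team case: policy 5 was newly added for a team with the example ID 7.
console.log(JSON.stringify(resetRequestForUpdate([4], [4, 5], 7)));
```

Running the difference on the server side means the same behavior applies whether the update comes from the UI or from fleetctl.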

Original backend ticket: https://github.com/fleetdm/fleet/issues/7808

API built for resetting failing policies so that hosts are checked again: https://github.com/fleetdm/fleet/blob/main/docs/Using-Fleet/REST-API.md#run-automation-for-all-failing-hosts-of-a-policy

lucasmrod commented 1 year ago

@RachelElysia OK, I believe I get the idea.

Q: Do we still want (need) to keep /api/latest/fleet/automations/reset as part of our public API?

zayhanlon commented 1 year ago

@lucasmrod If you have time - prepare to have this estimated for tomorrow?