Open lukeheath opened 1 year ago
Noah: After talking to users, the expected behavior is for all failing hosts to be included in a webhook when I turn automations on for a policy.
Noah: We haven't discussed expected behavior when I turn automations off and then back on. Are all failing hosts included in the webhook? I think yes.
@noahtalerman This made it to my plate, but I'm unsure what I should be building. What is the user journey?
A user goes to set a policy automation in the policy automation modal. When clicking "Save," does the user expect all currently failing hosts to be included in the newly saved policy automation? Therefore, should the frontend send a POST request to /fleet/automations/reset on save?
...Is that correct?
What is the user journey?
User story: As a Fleet user who just turned on automations for a policy, I expect all failing hosts to be included in a webhook payload. This way, I can send emails to all my hosts and notify end users to remediate.
So, I think the user journey in the UI looks like this: Fleet user creates a policy => some hosts start to fail the policy => Fleet user turns on automations for the policy and clicks "Save" => all failing hosts are included in a webhook payload.
@RachelElysia here's how I imagine we'd accomplish the user story in the above comment: When the user clicks "Save," the frontend sends a request to reset automations for any policies that went from automations "Off" to automations "On."
Are we able to do that? What do you think?
@lucasmrod
Context: I started working on a frontend fix for this as specced, but the conversation led to confirming that this can also be changed via fleetctl, so the fix should be on the backend.
Backend change:

Global failing policies webhook change: When there is a POST request to fleet/config, specifically config.webhooksettings.failing_policies_webhook.policy_ids, it should automatically send the new policy_ids (the difference) to whatever fleet/automations/reset is doing.

Team failing policies webhook change: When there is a POST request to team/${id}, specifically team.webhooksettings.failing_policies_webhook.policy_ids, it should automatically send the new policy_ids along with the team_id (the difference) to whatever fleet/automations/reset is doing.
That way, the fix works not only in the UI but also when a user updates policy automations using fleetctl.
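The "difference" computation described above can be sketched in Go. This is illustrative only; the function name and signature are not from the Fleet codebase:

```go
package main

import "fmt"

// newlyEnabledPolicyIDs returns the policy IDs present in newIDs but not in
// oldIDs, i.e. the policies whose automations just went from "Off" to "On".
// (Hypothetical helper; names are illustrative, not from the Fleet codebase.)
func newlyEnabledPolicyIDs(oldIDs, newIDs []uint) []uint {
	old := make(map[uint]struct{}, len(oldIDs))
	for _, id := range oldIDs {
		old[id] = struct{}{}
	}
	var added []uint
	for _, id := range newIDs {
		if _, ok := old[id]; !ok {
			added = append(added, id)
		}
	}
	return added
}

func main() {
	// Before the request, automations were on for policies 1 and 2;
	// the new config enables them for 1, 2, and 3.
	fmt.Println(newlyEnabledPolicyIDs([]uint{1, 2}, []uint{1, 2, 3})) // prints [3]
}
```

The same helper would serve both the global (fleet/config) and team (team/${id}) paths, with the team path also forwarding the team_id alongside the computed difference.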
Original backend ticket: https://github.com/fleetdm/fleet/issues/7808
API built for resetting failing policies so hosts are checked again: https://github.com/fleetdm/fleet/blob/main/docs/Using-Fleet/REST-API.md#run-automation-for-all-failing-hosts-of-a-policy
@RachelElysia OK, I believe I get the idea.
Q: Do we still want (need) to keep /api/latest/fleet/automations/reset as part of our public API?
@lucasmrod If you have time, could you prepare this to be estimated tomorrow?
Goal
As a Fleet user who just turned on automations for a policy, I expect all failing hosts to be included in a webhook payload. This way, I can send emails to all my hosts and notify end users to remediate.
Problem
Today, if a user turns on automations for a policy and several hosts are already failing, these hosts will not be included in a webhook request (or ticket).
An API endpoint is available to run automations for previously failing hosts: POST /fleet/automations/reset. However, we are unsure when this endpoint should be called and with what parameters. We need clarity before moving forward with the frontend implementation.

Related
#7808 (backend)