fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Automatically run script on policy failure #17129

Open dherder opened 4 months ago

dherder commented 4 months ago

Goal

User story
As a Fleet user,
I want Fleet to automatically run a script on a host when it fails a policy
so that I can automate host compliance w/o having to use a third-party automation tool (ex. Tines).

Context

Changes

Product

Engineering

ℹ️  Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

Risk assessment

Manual testing steps

  1. Step 1
  2. Step 2
  3. Step 3

Testing notes

Confirmation

  1. [ ] Engineer (@____): Added comment to user story confirming successful completion of QA.
  2. [ ] QA (@____): Added comment to user story confirming successful completion of QA.

noahtalerman commented 4 months ago

I would like to execute a script automatically when a policy fails instead of triggering a webhook.

@dherder we'll get to this but I think there's an iteration or two before we build it.

Currently, the customer can consume the failing policies webhook in Tines and execute a script using the Fleet API, right?

I think the first iteration will be sending one webhook per host that includes all of that host's failing policies. I think this simplifies the Tines story. The Tines story becomes this:

  1. Receive new webhook that includes a specific host's failing policies
  2. Loop through policies and take remediation action specific to each failing policy (via script or some other tool)
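
The two steps above can be sketched roughly as follows. This is only an illustration: the payload shape (`host`, `failing_policies`), the `plan_remediations` helper, and the script name are hypothetical and not taken from Fleet's webhook format. The point is that a per-host webhook lets the automation tool loop once and choose a remediation per failing policy.

```python
from typing import Dict, List

# Hypothetical payload: one webhook per host, listing that host's failing policies.
payload = {
    "host": {"id": 42, "hostname": "example-host"},
    "failing_policies": [
        {"id": 1, "name": "Disk encryption enabled"},
        {"id": 2, "name": "Firewall enabled"},
    ],
}

# Hypothetical mapping from policy ID to the script that remediates it.
REMEDIATIONS: Dict[int, str] = {1: "enable-disk-encryption.sh"}


def plan_remediations(payload: dict, remediations: Dict[int, str]) -> List[dict]:
    """Return one action per failing policy that has a mapped script."""
    host_id = payload["host"]["id"]
    actions = []
    for policy in payload["failing_policies"]:
        script = remediations.get(policy["id"])
        if script is None:
            continue  # no remediation configured for this policy
        actions.append(
            {"host_id": host_id, "policy_id": policy["id"], "script": script}
        )
    return actions


actions = plan_remediations(payload, REMEDIATIONS)
```

Policies without a mapped script are skipped here, so they could still be handled by some other tool.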

dherder commented 4 months ago

@noahtalerman would also be good to get a Fleet desktop notification on failed policies similar to https://github.com/fleetdm/fleet/issues/16264

noahtalerman commented 4 months ago

would also be good to get a Fleet desktop notification on failed policies

@dherder the current plan is to solve the problem of notifying the end user by getting in their calendar: #17230

dherder commented 4 months ago

@noahtalerman I see the calendar remediation as a separate issue. It works great when you want an end user to do a thing like update an app or perform an OS update. Where it doesn't work so great is if you want the remediation to be "execute a root level script", where if the user is a standard user, they just simply wouldn't be able to do it.

noahtalerman commented 4 months ago

Where it doesn't work so great is if you want the remediation to be "execute a root level script", where if the user is a standard user, they just simply wouldn't be able to do it.

@dherder I think the first iteration of "Fleet in your calendar" will address this.

The high level flow of the feature:

  1. IT admin chooses which policies trigger calendar events
  2. Calendar event is created when end user fails at least one of these policies
  3. Webhook is fired when the calendar event starts
  4. Automation tool (ex. Tines) receives the webhook and runs auto-remediation (ex. script)

Check out the user story for more details on the flow: https://github.com/fleetdm/fleet/issues/17230

What do you think?

Also, we didn't have room for this "Auto remediation of policy failure" story in the current design sprint (4.48).

nonpunctual commented 2 months ago

@noahtalerman it still does not solve the problem of requiring a 3rd party integration, which is a blocker for some of our current customers and especially for prospective customers.

The expectation is that if Fleet has the script server-side & Fleet has a policy to check for a client state or attribute, that it would also have a way of executing the script on a policy failure without 3rd party integration required.

Couldn't Fleet just send the policy failure webhook to its own API endpoint for executing a script? Is there a technical concern like load on server due to script execution? Thanks.
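
To make the question concrete, here is a minimal sketch of what "call the script API on policy failure" could look like from the outside. The endpoint path (`/api/v1/fleet/scripts/run`) and the body fields (`host_id`, `script_id`) are assumptions that should be checked against the Fleet REST API docs, and `build_run_script_request` is a hypothetical helper. The request is only built, never sent.

```python
import json
import urllib.request


def build_run_script_request(
    base_url: str, api_token: str, host_id: int, script_id: int
) -> urllib.request.Request:
    """Build (but do not send) a request asking Fleet to run a saved script on a host."""
    # Assumed body fields; verify against the Fleet REST API docs.
    body = json.dumps({"host_id": host_id, "script_id": script_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/fleet/scripts/run",  # assumed endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


# Example values are placeholders.
req = build_run_script_request("https://fleet.example.com", "TOKEN", 42, 7)
```

Sending it would be `urllib.request.urlopen(req)`. Today this is the call a third-party tool makes after receiving the failing policies webhook.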

cc @dherder @willmayhone88 @spokanemac @ksatter @pacamaster

dherder commented 2 months ago

@noahtalerman I presented the option of remediation through 3rd party automation tools today (IT buying scenario) and the feedback was that it would be a blocker to moving forward with Fleet.

noahtalerman commented 2 months ago

Couldn't Fleet just send the policy failure webhook to its own API endpoint for executing a script? Is there a technical concern like load on server due to script execution? Thanks.

@nonpunctual no technical concern that I know of. It's just a matter of priorities/timing. Let's chat about it at feature fest!

nonpunctual commented 2 months ago

csa:20240530

spokanemac commented 1 month ago

I'd like to see something like this with a drop-down next to each policy.

image

noahtalerman commented 3 weeks ago

Hey @dherder I updated this issue to user story format and moved your original issue description below for safekeeping. cc @marko-lisica


Problem

When a policy fails, Fleet can currently send a webhook that reports the policy's failures. Fleet can also provide guidance for the end user when a policy fails via Fleet Desktop.

Since we now have script execution capabilities, as an IT admin, I would like to execute a script automatically when a policy fails instead of triggering a webhook.

Potential solutions

In the automations dialog, have an extra option to "Run script".

marko-lisica commented 1 week ago

Hey @zayhanlon and @dherder, we're dropping this one. The plan is to bring this one to the design sprint after the next. For more context see this doc.