fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Prioritize lock & wipe over other queued activities #19534

Open nonpunctual opened 1 month ago

nonpunctual commented 1 month ago

State:

Scenarios:

  1. Issue lock command / script:

    • other scripts may be queued ahead of the lock script
    • because of this the lock script does not run immediately
  2. Lock script can prevent other queued scripts from completing

    • In this case, admins currently have to manually update each script in the queue with a "fake" exit code
    • This causes Fleet to treat the queued scripts as executed & completed

This issue is related to: #17150 - For remote lock capability on Windows to shut down the machine, add manage-bde forcerecovery

Problem

Windows lock & wipe commands are not prioritized over other queued Fleet activities

Potential solutions

  1. Develop Windows lock & wipe commands as MDM with full macOS feature parity
  2. Allow Fleet admins to designate a priority order for Fleet activities, or at least for scripts.
  3. Set Windows lock & wipe commands to automatically jump to top of Activity queue.

noahtalerman commented 1 month ago

Thanks for tracking this @nonpunctual!

We'll weigh it at the upcoming feature fest on 2024-06-06.

noahtalerman commented 3 weeks ago

Noah: There's a new workflow for this once we ship cancel scripts, because the product could cancel all pending scripts when a lock is issued.

cc @nonpunctual

valentinpezon-primo commented 3 weeks ago

@noahtalerman @nonpunctual

Hmm, it means we should have a way to know which scripts to cancel, i.e. which scripts are still in the queue. Idk if that's easily doable right now?

Also, in a case where you want to lock the device, you may still want to execute the scripts that were in the queue beforehand. That would mean:

Kinda painful...

Having security commands (i.e. lock and wipe) at the top of the queue every time would solve all of the above issues imo. It would also be on par with the Apple way of doing things, which makes sense for an MDM solution I think 🤔

samleb commented 3 weeks ago

I concur 👍 A cancel endpoint could be of some interest as a feature, but it does not solve the issue we were raising here. Locking/wiping is something you may have to do in a critical/urgent situation, which sounds completely incompatible with first canceling enqueued scripts (not to mention the point @valentinpezon-primo is making about having to reschedule them later on).

This problem reminds me of how some queuing systems are designed in critical environments:

Having something similar (maybe only a critical priority for lock/wipe) sounds like the most decoupled way of dealing with the issue at hand.
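To make the "critical priority" idea concrete, here is a minimal sketch of a per-host queue in which lock/wipe always runs before anything else, while the remaining activities keep their original order. It is not Fleet's implementation: the `ActivityQueue` class, the two priority levels, and the script names are all hypothetical, chosen only for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical priority levels; a lower number runs first.
CRITICAL = 0  # lock / wipe
NORMAL = 1    # everything else (scripts, software installs, ...)


@dataclass(order=True)
class QueuedActivity:
    priority: int
    seq: int                          # insertion counter, keeps FIFO order within a priority
    name: str = field(compare=False)  # not used for ordering


class ActivityQueue:
    """Toy per-host queue: critical items first, FIFO within each priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def enqueue(self, name, critical=False):
        priority = CRITICAL if critical else NORMAL
        heapq.heappush(self._heap, QueuedActivity(priority, next(self._counter), name))

    def next_activity(self):
        # Returns the name of the next activity to run, or None if the queue is empty.
        return heapq.heappop(self._heap).name if self._heap else None


queue = ActivityQueue()
queue.enqueue("install-updates.ps1")
queue.enqueue("collect-logs.ps1")
queue.enqueue("lock", critical=True)  # issued last, runs first

order = []
while (name := queue.next_activity()) is not None:
    order.append(name)
print(order)  # ['lock', 'install-updates.ps1', 'collect-logs.ps1']
```

With this shape nothing has to be canceled or rescheduled: the previously queued scripts stay in the queue and simply run after the lock, in the order they were submitted.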

noahtalerman commented 3 weeks ago

Thanks for the feedback @valentinpezon-primo and @samleb!

Having security commands (i.e. lock and wipe) at the top of the queue every time would solve all of the above issues imo. It would also be on par with the Apple way of doing things, which makes sense for an MDM solution I think

I agree it would make sense. It's a matter of time before we get to it.

we should have a way to know which scripts to cancel, i.e. which scripts are still in the queue. Idk if that's easily doable right now?

In the meantime, when we ship the ability to cancel scripts, this is the API endpoint to get a list of upcoming activities (includes queued scripts) for hosts: https://fleetdm.com/docs/rest-api/rest-api#get-hosts-upcoming-activity.

I think there will be some API endpoint to cancel scripts by execution uuid.
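As a rough sketch of the first half of that workflow, the snippet below picks the queued script executions out of an upcoming-activities response. Everything specific here is an assumption to check against the linked docs: the URL path, the `activities` / `type` / `details.script_execution_id` field names, and the sample payload (which is made up). The cancel call is left out because that endpoint does not exist yet.

```python
def upcoming_activities_url(base_url, host_id):
    # Path assumed from the docs page linked above; verify it before relying on it.
    return f"{base_url.rstrip('/')}/api/v1/fleet/hosts/{host_id}/activities/upcoming"


def queued_script_execution_ids(payload):
    """Return execution IDs of queued scripts from a parsed JSON response.

    Assumes each activity looks like
    {"type": "ran_script", "details": {"script_execution_id": "..."}}.
    """
    ids = []
    for activity in payload.get("activities", []):
        if activity.get("type") != "ran_script":
            continue  # skip software installs and other activity types
        execution_id = activity.get("details", {}).get("script_execution_id")
        if execution_id:
            ids.append(execution_id)
    return ids


# Made-up sample payload standing in for a real API response.
sample = {
    "activities": [
        {"type": "ran_script", "details": {"script_execution_id": "exec-1"}},
        {"type": "installed_software", "details": {"install_uuid": "inst-1"}},
        {"type": "ran_script", "details": {"script_execution_id": "exec-2"}},
    ]
}

print(upcoming_activities_url("https://fleet.example.com", 42))
print(queued_script_execution_ids(sample))
```

A client would then have to call the future cancel endpoint once per returned ID before issuing the lock, which is the extra logic the comments above describe as painful.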

noahtalerman commented 3 weeks ago

cc @marko-lisica ^

zayhanlon commented 2 weeks ago

internal note: cancel scripts doesn't work for customer-preston because it would require a full build of a cancel flow in their product as well

nonpunctual commented 2 weeks ago

Thanks @zayhanlon

@noahtalerman @marko-lisica another way of saying this is that the customer would have to build the logic of:

This isn't impossible, but it's a heavy lift. Because we are creating the lock & wipe events in the activity feed, it seems to me it would be easier to just add a rule that those events always get set as the next action, ahead of anything else, when issued. I don't think there is a use case where admins would not want that behavior.

noahtalerman commented 1 week ago

Thanks for relaying the feedback @zayhanlon and @nonpunctual.

Makes sense.

noahtalerman commented 1 week ago

Windows lock & wipe commands are not prioritized over other queued Fleet activities

Hey @mna just checking, do we already have the ability to prioritize specific scripts/MDM commands over others in the queue? Or is queue orchestration still a TODO?

mna commented 1 week ago

@noahtalerman we don't, it's still a TODO. The queue is currently "read-only" (we can neither cancel nor prioritize), and order is not enforced between different "action types" (e.g. scripts vs. software installs).