fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com
Other
3.01k stars 418 forks source link

Add fleetctl command to retrieve list of recent MDM commands #11008

Closed lukeheath closed 1 year ago

lukeheath commented 1 year ago

User story

As an IT admin, I want to be able to retrieve a list of recent MDM commands so that I can retrieve the command ID and retrieve the results of that command

Requirements

Design

CLI usage

Add a fleetctl get mdm-commands command
$ fleetctl get mdm-commands
Errors
mna commented 1 year ago

@noahtalerman There's currently no pagination/limit provided as argument to the command, and the history of executed commands will keep growing (AFAIK we never delete the list of commands at the moment, though we definitely should at some point).

How about I limit the output to the 1000 most recent commands (most recent first)?

mna commented 1 year ago

@noahtalerman As you recall from the previous discussion, an MDM command may target multiple hosts. With this in mind, the output cannot really be as spec'd:

ID (command id), TIME, TYPE, STATUS, and HOSTNAME columns.

Namely, the status and hostname columns are host-specific, so a single command could have multiple values for those columns. Should I just output ID, TIME (of the command's execution - when it was enqueued, it will have different timestamps for actual execution on the affected hosts), and TYPE?

noahtalerman commented 1 year ago

the status and hostname columns are host-specific, so a single command could have multiple values for those columns

@mna hmm, let's say Fleet runs an InstallProfile command on 3 hosts.

Does this^ create 3 entries in the commands table in our DB? It sounds like there's 1 entry in the DB with references to 3 hosts (or something like that)...

If it's the latter, in this scenario, my understanding is if the user runs fleetctl get-mdm-command-results they see 3 rows (one for each host) with ID (command id), TIME, TYPE, STATUS, HOSTNAME, and RESULTS columns.

Is that right?

and TYPE?

@mna regarding TYPE, this is the name of the command (Apple's name). This would be InstallProfile in the above scenario.

mna commented 1 year ago

@noahtalerman Ok, yes we can list one row per command-host combination, I thought we wanted one row per command which is why I mentioned this alternative output.

In case you missed it in your notifications, I also have this question regarding limiting the output: https://github.com/fleetdm/fleet/issues/11008#issuecomment-1503313755

noahtalerman commented 1 year ago

we can list one row per command-host combination

@mna yes we want this.

Then, if the user runs fleetctl get mdm-command-results --id=<command-id> they'll only see results for the one command-host combo.

My understanding is that this approach makes it likely that user run commands (ex. lock) will be buried in a list of Fleet run commands (like pushing profiles). I think we can solve this in a later story (see below).

This approach leads me to these question:

Regarding limiting the output, I think 1,000 most results commands is reasonable. For my understanding, is this driven by a performance concern or a UX concern?

mna commented 1 year ago

@noahtalerman

My understanding is that this approach makes it likely that user run commands (ex. lock) will be buried in a list of Fleet run commands (like pushing profiles). I think we can solve this in a later story

:+1: I agree.

how difficult will it be to add filtering

I think that'd be a great feature. Shouldn't be too complex, I'll make sure the initial implementation makes that reasonably simple to add later.

how difficult will it be to aggregate commands to one command to multiple hosts later?

You mean to output only one row per command and drop the host-specific columns? It'd probably be a bit more complex, but again, nothing that can't be done in a reasonable number of points estimate (I'd guess < 5).

I think 1,000 most results commands is reasonable. For my understanding, is this driven by a performance concern or a UX concern?

I'd say both, as there's diminishing value to have to scroll through endless rows, but yeah mostly driven by performance - we don't want to hit the DB with a query scans thousands and thousands of rows, and that in turn results in a huge payload to return via the API call, so it hurts the performance of the system in multiple ways (DB, Fleet server, network).

noahtalerman commented 1 year ago

mostly driven by performance - we don't want to hit the DB with a query scans thousands and thousands of rows, and that in turn results in a huge payload to return via the API call, so it hurts the performance of the system in multiple ways (DB, Fleet server, network).

@mna got it thanks 👍 returning the 1,000 latest commands sounds good to me.

output only one row per command and drop the host-specific columns?

That's right.

Re adding features later: good to hear current approach won't make it too difficult to come back to these later.

mna commented 1 year ago

@lukeheath @georgekarrv just a heads-up (for planning purposes) that this is likely more a 3 or 5 than a 2 (in addition to the CLI command itself, it needs a new API endpoint, authorization is tricky for mdm commands, a new service and datastore method, documentation for the new endpoint, and tests for all those layers + integration).

mna commented 1 year ago

@noahtalerman heads-up that I added "observer" to that list, as we've decided that in fleetctl get mdm-command-results we wanted to allow observers to view results, I assume we want that too for listing commands:

Lists the commands the user has permission to view. If the user is a team maintainer or team admin or team observer, only show the list of commands that were run on hosts that belong to the team(s) they're on

noahtalerman commented 1 year ago

I added "observer" to that list, as we've decided that in fleetctl get mdm-command-results we wanted to allow observers to view results, I assume we want that too for listing commands

@mna agreed! Makes sense to me.

mna commented 1 year ago

@noahtalerman @gillespi314 Follow-up on what we discussed during scrum, for commands without results yet, nanomdm stores them in a queue (different table than the results), so to list them as part of the command output, the query needs to be a bit more complex, I'll update the PR.

mna commented 1 year ago

ooh there's a view that exists in nanomdm that makes that query much nicer!

mna commented 1 year ago

@noahtalerman @gillespi314 manually tested with the latest change to the PR, enqueuing an MDM command on an offline host properly lists the command (with empty status):

$ ./build/fleetctl get mdm-commands
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+
|                  ID                  |         TIME         |           TYPE           |    STATUS    |        HOSTNAME        |
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+
| 024fb3b9-cd8a-40a6-8dd7-6c155f488fd1 | 2023-04-12T18:16:32Z | InstalledApplicationList |              | Martins-iMac-Pro.local |
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+
| 87dc6325-8bc0-4fc8-9a2f-3901c535456e | 2023-04-12T18:15:01Z | InstallProfile           | Acknowledged | Martins-iMac-Pro.local |
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+

(and properly updates once the host comes back online):

$ ./build/fleetctl get mdm-commands
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+
|                  ID                  |         TIME         |           TYPE           |    STATUS    |        HOSTNAME        |
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+
| 024fb3b9-cd8a-40a6-8dd7-6c155f488fd1 | 2023-04-12T18:19:10Z | InstalledApplicationList | Acknowledged | Martins-iMac-Pro.local |
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+
| 87dc6325-8bc0-4fc8-9a2f-3901c535456e | 2023-04-12T18:15:01Z | InstallProfile           | Acknowledged | Martins-iMac-Pro.local |
+--------------------------------------+----------------------+--------------------------+--------------+------------------------+
noahtalerman commented 1 year ago

ooh there's a view that exists in nanomdm that makes that query much nicer!

Woo!

enqueuing an MDM command on an offline host properly lists the command (with empty status)

@mna nice. Is the status actually empty? Does nanomdm not give us a status? It looks like the MDM protocol has an Idle status for when the device is offline:

Screenshot 2023-04-13 at 1 13 13 PM

I think the status in our command's output should be one of the above, right?

mna commented 1 year ago

@noahtalerman The Idle status exists at the MDM protocol level, but nanomdm explicitly ignores it (it doesn't store that status when it receives an "Idle" response), so the empty status is its equivalent of "no status" (for reference: https://github.com/fleetdm/nanomdm/blob/main/storage/mysql/queue.go#L108-L110).

Of course we can decide that if the status is empty at the database level, we return it as "Idle" in the command's output and API endpoint's response if you think that's what the user would expect to see.

noahtalerman commented 1 year ago

the Idle status exists at the MDM protocol level, but nanomdm explicitly ignores it

@mna oh interesting...do you know why?

I think the user would expect to see the MDM protocol level status. So, in this "no status" case I think it makes sense to display Idle.

Somewhat related: Are we displaying error messages (or storying them at database level) ?

I noticed that the specs don't specify what to do with the error details for a command: Screenshot 2023-04-17 at 10 51 22 AM

gillespi314 commented 1 year ago

These are the values reported by the device (i.e. from the perspective of the device rather than the server) so I believe that Idle means that the device is online and had nothing to report when it checked in. An offline host would never report results.

mna commented 1 year ago

@noahtalerman

do you know why?

No, it seems to have been there since the initial implementation of nanomdm. Maybe just as a simplification step, since "Idle" does not report anything more than the empty status AFAICT.

I think the user would expect to see the MDM protocol level status. So, in this "no status" case I think it makes sense to display Idle.

Ok, I'll update the PR to return "Idle" when the status is empty. EDIT: let's wait and confirm that's actually what we want, @gillespi314 makes a good point in the comment above.

Are we displaying error messages (or storying them at database level) ?

The nanomdm package doesn't do anything with the error chain, nor does it store it in the db. From what I can see, in Fleet we do store the error chain in the details field of host_mdm_apple_profiles for InstallProfile and RemoveProfile commands only.

gillespi314 commented 1 year ago

For the bootstrap package (which is an InstallEnterpriseApplicaton command under the hood), I believe we are reporting the empty status as "Pending"

noahtalerman commented 1 year ago

For the bootstrap package...I believe we are reporting the empty status as "Pending"

@mna @gillespi314 yep. For the bootstrap package UI/UX, we're adding adding a layer of abstraction, on top of the MDM protocol, to make it easier for the IT admin to understand what happened.

For the MDM commands UX, we want to display the raw status/output we get from the MDM protocol. So "Acknowledged," "Error," "CommandFormatError," "Idle," and "NotNow" for statuses.

noahtalerman commented 1 year ago

The nanomdm package doesn't do anything with the error chain, nor does it store it in the db.

@mna ok, good to know. How difficult would it be to display errors? I think we want to address this in a later pass.

gillespi314 commented 1 year ago

@noahtalerman I think that still leaves us with the question of what to report for an offline host (the empty case) nothing has been reported back from the device yet.

noahtalerman commented 1 year ago

still leaves us with the question of what to report for an offline host (the empty case) nothing has been reported back from the device yet

@mna hmm, for the fleetctl get mdm-commands response, we would show "Idle" for this case right?

mna commented 1 year ago

@noahtalerman

How difficult would it be to display errors? I think we want to address this in a later pass

Ideally this would be at the nanomdm level so that it is stored as part of storing the command results. This would mean our fork would diverge a bit further from upstream, don't know how hard we want to avoid this. But assuming we'd be ok with that, it'd be an easy thing to address, and we could at the same time store the "Idle" status properly if we wanted to, instead of ignoring it.

Regarding the status to display (empty vs Idle), as @gillespi314 mentions above, those are semantically different statuses - empty is the status that is initially assigned to a host for a command that it never received yet, while "Idle" is returned by the host once it has been asked to execute it (though I'm not sure why it'd return "Idle" and not actually execute the requested command? but regardless...).

I don't know if it's an important distinction from a user's perspective, though? In both cases, we haven't received a response. But if we do want to treat them differently, then it's a problem, as nanomdm ignores the "Idle" status atm.

noahtalerman commented 1 year ago

During MDM standup on 2023-04-17 we decided to display "pending" status when a command in fleetctl get mdm-commands is hasn't run yet.

Options considered:

Reasoning:

cc @gillespi314 @georgekarrv @mna

ghernandez345 commented 1 year ago

QA looks good. tested with multiple commands and roles.

noahtalerman commented 1 year ago

Hey @mna when you get the chance, can you please write docs for how to get a list of recent MDM commands.

Here's a good place for these docs: https://fleetdm.com/docs/using-fleet/mdm-commands#custom-commands

fleet-release commented 1 year ago

Fleetctl commands, MDM list now retrieved, Clarity for all.

zhumo commented 1 year ago

C&C: Needs documentation in guide. Need to show how you can get a list of mdm commands that was sent.

noahtalerman commented 1 year ago

Hey @mna! Reminder for docs. When you get the chance, can you please help write docs for how to get a list of recent MDM commands.

Here's a good place for these docs: https://fleetdm.com/docs/using-fleet/mdm-commands#custom-commands

mna commented 1 year ago

@noahtalerman Apologies and thanks for the reminder! This fell off my radar, lost in the github notifications emails. Will prepare a PR today.

mna commented 1 year ago

@noahtalerman documentation PR is ready for review here: https://github.com/fleetdm/fleet/pull/12006

noahtalerman commented 1 year ago

Docs for this feature are merged! Here's the PR: https://github.com/fleetdm/fleet/pull/12006

fleet-release commented 1 year ago

Fleetctl gathers commands, A gentle stream of knowledge, Admins sail with ease.