fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Update host MDM profile status to pending in response to triggering events #10122

Closed lukeheath closed 1 year ago

lukeheath commented 1 year ago

Goal

As a Fleet admin, I want to know the current status of MDM profiles applied to my hosts. User-driven events, such as assigning hosts to teams, adding new MDM profiles, or deleting old MDM profiles, trigger a change in the MDM profile statuses of the associated hosts. The UI should reflect these changes as soon as possible.

Problem

Currently, the MDM statuses of hosts are updated on a 30-second cron interval. The backend handles MDM profile changes asynchronously via external Apple APIs and batches these into cron jobs. "Pending" records are created at the time the async request is made to the Apple API. This can result in an undesirable UX when the status does not match user expectations. For example, a user adding a new config profile to a team would expect the MDM profile status of associated hosts to be "pending" immediately after adding the profile to the team. Currently, however, it may take up to 30 seconds for that change to be reflected in the UI.

Proposed solution

Implement service methods that create new "pending" entries in host_mdm_apple_profiles and that can be called at the time of the initiating event.

For example, in the case of assigning hosts to a team, the necessary steps for each host are:

For example, in the case of adding a config profile, the necessary steps are:

These entries would not include the command_uuid, which would need to be populated later by the cron job.
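To make the proposal concrete, here is a minimal in-memory sketch of the two phases: rows are marked "pending" at the time of the initiating event, and the cron job fills in the command UUID later. The types and function names (`hostProfile`, `markPending`, `cronFillCommands`) are hypothetical stand-ins for illustration only, not Fleet's actual datastore API, and the row layout is a simplification of host_mdm_apple_profiles.

```go
package main

import "fmt"

// profileKey identifies one row: a (host, profile) pair.
type profileKey struct {
	HostUUID  string
	ProfileID uint
}

// hostProfile is a hypothetical, simplified stand-in for a row in
// host_mdm_apple_profiles.
type hostProfile struct {
	Status      string // "pending", "applied" or "failed" in this sketch
	CommandUUID string // empty until the cron job sends the MDM command
}

type store map[profileKey]*hostProfile

// markPending is meant to be called at the time of the initiating event
// (for example, a profile added to a team). It creates the rows right
// away, without a command UUID.
func markPending(s store, hostUUIDs []string, profileID uint) {
	for _, h := range hostUUIDs {
		s[profileKey{h, profileID}] = &hostProfile{Status: "pending"}
	}
}

// cronFillCommands simulates the cron job: it assigns a command UUID to
// every pending row that lacks one and returns how many rows it updated.
func cronFillCommands(s store, newUUID func() string) int {
	updated := 0
	for _, row := range s {
		if row.Status == "pending" && row.CommandUUID == "" {
			row.CommandUUID = newUUID()
			updated++
		}
	}
	return updated
}

func main() {
	s := store{}
	markPending(s, []string{"host-a", "host-b"}, 1)
	fmt.Println(s[profileKey{"host-a", 1}].Status) // visible to the UI immediately

	seq := 0
	updated := cronFillCommands(s, func() string {
		seq++
		return fmt.Sprintf("cmd-%d", seq)
	})
	fmt.Println(updated)
}
```

The point of the split is that the UI can read the "pending" rows as soon as the event happens, while the slower Apple API interaction stays in the cron job.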

Next steps

Note that add/delete/edit profiles can be done in batch via fleetctl apply, and moving hosts to a different team is also done in batch (multiple hosts can be moved in a single call).

To clarify: how about when a host gets deleted from Fleet? Unenrolled? When a team gets deleted? My (Martin) guess is that we don't do anything regarding profiles on those actions? And for "host initial enrollment", we're talking MDM enrollment (not fleet/orbit/osquery), right?

I took a bit of a deeper dive into team deletion. It looks like we don't update the hosts' team_id field, so what would probably happen the next time the cron job runs for profiles is that all those hosts' profiles would be identified as "to remove" (which might be what we want?), but it wouldn't consider those hosts as part of "no team" because their team_id hasn't been updated. I'll keep digging to see if we ever update that field via some cron job somewhere. Never mind, we still have a foreign key on hosts->teams that sets the host's team_id to NULL on team deletion. So I think we need to care about team deletion as a trigger to update the MDM profile statuses.

lukeheath commented 1 year ago

Hey team! Please add your planning poker estimate with Zenhub @gillespi314 @mna @roperzh

mna commented 1 year ago

@lukeheath @noahtalerman left some questions in the ticket's spec under "To clarify", when y'all have a moment (nothing urgent/blocking).

lukeheath commented 1 year ago

@mna

To clarify: how about when a host gets deleted from Fleet? Unenrolled? When a team gets deleted? My (Martin) guess is that we don't do anything regarding profiles on those actions? And for "host initial enrollment", we're talking MDM enrollment (not fleet/orbit/osquery), right?

Yes, I would assume the same, so I think that's a safe assumption moving forward unless Noah says otherwise.

So I think we need to care about team deletion as a trigger to update the MDM profile statuses.

Agreed. If a host's team is deleted, the host moves to the "No team" group and has the "No team" profiles applied.
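As a rough illustration of what "has the 'No team' profiles applied" implies for a host whose team was deleted, the sketch below diffs the profiles the host currently has against the profiles of its new group. Everything in it is an assumption made for the example: the `diffProfiles` helper is hypothetical, and team ID 0 is used here only as a stand-in for "No team" (the thread says the real column becomes NULL).

```go
package main

import (
	"fmt"
	"sort"
)

// diffProfiles returns the profile IDs to install (desired but not
// current) and to remove (current but not desired), sorted ascending.
// Both sets would become "pending" rows for the host.
func diffProfiles(current, desired []uint) (toInstall, toRemove []uint) {
	cur := make(map[uint]bool, len(current))
	for _, id := range current {
		cur[id] = true
	}
	des := make(map[uint]bool, len(desired))
	for _, id := range desired {
		des[id] = true
	}
	for id := range des {
		if !cur[id] {
			toInstall = append(toInstall, id)
		}
	}
	for id := range cur {
		if !des[id] {
			toRemove = append(toRemove, id)
		}
	}
	sort.Slice(toInstall, func(i, j int) bool { return toInstall[i] < toInstall[j] })
	sort.Slice(toRemove, func(i, j int) bool { return toRemove[i] < toRemove[j] })
	return toInstall, toRemove
}

func main() {
	// Hypothetical data: team 5 has profiles 2 and 3, "No team" (0) has profile 1.
	profilesByTeam := map[uint][]uint{0: {1}, 5: {2, 3}}

	// Team 5 is deleted, so its hosts fall back to "No team".
	install, remove := diffProfiles(profilesByTeam[5], profilesByTeam[0])
	fmt.Println(install, remove) // prints "[1] [2 3]"
}
```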

lukeheath commented 1 year ago

@roperzh I'm assigning this to you as your next priority after the file encryption orbit work. If we can wrap it up this week, that's great, but if not we can defer to next week and Martin can finish it.

mna commented 1 year ago

@lukeheath @roperzh Just a heads-up that I'm taking this back to work on it this week.

lukeheath commented 1 year ago

@mna Sounds good, thank you! Please do. We decided to defer the profile aggregates story to this release.

mna commented 1 year ago

@lukeheath @roperzh @gillespi314 The PR for this is ready (https://github.com/fleetdm/fleet/pull/10443), it is quite big - sorry about that! It required some tweaks to the existing behaviour to handle setting the status to pending ("NULL") immediately, and then have the cron job pick it up and the stats updated to consider those rows as pending.

While testing the changes for this ticket, I found a couple bugs (well, more like one question and one bug):

  1. Currently, we don't count a host with no profiles as part of any category when reporting the hosts profile summary. This may be ok, but if you think about the case where a host had a profile, and that profile got removed, then that host would go from "profile applied" to "pending (due to removal)", finally to not being listed in the stats at all. I was wondering if those hosts should be reported as "applied" instead (as in, fully up-to-date with the desired state regarding profiles). If so that would require some changes in the query for the stats (mentioned here: https://github.com/fleetdm/fleet/pull/10443/files/9f0322f86970652a529f474b1ff4fb6ac62a8989#diff-2f8a6d69f21403d6845da98a6a9a7e3808f275a21ec0209cbabcdbeaae57676eR1334 and here: https://github.com/fleetdm/fleet/pull/10443/files/9f0322f86970652a529f474b1ff4fb6ac62a8989#diff-ec797a071df046dfb849880d689b5dc274601d19626e17e2f61e6ce663adaea9R865 ).
  2. More critical, I think, is this issue: https://github.com/fleetdm/fleet/pull/10443/files#r1143922829 . When editing profiles via fleetctl apply, a profile whose content changed (but with the same profile identifier as before) internally generates a new database profile ID, meaning that hosts that had this profile applied would now be pending a) removal of the old profile ID and b) install of the new profile ID. However, on the host, that profile has the same "identifier" (which is different from our internal profile ID). It is then a race as to whether the Install of the new version of that profile runs before the Remove, and if so it is a bug (Remove doesn't know about the content, it only asks MDM to remove the profile with identifier "X", so if the new one got installed first, it would happily remove it).
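The ordering problem in item 2 can be reproduced with a small simulation. The sketch below models a host as a map keyed by profile identifier and applies commands in order. The `dropShadowedRemoves` helper is a purely hypothetical mitigation included to show the shape of the problem. It is not the fix adopted in Fleet, and all names and identifiers here are made up for illustration.

```go
package main

import "fmt"

// command is a simplified MDM command. Remove carries only the identifier,
// mirroring the point that it knows nothing about profile content.
type command struct {
	Op         string // "install" or "remove"
	Identifier string
	Content    string // used by "install" only
}

// apply simulates a host processing commands in the order given.
// installed maps profile identifier to profile content.
func apply(installed map[string]string, cmds []command) {
	for _, c := range cmds {
		switch c.Op {
		case "install":
			installed[c.Identifier] = c.Content
		case "remove":
			delete(installed, c.Identifier)
		}
	}
}

// dropShadowedRemoves (hypothetical) drops any remove whose identifier is
// also the target of an install in the same batch, since the install
// already replaces the old content on the host.
func dropShadowedRemoves(cmds []command) []command {
	installs := make(map[string]bool)
	for _, c := range cmds {
		if c.Op == "install" {
			installs[c.Identifier] = true
		}
	}
	var out []command
	for _, c := range cmds {
		if c.Op == "remove" && installs[c.Identifier] {
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	// The install of the new version happens to run before the remove.
	cmds := []command{
		{Op: "install", Identifier: "com.example.wifi", Content: "v2"},
		{Op: "remove", Identifier: "com.example.wifi"},
	}

	host := map[string]string{"com.example.wifi": "v1"}
	apply(host, cmds)
	fmt.Println(len(host)) // prints 0: the new version was removed

	host = map[string]string{"com.example.wifi": "v1"}
	apply(host, dropShadowedRemoves(cmds))
	fmt.Println(host["com.example.wifi"]) // prints v2
}
```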

@lukeheath Do you want me to create tickets for those issues, or do you prefer to create them yourself to properly organize them in the boards? If you prefer to create them, let me know when it's done and I'll edit them to add the details.

lukeheath commented 1 year ago

@mna Thanks for calling these out!

For item 1, let's check with @noahtalerman and get his thoughts. Your suggested change makes sense to me.

For item 2, would you please file that as a bug and bring it on to the release board? Thank you!

lukeheath commented 1 year ago

Noah: Re item 1, no, we don't want them counted in the profiles summary if there are no profiles applied.
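A minimal sketch of a host-level summary that follows this decision: hosts with no profiles are skipped entirely. The precedence used to collapse several profile statuses into one host status (failed, then pending, then applied) is an assumption made for the example, as are the `summary` type and `summarize` function. The real aggregation is done in the SQL queries linked in the PR.

```go
package main

import "fmt"

// summary holds host counts per aggregated profile status.
type summary struct {
	Applied, Pending, Failed int
}

// summarize takes a map of host UUID to the statuses of that host's
// profiles and counts each host once. Hosts with no profiles are not
// counted in any category.
func summarize(hosts map[string][]string) summary {
	var s summary
	for _, statuses := range hosts {
		if len(statuses) == 0 {
			continue // no profiles: excluded from the summary
		}
		failed, pending := false, false
		for _, st := range statuses {
			switch st {
			case "failed":
				failed = true
			case "pending":
				pending = true
			}
		}
		// Assumed precedence: any failure wins, then any pending.
		switch {
		case failed:
			s.Failed++
		case pending:
			s.Pending++
		default:
			s.Applied++
		}
	}
	return s
}

func main() {
	got := summarize(map[string][]string{
		"host-a": {"applied", "applied"},
		"host-b": {"applied", "pending"},
		"host-c": {}, // had its only profile removed
	})
	fmt.Printf("%+v\n", got) // prints {Applied:1 Pending:1 Failed:0}
}
```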

fleet-release commented 1 year ago

MDM profiles shift, Pending status now in sync, Nature's ebb and flow.