Closed lukeheath closed 1 year ago
@roperzh Would you please add high-level specs to this ticket for estimation today? Thanks!
Hey team! Please add your planning poker estimate with Zenhub @gillespi314 @mna @roperzh
@roperzh @gillespi314 Do you have more details about what needs to be done for this ticket? Looking at apple's docs, I see there's a Define profile API endpoint but it's not clear if that's the right one/how we can make it so that it applies for a given team for us, and there's Assign profile but we need a list of host serial numbers, not sure how it can apply at setup time for a new/as of yet unkown device?
Another thing that I'm not sure how we want to handle that is if there's no custom setup assistant for a team/no team, what is the default profile? I'm assuming that when a custom setup assistant is removed, we'd need to re-enable (via some API call to Apple) the default one?
Trying to figure out if this ticket could make it in this release (as per the slack discussion: https://fleetdm.slack.com/archives/C03C41L5YEL/p1682429960085959
/cc @lukeheath @georgekarrv @noahtalerman
I do not. I was going to ask about this at standup.
@mna sorry for not including more details here. The way this currently works is:
apple_mdm_dep_profile_assigner
) that fetches a list of newly added devices from ABM (this list includes the serial number of each device)A good chunk of this is currently managed by nanomdm
, so we'll need to copy/paste + tweak some of the logic, as my understanding is that nano assumes that there's only one DEP profile (the "assigner" profile)
Generally we need to:
profile_uuid
that we should store.2
above to make sure there's a default profile for the default ABM team.AppConfig.MDM.AppleBMDefaultTeam
using the assign profile API you mentioned.Let me know if this makes sense.
my understanding is that nano assumes that there's only one DEP profile (the "assigner" profile)
Yeah, looking at that code it seems like it would support multiple profiles via different "DEP names", but I'm not sure of all the implications. In Fleet we use a single static DEP name for all operations. I think your suggestion to copy/paste/tweak the relevant parts is a good approach, we can fully control what happens there.
@roperzh Trying to make sure I correctly understand what's going on and what we need to do (sorry if I'm missing anything obvious, the Apple's developer docs website is failing with Forbidden at the moment so there are a number of API-related details I cannot check). There are multiple parts coming together for this, multiple abstractions (DEP syncer, DEP assigner, storage, etc.) and it's not always clear what's the flow of devices/new hosts through all this.
My understanding for this ticket is that we have optional custom "macos setup assistant" per team/no team. Those "setup assistant" define an enrollment profile, right? And that enrollment profile is applied exactly once to a device, even if that device moves to a different team post-DEP-enrollment? That task number 2 in the spec desc made me unsure about that:
Call ABM/DEP API to assign a profile to the host when the host's team changes
With this in mind, currently the cron apple_mdm_dep_profile_assigner
would:
// If the DEP Profile was changed since last sync then we clear
// the cursor and perform a full sync of all devices and profile assigning.
if cursor != "" && profileModTime.After(cursorModTime) {
d.logger.Log("msg", "clearing device syncer cursor")
if err := d.depStorage.StoreCursor(ctx, DEPName, ""); err != nil {
return err
}
}
Does that mean that if the profile is modified, we re-assign it to hosts that were already enrolled? This challenges my assumption that we assign exactly once a profile to a host.
Then we go on and run the nanodep "syncer", which will for all (new?) devices, call our callback which ingests the device into Fleet (creates the hosts
entry).
Finally we call nanodep's assigner.ProcessDeviceResponse
which makes the Apple API call to AssignProfile
to the devices.
I think my main point of confusion is around applying once for newly ingested devices vs does it apply multiple times on update / does fetch/sync devices list only new ones or all (even already enrolled).
Now, assuming we only apply once and never update an applied enrollment profile, what I had in mind for this ticket:
DefineProfile
when a new/updated setup assistant is applied to a team/no team, store that profile UUID.syncer
callback that ingests the new device (so, after the hosts
rows are created, with the correct team_id
), call AssignProfile
with the correct team-specific profile (or the default one if there is no specific setup assistant).Given that logic, I don't see how a device could be moved to a different team before it gets a profile assigned (that is, the cron ingestion uses the AppleBMDefaultTeam and immediately after we call AssignProfile
with the profile corresponding to that team). So I'm not sure how/where this should happen:
When a device is "pending" and is transferred to a different team, we assign the profile associated with this new team using the assign profile API you mentioned (making sure there's a profile in place)
@mna the MDM solution (us), Apple, and the authorized reseller deal with the same database through different APIs, at a high level:
xyz
belongs to org x
)x
creates (one or many) "enrollment profiles" in this "shared database", which can be referenced by ID.x
pings the list/sync devices APIs periodically to get devices assigned to the org by the reseller.x
can assign a profile created in 2
to machine with serial number xyz
.Based on this workflow:
3
, 5
) above (stops being pending as soon as 5
happens)xyz
as many times as you want, this assignment will have an effect the next time the device fetches the DEP enrollment profile from apple's servers. This can happen:
sudo profiles renew --type enrollment
(not as important)Now to actually answer your questions:
My understanding for this ticket is that we have optional custom "macos setup assistant" per team/no team. Those "setup assistant" define an enrollment profile, right?
This is true, they are effectively the same thing.
And that enrollment profile is applied exactly once to a device, even if that device moves to a different team post-DEP-enrollment?
Hopefully the intro above cleared this up, I think the confusion was around the difference between "profile assignment" and "the profile that's effectively used during unboxing", but let me know if that's not the case.
List the devices (from a cursor, but if the cursor is exhausted, it does a full sync - doesn't that get all devices assigned to DEP, regardless of whether they were already enrolled? Don't we assume that devices fetched here are "new/unenrolled" devices?)
To be honest, I don't know why we're doing that if the cursor is exhausted, worth double checking if that's indeed the behavior we want.
- We ensure there is an existing default profile defined in Apple DEP This section confuses me:
// If the DEP Profile was changed since last sync then we clear // the cursor and perform a full sync of all devices and profile assigning. if cursor != "" && profileModTime.After(cursorModTime) { d.logger.Log("msg", "clearing device syncer cursor") if err := d.depStorage.StoreCursor(ctx, DEPName, ""); err != nil { return err } }
Does that mean that if the profile is modified, we re-assign it to hosts that were already enrolled? This challenges my assumption that we assign exactly once a profile to a host.
yes, exactly in the current version (where we have a single profile,) if the profile changed it gets reassigned to all devices. I think it makes sense:
sudo profiles renew --type enrollment
get the updated profile (not as important)I imagine we can cover this same scenario by making the call when the profile is actually changed, but we can keep doing it in the cron as well!
Now, assuming we only apply once and never update an applied enrollment profile, what I had in mind for this ticket:
* Call `DefineProfile` when a new/updated setup assistant is applied to a team/no team, store that profile UUID. * During cron, after the `syncer` callback that ingests the new device (so, after the `hosts` rows are created, with the correct `team_id`), call `AssignProfile` with the correct team-specific profile (or the default one if there is no specific setup assistant).
Given that logic, I don't see how a device could be moved to a different team before it gets a profile assigned (that is, the cron ingestion uses the AppleBMDefaultTeam and immediately after we call
AssignProfile
with the profile corresponding to that team). So I'm not sure how/where this should happen:When a device is "pending" and is transferred to a different team, we assign the profile associated with this new team using the assign profile API you mentioned (making sure there's a profile in place)
With the information presented in the preamble of this wall of text (sorry!) your plan makes sense to me with the addition of also updating the profile when the host switches teams, which hopefully makes more sense now.
Please let me know if something is still unclear! I think another bit to untangle is the fact that we have a table from the prototype keeping track of what's now stored in the setup assistants table. I can explain my current understanding of that too, just lmk!
@roperzh Thank you, that's super helpful. The fact that AssignProfile
basically just stores in Apple's servers what will be deployed to the devices if/when they request it is a major point that I was missing (i.e. the sudo profiles renew --type enrollment
scenario). In light of that, it makes total sense to AssignProfile
to all relevant hosts when that profile is changed, and it may or may not be deployed onto those hosts in the future (but it will for new hosts pending unboxing).
And that explains the requirement to assign the new team's profile if a host is moved to a different team. It will make the cron job a bit more tricky, but at least the big picture is much clearer now, thanks!
@roperzh
I think another bit to untangle is the fact that we have a table from the prototype keeping track of what's now stored in the setup assistants table. I can explain my current understanding of that too, just lmk!
I'd certainly love to hear that, when you have a moment! That's the mdm_apple_enrollment_profiles
table? I deliberately avoided using that table for macos setup assistant
as it was being used already both in parts that we may want to delete in the future (obsolete/unused code) and in code that is still used (createProfile
).
@mna glad it helped, and sorry again for the wall of text.
I'm curious:
It will make the cron job a bit more tricky, but at least the big picture is much clearer now, thanks!
can we make the API call to assign the profile when the host is transferred? this way the cron only assigns to the default team, and it doesn't have to check if the hosts changed teams since the last assignment in the cron.
@roperzh
can we make the API call to assign the profile when the host is transferred?
Probably, yes.
@mna Thinking about it more, doing it during transfer might make things more tricky: what if the network request to the Apple server fails? do we have to rollback the team assignment and fail the request then? with the cron at least we know we'll try again next time.
I'd certainly love to hear that, when you have a moment! That's the
mdm_apple_enrollment_profiles
table? I deliberately avoided using that table formacos setup assistant
as it was being used already both in parts that we may want to delete in the future (obsolete/unused code) and in code that is still used (createProfile
).
@mna yes! it's that table. That makes a lot of sense to me!
for context, we currently have three ways of serving profiles:
3
works using the mdm_apple_enrollment_profiles.token
column, the logic is a bit confusing but:
mdm_apple_enrollment_profiles.dep_profile
is a JSON column containing the DEP profile (exactly the same as mdm_apple_setup_assistants.profile
)configuration_web_url
and url
, both contain an URL to serve the profile: <server_url>/api/mdm/apple/enroll?token=<token>
mdm_apple_enrollment_profiles.token
<server_url>/api/mdm/apple/enroll?token=<token>
, we just check if there's an entry in mdm_apple_enrollment_profiles
with a matching token as a way to "authenticate" the request, and we serve an MDM enrollment profile (the XML profile). We only use the data in mdm_apple_enrollment_profiles
to validate the token.As we move to mdm_apple_setup_assistants
I think the last bit to untangle is:
3
@roperzh I'm curious, given that we provide the full profile in the call to DefineProfile
, which creates an association between the provided profile and a profile UUID, and then we associate that profile with a list of devices in AssignProfile
, do you know why we still need to expose an URL to serve the profile, vs apple providing it since we sent the payload to DefineProfile
?
I see that the endpoint that serves the profile actually serves an XML profile, so that may be the reason, but I'm not sure what's the difference between the JSON that we send to DefineProfile
and what we end up serving in that token-authenticated URL that is stored in the JSON profile, except for a format change? Looking at the code in server/mdm/apple/apple_mdm.go
(GenerateEnrollmentProfileMobileconfig
), the XML seems very different/somewhat unrelated to the JSON definition of the profile? Is that XML still good for the custom setup assistants? I see a lot of uuids/values being hard-coded in there.
@mna your assessment is correct, I think the problem is that the mdm_apple_enrollment_profiles
is "overloaded" which makes this a bit confusing.
The JSON profile is used to:
The XML profile is used to:
I think the expected outcome is:
1
as the value for configuration_web_url
and url
in the JSON profile (macos_setup_assistant) if MDM SSO is not configured.@mna Thinking about it more, doing it during transfer might make things more tricky: what if the network request to the Apple server fails? do we have to rollback the team assignment and fail the request then? with the cron at least we know we'll try again next time.
Keeping it in the cron to cover this scenario makes sense to me.
I see a lot of uuids/values being hard-coded in there.
I'm curious about this also. Intuitively, it seems like at least the UUID in the top-level <dict>
should be different between the XML for different teams.
@roperzh @gillespi314 Thanks for your input. To recap what has been discussed so far and start moving on to an implementation plan (or at least define a spec to target), this is what needs to happen AIUI (let me know if I'm missing anything, whenever I say "for a team", it also includes the "no team" option, that's just to simplify the text):
RemoveProfile
when a setup assistant is deleted.Keeping in mind that Apple's API is a third-party dependency that could be down at any time/not under our control (much like the Jira/Zendesk integrations, and unlike our mysql DB), the plan here is to have some retry in case of failure, and be able to achieve whatever action we are doing without requiring the Apple API call to succeed (that is, we don't want to fail a team transfer or setup assistant upload due to an Apple API call failing).
Without going into the "how", I think those are the requirements of this ticket. If we agree on that I'll add a comment with an implementation plan before jumping into the code, to make sure all cases are properly covered.
@mna wow, that looks amazing, two comments:
TBD: when a custom setup assistant is deleted for a team, should we call the RemoveProfile API? Based on the docs, I'd say no ("After this call, the devices in the list will have no profiles associated with them. However, if those devices have already obtained the profile, this has no effect until the device is wiped and activated again.")
calling a RemoveProfile
will unenroll the profile from Fleet's MDM, I don't think we want to do that. However I think this brings up something that's currently unspecified: what happens if a device is assigned to a team that doesn't have custom setup assistant settings?
As part of the DefineProfile API call, the custom or default setup assistant needs to be filled with URLs that point to a Fleet endpoint that will serve the XML mobileconfig profile "corresponding to" or "derived from" the setup assistant JSON.
fortunately (because I think it simplifies things) the XML is not derived from the JSON in any way. The only two important things are:
apple_mdm.GenerateEnrollmentProfileMobileconfig
url
and configuration_web_url
point to this endpoint with a valid token if MDM SSO is not configuredIf we wanted, we could even have a single token and use that in all url
and configuration_web_url
of all JSON profiles.
@roperzh
If we wanted, we could even have a single token and use that in all url and configuration_web_url of all JSON profiles.
Gotcha, I see, so the XML is always the same (minus, say, the token) regardless of the setup assistant.
what happens if a device is assigned to a team that doesn't have custom setup assistant settings?
It will get the default (JSON) profile (AFAIK the same one we currently generate and assign to all devices).
I'll update the spec comment with this new info, let me know if you have thoughts about the other TBD (old token vs new token, but now I'm not sure if it even matters, maybe we generate a token once and don't change it even if the setup assistant/profile UUID changes).
@mna
Gotcha, I see, so the XML is always the same (minus, say, the token) regardless of the setup assistant.
yes! more or less, the XML doesn't even have a token, the only variables used to generate the template are: the org name, the server URL, the SCEP challenge and a topic (none of them related to the JSON profile in any way)
Maybe there's a better way to model all of this so it's not so confusing, but yeah, the token is currently only used to gate the XML generation, all is stored in the same table probably for convenience.
I'll update the spec comment with this new info, let me know if you have thoughts about the other TBD (old token vs new token, but now I'm not sure if it even matters, maybe we generate a token once and don't change it even if the setup assistant/profile UUID changes).
I don't have a strong opinion, the only reason to change the token is that it would be nice to have a escape hatch to invalidate the token if for whatever reason you want to, but given that this feature is very complex as is, I'm all in to tackle that separately if it adds complexity.
@roperzh
the XML doesn't even have a token
Oh right! The URL with the token is actually in the JSON.
it would be nice to have a escape hatch to invalidate the token if for whatever reason you want to
Definitely.
this feature is very complex as is, I'm all in to tackle that separately if it adds complexity.
Yeah, let's see if it can fit easily into the implementation, otherwise punt on this for now.
Gotcha, I see, so the XML is always the same (minus, say, the token) regardless of the setup assistant.
Just to confirm, we are saying that it is ok to hard code all of the UUIDs in the template so that the UUIDs will the same even if the host switches teams?
@gillespi314
Just to confirm, we are saying that it is ok to hard code all of the UUIDs in the template so that the UUIDs will the same even if the host switches teams?
That's my understanding, yes. Looking more closely at the XML, it has the SCEP challenge part (which for now, with a static challenge, is always the same), the Access Rights part, which is a bit cryptic in part but it makes sense that it's the same for all (the Fleet MDM URL is static for a given fleet instance, and the topic is also always the same, it comes from the APNs certificate). The top-level section defines the organization, which is the same regardless of teams.
Although I would think we'll want to warn the users not to change organization details, especially the server URL, which would probably invalidate the profiles.
/cc @roperzh
@gillespi314 @mna
Yes, for good or bad, the enrollment profile is "special", it's not part of the mdm_apple_configuration_profiles
table, so it's not picked by the cron and it's not redelivered when a host switches teams, which is good.
Hard coding the UUIDs there is okay, it doesn't do any harm, nor there are clear advantages of making them dynamic, at least for now.
We'll have to revisit if these assumptions still hold when we do SCEP renewals, but if we need to make the UUIDs dynamic then (to force the enrollment profile to be reinstalled), I don't think having them hard coded now will cause any problems.
Although I would think we'll want to warn the users not to change organization details, especially the server URL, which would probably invalidate the profiles.
this is a great point, and I don't think we do a good job communicating this.
Although I would think we'll want to warn the users not to change organization details, especially the server URL, which would probably invalidate the profiles.
this is a great point, and I don't think we do a good job communicating this.
I remember reading in the Apple docs, that if you set an HTTP redirect from your old server URL to the new, things should work, so at least there's that resource if somebody needs to change the URL
@noahtalerman as discussed in the comments above, the enrollment profile contains URLs that point to the Fleet instance, along with the organization name.
When Fleet MDM is used, it would probably be a good idea to warn the user not to change the Fleet instance's URL (and, although probably less critical, organization name). Or if they absolutely need to do so, at least to put a redirect in place as mentioned by Roberto's comment above. (we should test this flow to make sure we properly document how it can be achieved and that it works even with an org name change)
@gillespi314 @roperzh the implementation plan, let me know if you think of anything that should be done differently. The high-level idea is to keep the cron similar to today to assign the Apple BM's default team's profile (or the default profile if none exists for that team) to the new devices, and to leverage the server/worker
package to process updates/deletions to setup assistant and host transfers. This ensures robustness as it will retry in case of failures of the Apple API, and will not prevent the actual action (update setup assistant, transfer host, etc.) from happening.
In more details, addressing each of the requirements noted in https://github.com/fleetdm/fleet/issues/10995#issuecomment-1532047238:
DefineProfile
, gets the profile uuid and stores itAssignProfile
for each mdm-dep-enrolled host in that teamDefineProfile
for the default profile if it doesn't already have a profile uuid, stores itAssignProfile
for each mdm-dep-enrolled host in that team, assigning the default profileDefineProfile
for the no-team setup assistant or the default profile if it doesn't already have a profile uuid, stores itAssignProfile
for each host in the list, assigning the no-team setup assistant or the default profileDefineProfile
if the new team does not already have a profile uuid (or the default profile)AssignProfile
for each host in the list to assign that team's profile (or default one)mdm_apple_enrollment_profiles
@mna that sounds like a plan to me! I will kick a background job in my head to see if I can think of anything to note, but right now looks amazing.
the enrollment profile contains URLs that point to the Fleet instance, along with the organization name.
@mna @roperzh if I'm understanding correctly, hosts won't be able to enroll to Fleet if the user changes the server URL in the profile. Is that right?
If it is, I think we should file an issue for adding a validation check to not allow the user to edit the URL.
@noahtalerman worth discussing in a separate issue I think, we're thinking about this scenario:
I think this is also true for osquery, so changing your Fleet web address is already a risky change, but I think we should document this, and also do our best to support this path.
I think this is also true for osquery, so changing your Fleet web address is already a risky change, but I think we should document this, and also do our best to support this path.
Ah, ok. Yes, adding this to the docs makes sense.
I'll add this to the product feature requests agenda to discuss prioritizing a product solution.
@noahtalerman @roperzh I noticed an interesting issue when testing manually with Apple's API. In https://github.com/fleetdm/fleet/blob/main/server/mdm/apple/apple_mdm.go#L195-L198 we unmarshal the user-provided JSON of the setup assistant in the godep.Profile
struct, and then marshal that struct back to JSON when calling the Apple API for DefineProfile
.
Apple's DefineProfile API docs mention this:
FLAGS_INVALID: The flags have been set incorrectly; is_mdm_removable can be set to false only if flag is_supervised is set to true.
The problem is that whenever we marshal this struct to JSON, if is_mdm_removable
is not provided, it will marshal as false
and is_supervised
is also false
if not provided. I think I can fix it by setting the struct's value to true
before unmarshalling, but let me know if you think of any other similar issues (not necessarily errors that Apple's API would return, but settings or combinations of settings we wouldn't want the users to use).
@mna awesome call-out thanks! I faced a similar thing in this PR: https://github.com/fleetdm/fleet/pull/11557, await_device_configured
also requires is_supervised
to be true
in order to work (it doesn't return an error, tho, the setting just doesn't take effect)
There, I'm automatically setting is_supervised
to true
if await_device_configured
is true:
@noahtalerman do you think it's okay if we do that? or should we validate that both settings are set and give an error instead? This is also something to keep in mind in regards to our discussion here https://github.com/fleetdm/fleet/issues/10744#issuecomment-1542783554 when we implement the UI to turn on/off await_device_configured
@roperzh @noahtalerman
I faced a similar thing in this PR: https://github.com/fleetdm/fleet/pull/11557, await_device_configured also requires is_supervised to be true in order to work (it doesn't return an error, tho, the setting just doesn't take effect)
If the Apple docs are accurate, it seems like this is only the case on iOS: "Ignored on iOS devices if is_supervised is false. This key is valid in X-Server-Protocol-Version 2 and later.". If so I'd leave Apple to do the validation and wouldn't automatically set IsSupervised
to true
as I suppose it has other effects. (also, the docs mention that all iOS devices will now be supervised automatically: "In iOS 13, all DEP devices will be supervised and the OS will ignore the is_supervised flag completely.")
Docs page: https://developer.apple.com/documentation/devicemanagement/profile
interesting! let me try again and get back to you, but I'm 99% sure await_device_configured
didn't work for me until I set is_supervised
to true
@roperzh gotcha, yeah it's totally possible that the docs are not accurate.
should we validate that both settings are set and give an error instead?
@roperzh @mna if the user sets is_mdm_removable
to false
or await_device_configured
to true
, I think we should show an error if is_supervised
is not set to true
.
I think this is the higher quality solution because Fleet is transparent to the user what settings are required.
I think this error should appear when trying to upload a DEP profile in the UI or via fleetctl.
This is also something to keep in mind in regards to our discussion here https://github.com/fleetdm/fleet/issues/10744#issuecomment-1542783554 when we implement the UI to turn on/off await_device_configured
This is a good point. At some point we'll set await_device_configured
and is_supervised
under the hood for the user and validate/error if they supply these in the DEP profile.
Will it be more difficult to do this^ later if we decide now to show an error if is_supervised
is not set to true
? What will migration look like?
@noahtalerman @roperzh
if the user sets is_mdm_removable to false or await_device_configured to true, I think we should show an error if is_supervised is not set to true.
For is_mdm_removable
, this is already the case, it will correctly show an error if is_supervised
is not set to true (Apple itself does this validation).
For await_device_configured
, Roberto will double-check that it does indeed require is_supervised
to be true to work (the Apple docs don't mention that this is the case, but they may be wrong). In general, I think we shouldn't do our own validation in addition to Apple's, as we may encode wrong validations or validations that are correct today but could be wrong tomorrow, with Apple changing how the fields behave.
But if we're totally sure that some combination of fields need to be set for something to work properly (and Apple doesn't enforce that combination for some reason), I agree that it's nicer to raise an error to the user. I just think we should be very careful with that approach as we don't control the meaning of the fields.
Moving back to In Review as it's best to QA it with the changes required by this convo.
Note for QA / @xpkoala : this was tested locally for registering the setup assistant with Apple and getting a profile UUID, but assigning the profile to a host could not be tested "for real" (it requires macOS to create VMs enrolled with DEP).
Note that when creating/updating/deleting a setup assistant or moving a host to a new team, it can take up to a minute for the new profile to be registered with Apple and assigned to the correct existing hosts (for new hosts being added via DEP, it will take however long it takes for Apple to send them as part of the Sync Devices API, and the cron that syncs new devices runs every minute by default, controlled by the mdm.apple_dep_sync_periodicity
config flag).
Apple's ABM, DEP Fleet now syncs with their flow Smooth as clouds, they grow
Tasks
1
2
Spec Details
See https://github.com/fleetdm/fleet/issues/10995#issuecomment-1532047238 .
Implementation plan, see: https://github.com/fleetdm/fleet/issues/10995#issuecomment-1533061340