Open roperzh opened 4 weeks ago
@roperzh do we know what the SCEP certificate lifespan is for the customer devices? I do know that some MDM systems will set this to a long lived value like 2099, so in those cases it would not be an issue. If the lifespan of the certificate is short lived, I would say that this would be a P2 blocker issue.
@dherder good point! we should check with them, I know that micromdm/scep uses 1 year by default (-crtvalid flag) so unless they provided a custom value there, it's 1 year
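To make the timing concrete, here is a minimal sketch of the date arithmetic, assuming a 365-day validity (the micromdm/scep default mentioned above). The 30-day renewal window and the function names are hypothetical, not something decided in this thread.

```python
from datetime import datetime, timedelta, timezone

# Assumption: certificates are valid for 365 days unless a custom
# -crtvalid value was passed to micromdm/scep.
DEFAULT_VALIDITY_DAYS = 365


def expiry_for(issued_at: datetime, validity_days: int = DEFAULT_VALIDITY_DAYS) -> datetime:
    """Return the expiration time of a certificate issued at `issued_at`."""
    return issued_at + timedelta(days=validity_days)


def needs_renewal(not_after: datetime, now: datetime, window_days: int = 30) -> bool:
    """True when the certificate expires within `window_days` (hypothetical window)."""
    return not_after - now <= timedelta(days=window_days)


issued = datetime(2023, 7, 1, tzinfo=timezone.utc)
expires = expiry_for(issued)
print(expires.date(), needs_renewal(expires, datetime(2024, 6, 18, tzinfo=timezone.utc)))
```

If the customer did pass a custom validity, only `validity_days` changes; the check itself stays the same.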
@lukeheath @noahtalerman per the process, letting you know that we have this as workflow/migration blocking and added the p2 label. let me know if anything else needs to be done to escalate
@zayhanlon P2 makes sense to me. Our response for P2 is:
Response: Issue is prioritized at the top of the next sprint. If opportunity cost of waiting for the next sprint is too high, it may be considered for current sprint.
We'll prioritize this for next sprint, which is scheduled to ship 7/15. Is that soon enough?
@noahtalerman @georgekarrv
@lukeheath @georgekarrv @noahtalerman - there's a thread going in #g-customer-success https://fleetdm.slack.com/archives/C062D0THVV1/p1718733547340419?thread_ts=1718384351.332159&cid=C062D0THVV1
This new issue was surfaced by Roberto this week but is also migration blocking. I don't think 7/15 will work - any way to get it faster or patched sooner?
@zwass @dherder FYI
I made it a story so it gets product feedback. Personally, I only see three ways to accomplish this:
Thanks @roperzh!
I threw some time on your calendar to dig into the options.
we met with @noahtalerman and decided to do option 3 as a first baby step:
We build a script that issues cert renewals that lives outside Fleet
I think this requires 3 action items:
cc: @zayhanlon
Thanks @roperzh!
Define where this service will be hosted, could probably live alongside the proxy?
I think we decided to go with fleetdm.com instead of standing up a separate service. Why? So we can reduce surface area and improve understandability for Fleet contributors.
If this doesn't work please let me know.
I think this means that the enrollment profile (XML) will live as an environment variable in Heroku. We'll probably need @eashaw's help to add that variable.
I updated the issue description to reflect this.
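For reference, a rough sketch of what the service would do with that variable: read the profile and wrap it in an MDM InstallProfile command. The variable name is the one that comes up later in this thread. The helper names are hypothetical, and how the command is handed to the Fleet API is deliberately left out, since that isn't settled here.

```python
import os
import plistlib
import uuid

# Name taken from later in this thread; treat it as an assumption until it ships.
ENV_VAR = "FLEET_SILENT_MIGRATION_ENROLLMENT_PROFILE"


def load_profile_from_env(environ=os.environ) -> bytes:
    """Read the enrollment profile XML from the environment."""
    raw = environ.get(ENV_VAR, "")
    if not raw.strip():
        raise RuntimeError(f"{ENV_VAR} is not set")
    return raw.encode("utf-8")


def build_install_profile_command(profile_xml: bytes) -> bytes:
    """Wrap a profile in an MDM InstallProfile command plist."""
    command = {
        "CommandUUID": str(uuid.uuid4()),
        "Command": {
            "RequestType": "InstallProfile",
            # bytes are serialized as a base64 <data> element in the plist
            "Payload": profile_xml,
        },
    }
    return plistlib.dumps(command)
```

Keeping the profile as an opaque blob means the service never has to understand its contents, only deliver it.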
I have several questions:
> I think we decided to go with fleetdm.com instead of standing up a separate service. Why? So we can reduce surface area and improve understandability for Fleet contributors.
Does this mean we would be putting customer SCEP cert/keys into fleetdm.com? That sounds pretty risky to me as I'm not aware that fleetdm.com has been designed/audited for storage of customer data (let alone important customer secrets).
Or maybe we are just talking about using fleetdm.com to trigger script execution for the hosts that are expiring? That seems potentially less risky but still something that would need to be well-understood. Would it require API keys for customer Fleet servers?
> Does this mean we would be putting customer SCEP cert/keys into fleetdm.com?
@zwass I don't think so. The enrollment profile would be an environment variable in Heroku. Once the enrollment profile is delivered the host will get the new SCEP cert from the Fleet server
> Would it require API keys for customer Fleet servers?
I think so, yes. We need the API key to deliver the enrollment profile via the Fleet API. This can be stored as an environment variable in Heroku.
@roperzh please correct me if I'm wrong.
Who at Fleet will write the code? Who will maintain the code? Where will the server be hosted? Who will be responsible for maintaining it?
who's the right person to answer this? don't want it to get lost in the convo
How will the scripts be triggered? Is this something that the server becomes responsible for?
some process needs to run at an interval and send commands, we were thinking this separate server (let's say fleetdm.com) do it
the challenge of building the functionality directly into Fleet is related to crafting the right enrollment profile, we thought that having a separate service gives us freedom to hardcode the profile to the customer's needs.
@noahtalerman maybe the profile could be provided to Fleet itself as a hidden config?
@zwass another option I just thought of: what if the proxy enqueues the command (using Fleet's API) to renew the SCEP certificate the first time it redirects a host to Fleet? this gives us 1 year to properly solve this problem.
> Who at Fleet will write the code? Who will maintain the code?
It's on the drafting board w/ the #g-mdm label. I think let's treat this as all other user stories at Fleet: bring it through estimation and into the next sprint.
Since it sounds like the next release (2024-07-15) isn't fast enough, I started a thread in #g-mdm in Slack here (internal) to chat about priority.
> maybe the profile could be provided to Fleet itself as a hidden config?
@roperzh good idea. But is this because of a limitation of Heroku? If not, in order to move quickly, I think let's move forward with the current plan in the issue description.
If folks disagree, please jump into tomorrow's MDM design review to discuss.
Once we know what the enrollment profile will look like, we can get @eashaw's help to test. If we learn that using fleetdm.com won't work due to a Heroku limitation then I think we come back to other options.
> @roperzh good idea. But is this because of a limitation of Heroku? If not, in order to move quickly, I think let's move forward with the current plan in the issue description.
> If folks disagree, please jump into tomorrow's MDM design review to discuss.
@noahtalerman sounds good! yeah, not a limitation with Heroku, but it might be simpler to run the cron in Fleet because:
> what if the proxy enqueues the command (using Fleet's API) to renew the SCEP certificate the first time it redirects a host to Fleet?
This seems possible. Currently there is no state maintained within the migration proxy, but state could be added.
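A minimal sketch of the state the proxy would need, assuming hosts are keyed by UDID. `RenewalTracker` and the `enqueue` callback are hypothetical names, and the in-memory set is only for illustration: a real proxy would need storage that survives restarts.

```python
from typing import Callable, Set


class RenewalTracker:
    """Enqueue a renewal command the first time a host is redirected."""

    def __init__(self, enqueue: Callable[[str], None]) -> None:
        # `enqueue` stands in for whatever call asks Fleet to send the command.
        self._enqueue = enqueue
        self._seen: Set[str] = set()

    def on_redirect(self, udid: str) -> bool:
        """Return True if a command was enqueued for this host."""
        if udid in self._seen:
            return False
        self._enqueue(udid)
        # Record the host only after enqueueing succeeds, so a failed
        # attempt is retried on the next redirect.
        self._seen.add(udid)
        return True
```

Marking the host after the enqueue call rather than before trades a possible duplicate command for never silently skipping a host.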
Hey team! Please add your planning poker estimate with Zenhub @dantecatalfamo @ghernandez345 @gillespi314 @roperzh
Please add your planning poker estimate with Zenhub @jahzielv
As part of the research for this ticket I:
- /mdm/checkin and /mdm/connect to my Fleet server /mdm/apple/mdm
- changed ServerURL to be https://roperzh-micromdm.ngrok.io/mdm/connect (keep the old MicroMDM server URL)
- added a CheckInURL next to ServerURL with the value https://roperzh-micromdm.ngrok.io/mdm/checkin
- changed the PayloadIdentifier of the profile to be com.github.micromdm.micromdm.enroll
- InstallProfile command using the enrollment profile payload

I verified that:
- System Settings > Profiles now shows the host as enrolled by Fleet

Action items and stuff to coordinate on:
- FLEET_SILENT_MIGRATION_ENROLLMENT_PROFILE with the profile contents

@roperzh are you saying you got the enrollment profile replaced without user intervention? I'm not sure I understand how this experiment is connected with the touchless migration experience we are working on with customers.
@zwass sorry for not being clear. This is to renew SCEP certificates for migrated devices (which is done by re-delivering the enrollment profile)
The enrollment profile was almost replaced, but three things need to be kept in our particular case:
1. change ServerURL to be https://roperzh-micromdm.ngrok.io/mdm/connect (keep the old MicroMDM server URL)
2. add a CheckInURL next to ServerURL with the value https://roperzh-micromdm.ngrok.io/mdm/checkin
3. change the root PayloadIdentifier of the profile to be com.github.micromdm.micromdm.enroll
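Those three changes could be applied to Fleet's enrollment profile with something like the sketch below. It assumes the MDM settings live in the `PayloadContent` entry whose `PayloadType` is `com.apple.mdm`; the function name and parameters are hypothetical.

```python
import plistlib


def adapt_enrollment_profile(
    profile_xml: bytes,
    server_url: str,
    check_in_url: str,
    root_identifier: str,
) -> bytes:
    """Return a copy of the profile with the three values overridden."""
    profile = plistlib.loads(profile_xml)

    # 3. the root PayloadIdentifier must match the old MDM's profile
    profile["PayloadIdentifier"] = root_identifier

    for payload in profile.get("PayloadContent", []):
        # Only the MDM payload is touched; SCEP and other payloads are left alone.
        if payload.get("PayloadType") == "com.apple.mdm":
            payload["ServerURL"] = server_url      # 1. keep the old server URL
            payload["CheckInURL"] = check_in_url   # 2. add the check-in URL
    return plistlib.dumps(profile)
```

For the setup described here, the call would pass https://roperzh-micromdm.ngrok.io/mdm/connect, https://roperzh-micromdm.ngrok.io/mdm/checkin, and com.github.micromdm.micromdm.enroll.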
Ah, so enrollment profiles can be redelivered without user intervention as long as the ServerURL and CheckInURL don't change?
@zwass exactly! in my notes I have this as the full list of things that can't change:
PushTopic
ServerURL
CheckInURL
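A quick way to guard against changing one of these by accident is to diff the MDM payloads of the installed profile and the candidate before sending anything. This is only a sketch with hypothetical helper names. Note that what my notes call PushTopic is, as far as I know, stored under the `Topic` key of the `com.apple.mdm` payload, which is the assumption made here.

```python
import plistlib

# Assumption: "PushTopic" corresponds to the `Topic` key in the MDM payload.
PINNED_KEYS = ("Topic", "ServerURL", "CheckInURL")


def mdm_payload(profile_xml: bytes) -> dict:
    """Return the com.apple.mdm payload of an enrollment profile."""
    profile = plistlib.loads(profile_xml)
    for payload in profile.get("PayloadContent", []):
        if payload.get("PayloadType") == "com.apple.mdm":
            return payload
    raise ValueError("no com.apple.mdm payload found")


def changed_pinned_keys(current_xml: bytes, candidate_xml: bytes) -> list:
    """List the pinned keys whose values differ between the two profiles."""
    current, candidate = mdm_payload(current_xml), mdm_payload(candidate_xml)
    return [key for key in PINNED_KEYS if current.get(key) != candidate.get(key)]
```

An empty list means the candidate keeps all three values, so it is safe to consider for redelivery on that front.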
I think the really important findings for us are:
@roperzh how are we doing on target ETA to get this in a patch next week? thanks :D
@zayhanlon thanks for checking, still on track! but please note that the issue w/profiles is probably a bigger blocker. This one is mainly a blocker for the prod deploy, while the profiles issue is limiting their testing in staging prior to any production changes.
@roperzh yup! i'm on it - discussing with Noah today
Paired w/ Roberto to test on his local MicroMDM server to ensure the workflow succeeded.
Goal
Context
To renew SCEP certificates, we send an InstallProfile command with Fleet's enrollment profile to the devices.

Hosts that migrated using "Process for self-hosted macOS MDM migration to Fleet" (#19387) will have a different enrollment profile (one from the old MDM solution), so the InstallProfile command will fail and the SCEP certificate won't be renewed.

Changes
Product
Engineering
QA
Risk assessment
Manual testing steps
Testing notes
Confirmation