Closed gillespi314 closed 1 year ago
@gillespi314 Does this need reproducing from me or is there enough information for work to start on this?
@xpkoala no need for reproduction
@gillespi314 Assigning you to work with @lucasmrod and @roperzh to determine:
@lukeheath added to the product board. This might have been lost a week or two ago.
Thanks for reporting this @gillespi314!
Tech debt is one of those words that can mean different things to different people, so just dropping in some food for thought re: prioritization:
What's the impact (worst case scenario) to users of leaving this unresolved? If not broken, would it feel unexpected? How hard would someone have to be looking for it to feel unexpected?
What about the worst case scenario to contributors? Will it lead to misunderstandings and bugs? Will it lead to introducing misinformed changes? What about if there's a production incident? Does this make it harder to understand what's going on in the database? Would someone dealing with an urgent issue misunderstand the state of the database because of this?
If there's sufficient user impact or violation of expectations, it's a bug and worth prioritizing an unplanned bug fix ASAP in the current sprint.
If there's sufficient contributor impact / risk, then it's related to #contributor-experience and is worth prioritizing for the next sprint.
From what I can read here, it sounds like not fixing this means that user databases are getting corrupted with data that will have to be cleaned up in a future database migration job. The longer we wait to fix it, the more permutations of corrupted data there are that could trip us up and that we'll eventually have to fix.
To me, it sounds like it might already be a bug with user impact as well, or it might not be, but that we don't actually know for sure, because we haven't let this run and get corrupted in every possible way, and then done manual QA with automated tests.
If so, then it's probably easier to just fix it than to find out definitively, right?
Up to @lukeheath
Taking a pass at this:
As a user in Fleet who had recently fiddled around in (or accidentally messed up) the MySQL database, I don't want Fleet to silently fail to track activity and pretend nothing went wrong.
Closing out in favor of #9915. Thanks @gillespi314
Error resolved, swift
Cleanups and logs prosper
Cloud city gleams bright
🧑‍💻  Expected behavior
When a device checks out (unenrolls) from Fleet MDM, we want to log that activity and clean up related entries in the database. As part of the log entry, we want to indicate whether the device is checking out of automatic (DEP) or manual MDM. If there is an error getting the manual/automatic MDM status from the DB, the cleanups should still proceed and the activity log should be updated with an indication that this information is missing.
💥  Actual behavior
An error reading manual/automatic MDM status from the DB short-circuits the cleanups and activity logging.
More info
Consider restructuring the checkout method to handle these scenarios gracefully. https://github.com/fleetdm/fleet/blob/3b942030c989483ec7f53069e4f9b2b930151858/server/service/apple_mdm.go#L1006-L1026