fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

MDM migration tool might get an unexpected rate limit getting enrollment information #15929

Closed roperzh closed 5 months ago

roperzh commented 9 months ago

Fleet version: fleetd 1.19


💥  Actual behavior

Orbit errors with this during the migration:

ERR checking assigned enrollment profile error="show enrollment profile command: exit status 1: Error fetching Device Enrollment configuration - Request too soon. Try again later.\n"

INF a request to renew the enrollment profile was processed but not executed because there was an error checking the assigned enrollment profile.

🧑‍💻  Steps to reproduce

  1. Using a host to which we failed to assign a DEP profile
  2. Enable MDM migration for that host
  3. In the server logs, observe how eventually fleetd hits the rate limit and reports the error

roperzh commented 9 months ago

We just saw another request fail with the same error message: https://fleetdm.slack.com/archives/C03EG80BM2A/p1704985121437869

roperzh commented 8 months ago

For context, this is what I have found:

  1. Prior to macOS 12.3, there was no rate limit for the profiles binary
  2. In macOS 12.3 a rate limit was introduced, and the command throws an error when you reach the rate limit
  3. In a later macOS version (exact version still TBD), the -cached option was introduced, which obtains information from a local cache. The profiles binary will fall back to -cached instead of throwing an error when the rate limit is reached.

cc: @nonpunctual in case you know more about this or think my research is inaccurate.


As for this issue in particular, we run profiles show --type enrollment on the device to ensure that it has an ADE profile assigned before starting the migration.
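As a rough illustration of telling the rate-limit failure apart from other failures of that check, here is a minimal Go sketch. It assumes the rate limit can be recognized by the "Request too soon" text seen in the log above; the names `classifyProfilesResult` and `errRateLimited` are hypothetical and not taken from fleetd.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errRateLimited is a hypothetical sentinel error for the rate-limit case.
var errRateLimited = errors.New("enrollment profile check was rate limited")

// rateLimitMsg is the message fragment from the log in this issue.
// Assumption: the profiles binary keeps printing this text when rate limited.
const rateLimitMsg = "Request too soon"

// classifyProfilesResult inspects the combined output and the error returned
// by running the enrollment profile command. It returns nil on success,
// errRateLimited when the rate-limit message is present, and a wrapped error
// for any other failure.
func classifyProfilesResult(output []byte, runErr error) error {
	if runErr == nil {
		return nil
	}
	if strings.Contains(string(output), rateLimitMsg) {
		return errRateLimited
	}
	return fmt.Errorf("show enrollment profile command: %w", runErr)
}

func main() {
	// Simulated result of a failed command run, using the text from the log.
	out := []byte("Error fetching Device Enrollment configuration - Request too soon. Try again later.\n")
	err := classifyProfilesResult(out, errors.New("exit status 1"))
	if errors.Is(err, errRateLimited) {
		fmt.Println("rate limited, retry later")
	}
}
```

A caller could treat `errRateLimited` as a signal to wait longer before the next check instead of logging it as a generic error.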

I think this is happening because the server thinks the device has a profile assigned when the assignment actually failed, so this issue is twofold:

  1. We should solve https://github.com/fleetdm/fleet/issues/15461 first, and improve the server logic to not send a notification to migrate if the profile assignment failed
  2. We should add backoff logic to fleetd, so that if it receives the same server URL from profiles on successive calls, it waits longer than the current 5 minutes after each call.
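The backoff idea in point 2 could look roughly like the Go sketch below. It is only an illustration: the type `profileBackoff`, the doubling strategy, and the one-hour cap are assumptions; only the 5-minute base interval comes from this thread.

```go
package main

import (
	"fmt"
	"time"
)

// profileBackoff decides how long to wait before the next enrollment profile
// check. Hypothetical type, not fleetd's actual implementation.
type profileBackoff struct {
	base    time.Duration // interval after a first or changed result
	max     time.Duration // upper bound on the interval (assumed value)
	lastURL string        // server URL seen on the previous call
	seen    bool          // whether any URL has been recorded yet
	repeats int           // consecutive calls that returned the same URL
}

// newProfileBackoff uses the 5-minute interval mentioned in the issue and an
// assumed cap of one hour.
func newProfileBackoff() *profileBackoff {
	return &profileBackoff{base: 5 * time.Minute, max: time.Hour}
}

// next records the server URL returned by the latest check and returns the
// wait before the following check. The wait doubles each time the same URL
// repeats and resets to base when the URL changes.
func (b *profileBackoff) next(serverURL string) time.Duration {
	if !b.seen || serverURL != b.lastURL {
		b.seen = true
		b.lastURL = serverURL
		b.repeats = 0
		return b.base
	}
	b.repeats++
	wait := b.base
	for i := 0; i < b.repeats && wait < b.max; i++ {
		wait *= 2
	}
	if wait > b.max {
		wait = b.max
	}
	return wait
}

func main() {
	b := newProfileBackoff()
	// Placeholder URL standing in for whatever the profiles command reports.
	for i := 0; i < 6; i++ {
		fmt.Println(b.next("https://old-mdm.example.com/mdm"))
	}
}
```

Keying the backoff on the returned server URL means a successful reassignment (a different URL) immediately restores the short interval.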

I'm going to mark this issue as blocked by #15461.

nonpunctual commented 8 months ago

So, I am not completely sure I understand the issue, but, is it correct to say that it stems from polling the profiles show --type enrollment command too often? If a more passive check would be better, the system_profiler binary also lists all Configuration Profiles on a device including the mdm profile:

% system_profiler -json SPConfigurationProfileDataType | jq '.[].[]._items[] | select(._name=="Fleet Device Management enrollment")'

{
  "_items": [
    {
      "_name": "com.apple.mdm",
      "spconfigprofile_payload_data": "{\n    AccessRights = 8191;\n    CheckOutWhenRemoved = 1;\n    IdentityCertificateUUID = \"BCA53F9D-5DD2-494D-98D3-0D0F20FF6BA1\";\n    ServerCapabilities =     (\n        \"com.apple.mdm.per-user-connections\",\n        \"com.apple.mdm.bootstraptoken\"\n    );\n    ServerURL = \"https://dogfood.fleetdm.com/mdm/apple/mdm\";\n    SignMessage = 1;\n    Topic = \"com.apple.mgmt.External.a1a99377-0db9-4bea-9cc6-b6aef954545a\";\n}",
      "spconfigprofile_payload_identifier": "com.fleetdm.fleet.mdm.apple.mdm",
      "spconfigprofile_payload_uuid": "29713130-1602-4D27-90C9-B822A295E44E",
      "spconfigprofile_payload_version": 1
    },
    {
      "_name": "com.apple.security.scep",
      "spconfigprofile_payload_identifier": "com.fleetdm.fleet.mdm.apple.scep",
      "spconfigprofile_payload_uuid": "BCA53F9D-5DD2-494D-98D3-0D0F20FF6BA1",
      "spconfigprofile_payload_version": 1
    }
  ],
  "_name": "Fleet Device Management enrollment",
  "spconfigprofile_install_date": "Wednesday, December 13, 2023 at 2:31:11 PM Eastern Standard Time (2023-12-13 19:31:11 +0000)",
  "spconfigprofile_organization": "Fleet Device Management",
  "spconfigprofile_profile_identifier": "com.fleetdm.fleet.mdm.apple",
  "spconfigprofile_profile_uuid": "5ACABE91-CE30-4C05-93E3-B235C152404E",
  "spconfigprofile_RemovalDisallowed": "no",
  "spconfigprofile_verification_state": "unsigned",
  "spconfigprofile_version": 1
}

roperzh commented 8 months ago

is it correct to say that it stems from polling the profiles show --type enrollment command too often?

Thanks @nonpunctual, I think that's correct. IMO we should fix the root cause that makes us spam the profiles binary, but as part of fixing this I will add your method as a fallback if the current binary doesn't support -cached.
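A hedged Go sketch of what parsing that system_profiler output might look like follows. It assumes the JSON nests profiles as the jq path above implies, and that the top-level key is `SPConfigurationProfileDataType`; the sample document is trimmed from the output in this thread, and `mdmServerURL` is a hypothetical helper. Note that this reads the installed MDM profile rather than the assigned enrollment profile.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// Trimmed sample based on the output shown in this thread. The top-level key
// is an assumption inferred from the data type passed to system_profiler.
const sampleJSON = `{
  "SPConfigurationProfileDataType": [
    {
      "_items": [
        {
          "_name": "Fleet Device Management enrollment",
          "_items": [
            {
              "_name": "com.apple.mdm",
              "spconfigprofile_payload_data": "{\n    ServerURL = \"https://dogfood.fleetdm.com/mdm/apple/mdm\";\n    SignMessage = 1;\n}"
            }
          ]
        }
      ]
    }
  ]
}`

type payload struct {
	Name string `json:"_name"`
	Data string `json:"spconfigprofile_payload_data"`
}

type profile struct {
	Name     string    `json:"_name"`
	Payloads []payload `json:"_items"`
}

type section struct {
	Profiles []profile `json:"_items"`
}

// serverURLRe pulls the ServerURL value out of the plist-style payload text.
var serverURLRe = regexp.MustCompile(`ServerURL = "([^"]+)";`)

// mdmServerURL returns the ServerURL of the first com.apple.mdm payload found
// in the report, and false if none is present or the JSON is invalid.
func mdmServerURL(raw []byte) (string, bool) {
	var report map[string][]section
	if err := json.Unmarshal(raw, &report); err != nil {
		return "", false
	}
	for _, sections := range report {
		for _, s := range sections {
			for _, p := range s.Profiles {
				for _, pl := range p.Payloads {
					if pl.Name != "com.apple.mdm" {
						continue
					}
					if m := serverURLRe.FindStringSubmatch(pl.Data); m != nil {
						return m[1], true
					}
				}
			}
		}
	}
	return "", false
}

func main() {
	url, ok := mdmServerURL([]byte(sampleJSON))
	fmt.Println(url, ok)
}
```

In practice the input would come from running system_profiler with the -json flag, as in the command above, instead of the embedded sample.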

fleet-release commented 5 months ago

Migration tool slows, Yet in each pause, a promise, Smooth paths in cloud glows.