fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Add option to disable MDM related crons on Fleet server when MDM is turned off #21059

Open ddribeiro opened 2 months ago

ddribeiro commented 2 months ago

Problem

A Fleet server with MDM features disabled will still run MDM related crons. For example, the mdm_apple_profile_manager cron runs every 30 seconds and the nanodep-syncer cron runs every 30 seconds.

After consulting engineering, I've confirmed this is not a bug and was intentionally designed so that a server restart would not be required to enable MDM features after uploading artifacts through the UI.

Customer: Preference would be to do a one-time restart instead of 3 crons running per minute per node, especially since restarting the server can be done without downtime. The cost of compute, logging, and starting extra nodes adds up over time, and makes it harder to find root causes after an issue.

What have you tried?

I looked for a configuration option to control whether these MDM related crons run on my Fleet server with MDM features disabled. I did not find one.

Potential solutions

Fleet could add a server configuration option that would disable MDM related cron jobs from running when MDM features are not enabled. The downside of this is a server restart would be required to completely turn on MDM features. When MDM features are turned on, the MDM crons would run regardless of how the server is configured for this option.
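
A rough sketch of how such an option might gate cron registration at startup. `ServerConfig`, `DisableMDMCronsWhenOff`, and `mdmCronsToRegister` are hypothetical names for illustration, not Fleet's actual configuration or code; only the two cron names come from this issue.

```go
package main

import "fmt"

// ServerConfig is a hypothetical stand-in for the server configuration.
type ServerConfig struct {
	// DisableMDMCronsWhenOff is an assumed option name, not an existing Fleet setting.
	DisableMDMCronsWhenOff bool
}

// mdmCronNames lists the two crons named in this issue.
var mdmCronNames = []string{"mdm_apple_profile_manager", "nanodep-syncer"}

// mdmCronsToRegister returns the MDM crons to schedule at startup.
// When MDM is enabled the crons are always scheduled, regardless of the option.
func mdmCronsToRegister(cfg ServerConfig, mdmEnabled bool) []string {
	if !mdmEnabled && cfg.DisableMDMCronsWhenOff {
		return nil
	}
	return mdmCronNames
}

func main() {
	cfg := ServerConfig{DisableMDMCronsWhenOff: true}

	// MDM off and option set: nothing is scheduled until the next restart.
	fmt.Println(len(mdmCronsToRegister(cfg, false)))

	// MDM on: both crons are scheduled even though the option is set.
	fmt.Println(len(mdmCronsToRegister(cfg, true)))
}
```

Because the decision is made once at startup in this sketch, turning MDM on later would still need a restart, which is the trade-off described above.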

What is the expected workflow as a result of your proposal?

If this proposed solution was implemented and I didn't want MDM related crons to run on my Fleet server, I could go into my server configuration file and edit it to disable them. This would reduce logging for features not being used and save on cost of compute over time.

JoStableford commented 2 months ago

Related to a Slack conversation

noahtalerman commented 2 months ago

> A Fleet server with MDM features disabled will still run MDM related crons. For example, the mdm_apple_profile_manager cron runs every 30 seconds and the nanodep-syncer cron runs every 30 seconds.
>
> After consulting engineering, I've confirmed this is not a bug and was intentionally designed so that a server restart would not be required to enable MDM features after uploading artifacts through the UI.

Hey @roperzh is there a way we could make this more efficient for everyone?

Could we only run the crons if MDM is turned on? (IT admin uploads APNs/ABM keys)

cc @ddribeiro

roperzh commented 2 months ago

@noahtalerman not easily with the way Fleet works today: crons are defined when the Fleet server starts and there's no way to add new crons.

Because certificates can be uploaded in the UI now, after the server is started, we need to keep running those in the background.

I want to note that if you don't have MDM configured, those crons are no-ops and extremely light functions. Have customers complained about CPU or memory usage from the crons? Or are they bothered by the existence of the logs?

noahtalerman commented 2 months ago

> Have customers complained about CPU or memory usage from the crons? or are they bothered by the existence of the logs?

@roperzh, looking at the end of the issue description, it seems like customers have brought up both.

@ddribeiro do you know how much extra use and how many extra logs they're seeing?

roperzh commented 2 months ago

@noahtalerman @ddribeiro it would be nice to confirm if they are actually seeing a performance impact or worried about it (which makes sense either way)

all MDM crons do this check before doing anything, which is cached in memory, so unless something unexpected is happening we shouldn't be consuming a significant number of resources:

https://github.com/fleetdm/fleet/blob/f8ab6d2fbdc7dc4b4963c7918c3beb37e571713c/cmd/fleet/cron.go#L1058-L1065
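
For readers who don't follow the link, here is a minimal sketch of the general pattern described (an early return on an in-memory flag before any real work). It is not a copy of the linked Fleet code; `mdmConfigured` and `runMDMCron` are hypothetical names, and the real check may be structured differently.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// mdmConfigured is a hypothetical in-memory cache of whether MDM is set up.
var mdmConfigured atomic.Bool

// runMDMCron stands in for a single cron tick.
// It reports whether the real work ran.
func runMDMCron(work func()) bool {
	if !mdmConfigured.Load() {
		// Cheap early return: no work is done while MDM is not configured.
		return false
	}
	work()
	return true
}

func main() {
	calls := 0
	work := func() { calls++ }

	ran := runMDMCron(work) // MDM not configured yet: the tick is a no-op
	fmt.Println(ran, calls)

	// Assumed trigger: certificates are uploaded in the UI after startup.
	mdmConfigured.Store(true)

	ran = runMDMCron(work) // the same cron now does its work, no restart needed
	fmt.Println(ran, calls)
}
```

The cron stays scheduled the whole time; only the guard changes, which is why no restart is needed once MDM is configured.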

would it be enough to just reduce the logging?

noahtalerman commented 2 months ago

> all MDM crons do this check before doing anything, which is cached in memory, so unless something unexpected is happening we shouldn't be consuming a significant number of resources

Thanks @roperzh! FYI @ddribeiro

Roberto, I learned from Dale that this isn't causing a huge amount of pain (performance costs or logs) so I think we can come back to it later.

It does make sense as a future optimization and it's a great point raised by the customer.