google / santa

A binary authorization and monitoring system for macOS
https://santa.dev
Apache License 2.0

Santa is Blocking SantaCtl #1206

Closed eopeter closed 11 months ago

eopeter commented 11 months ago

Hello, we are experiencing an issue where Santa is blocking santactl. Running any santactl command gives the following:

user$: santactl version
Santa
This application has been blocked
Path:     /Applications/Santa.app/Contents/MacOS/santactl
Identifier: dab08c78470c6d8442fc9adfcba04bc06da6180c2f2f3a97a8ae2ddab2168a23
Parent:    bash (1702)
More info:
http://localhost:9333/v1/santa/blocked/dab08c78470c6d8442fc9adfcba04bc06da6180c2f2f3a97a8ae2ddab2168a23
Killed: 9

I ran codesign on santactl and got this:

codesign --display --verbose=4 /Applications/Santa.app/Contents/MacOS/santactl
Executable=/Applications/Santa.app/Contents/MacOS/santactl
Identifier=com.google.santa.ctl
Format=Mach-O universal (x86_64 arm64)
CodeDirectory v=20500 size=72704 flags=0x12200(kill,library-validation,runtime) hashes=2266+2 location=embedded
VersionPlatform=1
VersionMin=720896
VersionSDK=852736
Hash type=sha256 size=32
CandidateCDHash sha256=f576c09b3ebe4b0c373d7de7304b4e1522ff3ebc
CandidateCDHashFull sha256=f576c09b3ebe4b0c373d7de7304b4e1522ff3ebcd5610cc01b56d9a3a7fb5950
Hash choices=sha256
CMSDigest=f576c09b3ebe4b0c373d7de7304b4e1522ff3ebcd5610cc01b56d9a3a7fb5950
CMSDigestType=2
Executable Segment base=0
Executable Segment limit=4734976
Executable Segment flags=0x1
Page size=4096
Launch Constraints:
    None
CDHash=f576c09b3ebe4b0c373d7de7304b4e1522ff3ebc
Signature size=8990
Authority=Developer ID Application: Google LLC (EQHXZ8M8AV)
Authority=Developer ID Certification Authority
Authority=Apple Root CA
Timestamp=Sep 28, 2023 at 2:17:51 PM
Info.plist entries=17
TeamIdentifier=EQHXZ8M8AV
Runtime Version=13.3.0
Sealed Resources=none
Internal requirements count=1 size=180

eopeter commented 11 months ago

It seems that, for the user experiencing this, their rules.db got wiped out while in lockdown mode when they upgraded Santa from 2023.6 to 2023.8. The weird part is that it has only happened to this one user that we know of. Apple binaries were working and the Santa daemon was fine, but everything else was blocked.

To resolve the issue, we had to remove Santa using the uninstall script, but our MDM installed it again and the user became completely blocked again, as they could not reach the sync server to download the rules. When the sync server became available after another uninstall, we removed everything in /var/db/santa except rules.db, event.db and santa.log, which failed to delete. We then stopped the daemon by running /bin/rm -f /Library/LaunchDaemons/com.google.santad.plist and rebooted, after which santactl status showed we were in MONITOR mode. We then did a sync, which took a long time because there were 17K pending events. Once the sync completed, the system was working correctly again.

pmarkowsky commented 11 months ago

Something seems off here. You shouldn't be able to block santactl. We specifically take steps to prevent Santa's components, as well as other critical system binaries, from being blocked. However, this list is built when santad starts and is tied to the sha256 of the critical binaries.

The only thing I can think of is that the old system extension (santad 2023.6) was somehow still running and did not stop or get uninstalled during the upgrade, while your user had already overwritten santactl as part of the upgrade. In that scenario it might be possible to end up in a weird state where the 2023.6 system extension would block the 2023.8 version of santactl. However, restarting the daemon or, at worst, a reboot should fix this, as the new version of the system extension would run and rebuild the critical system binary list.
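
To make the hypothesized failure mode concrete, here is a minimal Python sketch. It is not Santa's code: the function names are hypothetical, and the only assumption taken from this thread is that the critical-binary list is keyed on each file's sha256 and built once at daemon start.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_critical_cache(paths):
    """Snapshot the hashes of critical binaries (simulates daemon start)."""
    return {sha256_of(p) for p in paths}


def is_critical(path: Path, cache) -> bool:
    """A binary is treated as critical only if its current hash is in the snapshot."""
    return sha256_of(path) in cache


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in file for santactl; the contents are placeholders.
    santactl = Path(tmp) / "santactl"
    santactl.write_bytes(b"santactl 2023.6")

    cache = build_critical_cache([santactl])      # old daemon starts
    allowed_before = is_critical(santactl, cache)

    santactl.write_bytes(b"santactl 2023.8")      # upgrade overwrites the file
    allowed_after_upgrade = is_critical(santactl, cache)  # old daemon still running

    cache = build_critical_cache([santactl])      # daemon restart rebuilds the list
    allowed_after_restart = is_critical(santactl, cache)

print(allowed_before, allowed_after_upgrade, allowed_after_restart)
```

In this toy model the replaced binary stops matching the stale snapshot until the snapshot is rebuilt, which is why a daemon restart or reboot would be expected to clear the condition.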

@eopeter If it happens again can you give us the output of systemextensionsctl list?

eopeter commented 11 months ago

This is our post mortem for the incident, along with some proposed solutions that may need to be applied to Santa to prevent this from happening again. Do you see any issues with the proposed solutions? I can send a PR if we agree on the solution:

Post Mortem: Santa Bricked an Endpoint

Introduction: This post mortem investigates an incident where Santa blocked most binaries on a user's workstation while they were in lockdown mode, making the computer unusable except for the Santa daemon and critical system binaries. The issue seems to have arisen from a rebuild of rules.db without any of the previously applied rules. Consequently, the lockdown prevented the user from accessing essential programs, including the Santa sub-components required for downloading rules from the sync server. Furthermore, our VPN client, which provides access to the sync server with the applicable rules, was blocked.

Positive Aspects:

  1. Swift initiation of troubleshooting measures upon identifying the issue to promptly restore user access.
  2. Limited impact, affecting only one known user's system.

Challenges Faced:

  1. Unsuccessful attempts to remove Santa entirely because Jamf was blocked, which impeded the execution of the necessary unblocking steps. We have a Jamf trigger for this.
  2. Inability to switch the user to Monitor Mode since the VPN client binary, which provides access to the sync server, was blocked.
  3. Santa blocking its crucial component, santactl, which is essential for rules download. This was totally unexpected because critical system binaries and Santa components are allowed on daemon startup. The observed rules.db was empty, with a single table "rules" and no items in the table. At the least, it should have had rules for the critical system binaries and Santa components. Even after a reboot, Santa components like santactl remained blocked.

Proposed Solutions:

  1. Modify Santa's functionality that prevents the blocking of its sub components to be more resilient. (should those rules not live in rules.db like user configured rules and instead be built into the binary?) They do seem to be cached today in setupSystemCriticalBinaries.
  2. Implement a mechanism to designate critical client binaries, such as VPN clients, that should never be blocked, ensuring uninterrupted network access without requiring a sync server. Basically, the ability for users to add their own critical binaries.
  3. Only initialize a new rules.db if the client is in Monitor mode. If the client is in lockdown mode when initializing the rules.db, switch to Monitor mode until a successful sync of rules happens.

russellhancox commented 11 months ago

Santa blocking its crucial component, santactl, which is essential for rules download.

This isn't strictly true; santactl is used when a user manually initiates a sync, but periodic syncs only depend on santasyncservice functioning.

Modify Santa's functionality that prevents the blocking of its sub components to be more resilient. (should those rules not live in rules.db like user configured rules and instead be built into the binary?)

The rules governing Santa's components and critical system binaries are not part of the rules database. When santad (or com.google.santa.daemon) starts, it inspects all of its sub-components along with a few known critical system binaries and pre-caches allow decisions for all of these. When evaluating policies these critical components are checked before anything else.
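
The evaluation order described above can be sketched roughly as follows. This is a simplified Python illustration, not Santa's implementation: `decide`, `Mode`, and the hash strings are hypothetical, and real rule lookup covers more rule types than a single hash-keyed table.

```python
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"
    LOCKDOWN = "lockdown"


def decide(sha256, critical_cache, rules_db, mode):
    """Return "ALLOW" or "BLOCK" for a binary identified by its hash."""
    # 1. Pre-cached critical components are checked before anything else.
    if sha256 in critical_cache:
        return "ALLOW"
    # 2. Otherwise consult the rules database (modelled as a plain dict here).
    policy = rules_db.get(sha256)
    if policy is not None:
        return policy
    # 3. With no matching rule, the client mode decides the default.
    return "ALLOW" if mode is Mode.MONITOR else "BLOCK"


critical_cache = {"hash-of-santad", "hash-of-santactl"}  # placeholder hashes
empty_rules_db = {}                                      # rules.db after being wiped

print(decide("hash-of-santactl", critical_cache, empty_rules_db, Mode.LOCKDOWN))
print(decide("hash-of-vpn-client", critical_cache, empty_rules_db, Mode.LOCKDOWN))
print(decide("hash-of-vpn-client", critical_cache, empty_rules_db, Mode.MONITOR))
```

Under this model an empty rules database in lockdown blocks everything that is not in the pre-cached set, which matches the symptoms reported for non-Santa binaries such as the VPN client.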

Implement a mechanism to designate critical client binaries, such as VPN clients, that should never be blocked, ensuring uninterrupted network access without requiring a sync server. Basically, the ability for users to add their own critical binaries.

The StaticRules configuration key can be used for this.
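
As a rough illustration, the Python snippet below generates a StaticRules fragment for a configuration profile payload. The key names (`identifier`, `rule_type`, `policy`) and values (`TEAMID`, `ALLOWLIST`) are stated here as they are understood from Santa's configuration documentation and should be verified against the docs for your Santa version. The Team ID is a made-up placeholder, not a real vendor's.

```python
import plistlib

# Hypothetical Team ID for the VPN vendor; replace with the real one.
VPN_TEAM_ID = "ABCDE12345"

# Assumed shape of the StaticRules key: an array of rule dictionaries.
payload = {
    "StaticRules": [
        {
            "identifier": VPN_TEAM_ID,
            "rule_type": "TEAMID",
            "policy": "ALLOWLIST",
        },
    ]
}

# Serialize to XML plist form, as used inside a configuration profile.
xml_bytes = plistlib.dumps(payload, fmt=plistlib.FMT_XML)
print(xml_bytes.decode())
```

A Team ID rule is used here only because it survives vendor updates better than a per-binary hash; whichever rule type you choose, the generated keys would be merged into the existing Santa configuration payload rather than deployed on their own.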

If the client is in lockdown mode when initializing the rules.db, switch to Monitor mode until a successful sync of rules happens.

I'm concerned about the security of this. While we take some measures to prevent modification to the rules.db while santad is running, making the proposed change means that any process that is capable of deleting the rules.db can then get santad out of lockdown mode.

eopeter commented 11 months ago

Are StaticRules not ignored if a sync server is used?

russellhancox commented 11 months ago

No, StaticRules take precedence over sync server rules.

eopeter commented 11 months ago

Using StaticRules worked great!

I was able to replicate the scenario. After deploying the new mobile config, and while in lockdown, I turned off access to the VPN, booted into Safe Mode, and deleted rules.db, which forced it to be recreated with no rules on the next boot in lockdown mode. Everything was blocked except Santa, Santa components, Apple critical binaries, and the things specified in StaticRules, which include our VPN client. I was able to log into the VPN, the rules synced, and everything was back to normal.

Closing this issue for now. Thank you so much!!!