Open fpgaminer opened 2 years ago
Same root cause as #3526
This bug still exists in macOS 13.0 Beta (build 22A5266r), running as both host and guest. I tested with UTM 3.2.4.
So Apple hasn't fixed it in Ventura -- at least not yet.
This bug (Apple's bug) still exists in macOS 13.0 Beta 2 (build 22A5286j).
Apple's bug still exists in macOS 13.0 beta 3 (build 22A5295h).
Perhaps Apple specified it that way. They have been moving away from kexts for a while.
That's possible. But it may also just be an oversight -- Apple still supports them in macOS 13 "on bare metal". Only time will tell.
I'll keep testing new Ventura betas. But I'll only report here if I find one that works.
Ah just came across this problem myself, sadly after I posted about it on the dev forums. Oh well, maybe I'll get a reply.
Still doesn't work in Ventura 13.0 and UTM v4.0.8
I reported the inability to enable kexts on machines running via Virtualization.framework to Apple as FB10145502 (third party kexts cannot be used in Virtualization framework apps) a bit ago and as yet have heard nothing, for what it's worth.
To follow up on this, using bputil from a terminal in recovery mode gives what's probably the most useful hint: Error Domain 401, "Failed to create local policy", underlying error 0x600001f64150, domain=com.apple.bootpolicy, code=2, "SEP communication (2)"
So it's almost certainly going to be that Virtualization.framework neither provides access to the real SEP nor emulates one fully, and so, until Apple decides to properly support Virtualization, we're just sad.
@dariaphoebe, I'm thinking of trying to debug this on an Intel Mac, since I can use HookCase there. Any suggestions how I might go about it, or is it a lost cause? Is the Intel environment just too different from the Apple Silicon environment?
If I'm really lucky, I might be able to find a hackish workaround for the problem. Something like https://github.com/utmapp/UTM/issues/3904#issuecomment-1100924393. But I might also end up wasting a lot of time.
Ah I only shared the log output in the Apple Dev Forums; the only lines I see are:
: Kcgen roundtrip failed with: Boot policy error: Error creating linked manifest: code BOOTPOLICY_ERROR_ACM
: Kcgen roundtrip failed checkpoint saveAuxkc: status:error fatalError:Optional("Boot policy error: Error creating linked manifest: code BOOTPOLICY_ERROR_ACM")
: Kcgen roundtrip failed: missing last checkpoint or errors found
: Deleting Preboot content
but could be unrelated.
I've made some progress debugging this problem. But I haven't, unfortunately, found a workaround. And what I have found indicates it happens at a very low level, and is almost certainly the result of a deliberate policy decision by Apple.
After much digging around in kmutil log show logs, I ended up in more or less the same place as @lundman. As best I can tell, the error he found (BOOTPOLICY_ERROR_ACM) is the key to this bug. This doesn't happen on bare metal, on either macOS 12 or 13.
One of the things I first noticed is that there isn't any "auxiliary kext collection" at all in macOS 12 and 13 VMs (created using the Virtualization framework). (Note the output from kmutil inspect.) This despite the fact that one of its would-be components is an Apple-signed kext in /Library/Apple/System/Library/Extensions -- RemoteVirtualInterface.kext. (This is also not initially loaded on bare metal. It gets loaded along with your first third-party kext.) A necessary step in creating an auxiliary kext collection is to create a "linked manifest" -- a file ending in *.im4m that gets installed to a LocalPolicy subdirectory of /System/Volumes/iSCPreboot. This happens with no problems on bare metal. But in a VM you see the following errors:
macOS 13
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: bootpolicy_create_linked_manifest: entry
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: SEP command 33 (v7) returned 7: bpbc 5150, 0, 0, 0
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: assert: bpe == 0 (/AppleInternal/Library/BuildRoots/0d3d51bd-555a-11ed-88cc-a23c4f261b56/Library/Caches/com.apple.xbs/Sources/BootPolicy/dylib/dylib.c:2437)
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: bootpolicy_create_linked_manifest: exit: ACM (7)
kcgend: Could not createLinkedManifest and/or update auxKC receipt: Boot policy error: Error creating linked manifest: code ACM (7)
macOS 12
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: bootpolicy_create_linked_manifest: entry
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: SEP command 33 (v7) returned 7: bpbc 4177, 0, 0, 0
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: assert: bpe == 0 (/AppleInternal/Library/BuildRoots/d23df31a-1fa6-11ed-91f3-e2c7c6032f02/Library/Caches/com.apple.xbs/Sources/BootPolicy/dylib/dylib.c:2558)
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: bootpolicy_create_linked_manifest: exit: ACM (7)
...
kcgend: persisting error: Boot policy error: Error creating linked manifest: code BOOTPOLICY_ERROR_ACM
...
kernelmanagerd: Kcgen roundtrip failed with: Boot policy error: Error creating linked manifest: code BOOTPOLICY_ERROR_ACM
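The "SEP command N (vM) returned R" summary lines are the easiest part of these logs to compare across hosts and VMs. Below is a minimal Python sketch that pulls them out of saved log text. The function names are made up for illustration, and treating any non-zero status as a failure is an assumption based only on the "assert: bpe == 0" lines above.

```python
import re

# Matches the libbootpolicy summary line, e.g.
# "BootPolicy: SEP command 33 (v7) returned 7: bpbc 5150, 0, 0, 0"
SEP_RESULT = re.compile(r"SEP command (\d+) \(v(\d+)\) returned (\d+)")

def sep_results(log_text):
    """Return (command, version, status) tuples for every SEP result line."""
    return [tuple(int(g) for g in m.groups()) for m in SEP_RESULT.finditer(log_text)]

def failed_commands(log_text):
    """Return the command numbers whose status was non-zero (assumed failure)."""
    return {cmd for cmd, _version, status in sep_results(log_text) if status != 0}

# Two lines copied from the macOS 13 and macOS 12 logs above.
sample = """\
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: SEP command 33 (v7) returned 7: bpbc 5150, 0, 0, 0
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: SEP command 33 (v7) returned 7: bpbc 4177, 0, 0, 0
"""

print(sep_results(sample))
print(failed_commands(sample))
```

Feeding it the saved output of a log query from a VM and from bare metal would show at a glance which command numbers only fail in the VM.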
So a "linked manifest" is an expression of "local policy", and also of "boot policy". I'd guess "SEP" stands for something like "Secure Enclave Proxy". It's accessed through several software layers (including /usr/lib/libbootpolicy.dylib), the lowest of which seems to be /System/Library/Extensions/BootPolicy.kext. I don't know if the Virtualization framework's VMs just emulate Apple's "Secure Enclave" (I suspect they do). But, unlike @dariaphoebe, I had no problems communicating with the "SEP" in my VMs, on either macOS 12 or 13.
The "SEP command" that fails in VMs is "SEP command 33". It always succeeds on bare metal (at least in my experience). I see no references to it online. Over time I'll try to learn more about it. If Virtualization framework VMs emulate the "Secure Enclave", it should be possible, at least in principle, to mess with that emulation, to make "command 33" succeed. I'm sure this won't be easy, though, and I may never learn how to do it.
Someone at Apple must have made a deliberate decision to make "command 33" fail in Virtualization framework VMs. That sits oddly with other indications that they intended to support third-party kexts in these VMs -- for example they let you allow third-party kexts in the Recovery Partition. But Apple is a large company, and this won't be the first time the left hand doesn't know what the right hand is doing.
My logs have lots of entries headed "kcgend:". kcgend is completely undocumented. But as best I can tell it's what actually writes "linked manifests" and kernel caches as your Mac boots. It's not meant to run in a VM, and normally doesn't (the result of another deliberate decision by someone at Apple). But I was able to find a way around this: Just run sudo /usr/libexec/kcgend once in a VM, and it will always run on subsequent boots. Be aware that this will reboot your VM. When I first tried this trick I waited with bated breath for the reboot to finish -- only to discover that this wasn't the workaround I'd hoped it would be :-(
I've learned that "SEP" means "Secure Enclave Processor" ... probably. And "ACM" might mean "Authenticated Code Module".
After more digging around, I discovered there are three kernel extensions that can receive "SEP commands". (These are the only three kexts that have readFromSEPBuffer() methods.)
BootPolicy.kext
AppleCredentialManager.kext
AppleSSE.kext
Then I wrote a dtrace script to trace when any of them receives a SEP command: SEPCommand.d.txt
Rename this file from SEPCommand.d.txt to SEPCommand.d, then run sudo dtrace -s SEPCommand.d on a machine that hosts UTM VMs that use the Virtualization framework. You may first need to turn off SIP (with csrutil disable while booted into the Recovery Partition).
The easiest way to trigger SEP commands to BootPolicy.kext is to check for macOS updates (in the System Preferences panel). The easiest way for AppleCredentialManager.kext is to run Safari.
But you don't see anything when you run or reboot a UTM VM (that uses the Virtualization framework). This tells me that Apple's implementation of the Secure Enclave in the Virtualization framework is an emulation -- possibly not a complete one. It doesn't call out to the Secure Enclave Processor in the host.
Another interesting bit: There are three "AppleVP" kexts, one of which (AppleVPBootPolicy.kext) also contains a readFromSEPBuffer() method. I missed it because dtrace (on the host) doesn't list any of its methods, because it's never loaded on a "host". All three of these only run in a Virtualization framework VM. Unfortunately, dtrace won't work on any of them there. Presumably these contain at least part of Apple's emulation of the SEP.
AppleVPBootPolicy.kext
AppleVPKeyStore.kext
AppleVPCredentialManager.kext
More interesting stuff: All but one of the "SEP methods" in AppleVPBootPolicy.kext are no-ops. (The only one that isn't is BootPolicyUserClient::validateSEPCommand(), called from BootPolicyUserClient::extPerform().) I'd bet this means that all boot-related emulation of the SEP is implemented there. So it might be there that we should look for an explanation of the "SEP command 33" problem.
I've found out what "SEP command 33" is -- _command_create_linked_manifest() in AppleVPBootPolicy.kext. There's an array named _command_functions[], and a pointer to _command_create_linked_manifest() exists in it at offset 33 x 8.
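To make the "33 x 8" arithmetic concrete, here is a small Python sketch of how a command number maps to a byte offset in a table of function pointers. It assumes 8-byte pointers (as on arm64); the helper names are hypothetical and are not taken from the kext.

```python
POINTER_SIZE = 8  # bytes per pointer on a 64-bit target such as arm64 (assumption)

def table_offset(index, pointer_size=POINTER_SIZE):
    """Byte offset of entry `index` from the start of a pointer array."""
    return index * pointer_size

def table_index(offset, pointer_size=POINTER_SIZE):
    """Inverse mapping: which entry a pointer-aligned byte offset refers to."""
    if offset % pointer_size:
        raise ValueError("offset is not pointer-aligned")
    return offset // pointer_size

print(hex(table_offset(33)))  # 0x108
print(table_index(0x108))     # 33
```

The inverse helper is handy when reading a disassembly: an access at the table base plus 0x108 corresponds to command 33.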
I've also found where _command_create_linked_manifest() fails, I think. My command of ARM64 assembler is a bit wobbly. But as best I can tell, _command_create_linked_manifest() only returns 7 (== BOOTPOLICY_ERROR_ACM) if its call to _validate_acm_context() fails.
So next I need to figure out why the call to _validate_acm_context() fails.
I've discovered some information about why _validate_acm_context() fails. But it's very cryptic, and I can't really interpret it:
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: bootpolicy_create_linked_manifest: entry
kernel: (AppleVPBootPolicy) BootPolicy: creating linked manifest
kernel: (AppleVPBootPolicy) BootPolicy: loading misc storage
kernel: (AppleVPCredentialManager) ACM: verifyPolicy: Verifying BootPolicy (V) on CS[100] (preflight=NO, secure=YES, checkKeybagUUID=NO).
kernel: (AppleVPCredentialManager) ACM: dumpVerificationStatus: BootPolicy (V) on CS[100] *NOT* SATISFIED: 01R -> R.
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: SEP command 33 (v7) returned 7: bpbc 4177, 0, 0, 0
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: assert: bpe == 0 (/AppleInternal/Library/BuildRoots/d23df31a-1fa6-11ed-91f3-e2c7c6032f02/Library/Caches/com.apple.xbs/Sources/BootPolicy/dylib/dylib.c:2558)
kcgend: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: bootpolicy_create_linked_manifest: exit: ACM (7)
There are many other cases where _validate_acm_context() succeeds, even in a VM, even while booting. Compare this, picked more or less at random:
syspolicyd: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: reading blessed macOS local policy /System/Volumes/iSCPreboot/BE2ACA7D-8B5E-4E89-88F2-EE05D5354D88/LocalPolicy/1A59E46C2D579AD45C6282DE32E7B3989AD35CB73D301FD7FFE22D9F11454C59683535F41C9822598EE74081553DC5BA.img4
kernel: (AppleVPBootPolicy) BootPolicy: updating local policy User Authorised Kext List
kernel: (AppleVPBootPolicy) BootPolicy: loading misc storage
kernel: (AppleVPCredentialManager) ACM: verifyPolicy: Verifying BootPolicy (V) on CS[118] (preflight=NO, secure=YES, checkKeybagUUID=NO).
kernel: (AppleVPCredentialManager) ACM: dumpVerificationStatus: BootPolicy (V) on CS[118] SATISFIED: 01S -> S.
kernel: (AppleVPBootPolicy) BootPolicy: Local policy signing
kernel: (AppleVPBootPolicy) BootPolicy: AP hacktivation signing
syspolicyd: (libbootpolicy.dylib) [com.apple.BootPolicy:Library] BootPolicy: SEP command 23 (v7) returned 0
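The two excerpts differ in the dumpVerificationStatus line: "*NOT* SATISFIED" before the failing command 33, "SATISFIED" before the successful command 23. A rough Python sketch for tabulating those outcomes from saved log text follows. The regular expression is fitted only to the two lines quoted above, and it makes no attempt to interpret what "CS[...]" or "01R -> R" mean.

```python
import re

# Fitted to lines such as:
# "ACM: dumpVerificationStatus: BootPolicy (V) on CS[100] *NOT* SATISFIED: 01R -> R."
ACM_STATUS = re.compile(
    r"dumpVerificationStatus: (?P<policy>\w+) \(\w\) on CS\[(?P<cs>\d+)\] "
    r"(?P<neg>\*NOT\* )?SATISFIED"
)

def acm_outcomes(log_text):
    """Map each CS[...] number to True (SATISFIED) or False (*NOT* SATISFIED)."""
    return {int(m["cs"]): m["neg"] is None for m in ACM_STATUS.finditer(log_text)}

# One failing and one succeeding line, copied from the logs above.
sample = """\
kernel: (AppleVPCredentialManager) ACM: dumpVerificationStatus: BootPolicy (V) on CS[100] *NOT* SATISFIED: 01R -> R.
kernel: (AppleVPCredentialManager) ACM: dumpVerificationStatus: BootPolicy (V) on CS[118] SATISFIED: 01S -> S.
"""

print(acm_outcomes(sample))
```

Combined with the process name on the neighbouring libbootpolicy lines, this gives a quick way to check whether unsatisfied verifications only ever show up around kcgend.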
For what it's worth, all the failures happen when the call to _validate_acm_context() is made (ultimately) from kcgend. Calls from syspolicyd, for example, work just fine. I noted above that I had to play a hackish trick to get kcgend to run at startup on a Virtualization framework VM. Maybe it's not working entirely correctly. I'll keep playing with it. But at this point I'm not very optimistic I'll be able to do anything about its problems.
"SEP command 23" is _command_update_local_policy_uakl() in AppleVPBootPolicy.kext.
Above I theorized that these failures are the result of a deliberate policy decision by someone at Apple. I still think that. But now I believe I focused too much on the "SEP command 33" failures themselves. They may be just accidental. Now I suspect that the crucial decision was to not allow kcgend to run in Virtualization framework VMs. I found a way to make it run. But (it seems) I haven't yet been able to make it run correctly.
My logs above came from the output (on a VM) of log show --predicate 'eventMessage contains "BootPolicy"'. A lot of the lines are missing from the output of kmutil log show. I'm discovering how much more useful log show is than the Console app's GUI. Previously I thought most of the output you see when you start streaming in the Console app is simply lost if you're not streaming. I couldn't have been more wrong.
Interestingly, I get no output from running log show --predicate 'eventMessage contains "hacktivation"' on my macOS host. I'm not entirely sure what to think about that.
I've found some potentially useful stuff. The AuxiliaryStorage image appears to be an iBoot1 bootloader. According to https://github.com/AsahiLinux/docs/wiki/SW:Boot#stage-1-llbiboot1, it reads a local boot policy from /<volume-group-uuid>/LocalPolicy/<policy-hash>.img4, which contains (among other things) a smb2 key, described as "bool: Third-party kernel extensions enabled".
According to that wiki page, it should be in "metadata keys", but although running strings on that file (in the VM) does show some of those keys (such as vuid and lobo), I can't find the other keys (like smb* or sip*).
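For anyone who wants to repeat that check without eyeballing strings output, here is a minimal Python sketch that extracts printable runs the way strings does and reports which four-character keys appear. The sample bytes are synthetic, not a real policy file, and a key that is stored in some other encoding would be missed by this kind of scan.

```python
import re

def ascii_strings(data, min_len=4):
    """Yield printable-ASCII runs of at least min_len bytes, like strings(1)."""
    for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        yield m.group().decode("ascii")

def find_keys(data, keys):
    """Report, per key, whether it occurs inside any extracted string."""
    found = set()
    for s in ascii_strings(data):
        found.update(k for k in keys if k in s)
    return {k: (k in found) for k in keys}

# Synthetic stand-in for a policy blob: two keys embedded in binary filler.
sample = b"\x30\x82\x01vuid\x04\x10" + bytes(16) + b"\x16\x04lobo\x01\x01\xff"

print(find_keys(sample, ["vuid", "lobo", "smb2", "sip0"]))
```

On a real file, replace the sample with the bytes read from disk (for example with pathlib.Path(...).read_bytes()).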
I'm thinking of asking how to get these metadata keys on the Asahi IRC, but I'm not sure which channel to use (as they don't have an offtopic channel). The closest I can think of is "#asahi-alt - For discussion and support for unofficial third-party distro ports".
Yes, @ktprograms, it'd be interesting to know the contents of the local boot policy file in a VM that's been set up to support loading third party kexts (while booted into the Recovery Partition). I'd bet the "bool: Third-party kernel extensions enabled" key is set to true, and that it makes no difference -- that "SEP command 33" still fails.
I've got some more ideas about how to make kcgend do what it should in a VM. I'll report back in the next few days whether or not they worked out.
They're based on two observations, one of which is a new discovery:
1. /System/Library/LaunchDaemons/com.apple.kcgend.plist has the following two keys. As best I can tell (from xnu kernel source code), osenvironment is stored in the "device tree" under /chosen. I want to figure out how to read the "device tree", and how to manipulate this property in it.
<key>RunAtLoad</key>
<true/>
<key>LimitLoadToHardware</key>
<dict>
    <key>osenvironment</key>
    <array>
        <string>kcgen</string>
    </array>
</dict>
2. AppleARMWatchdogTimer.kext loads on bare metal, but not in a VM (even sudo kmutil load -p /System/Library/Extensions/AppleARMWatchdogTimer.kext won't make it load). It's used to create a "watchdog monitor" on bare metal, which logs messages with the format kcgend: Watchdog ... (these appear in the output of kmutil log show). This doesn't happen in a VM. The messages from kernelmanagerd and kcgend are grouped differently (in kmutil log show output) on bare metal and in VMs. I wonder if the watchdog monitor is used to coordinate the actions of kernelmanagerd and kcgend. The problems with kcgend on VMs, where this monitor doesn't operate, may be due to these two daemons stomping on each other's work. With luck, this problem will be solved by figuring out how to manipulate the "device tree".
The page at https://github.com/AsahiLinux/docs/wiki/SW:Boot is very interesting. I notice that it mentions (and at least partially defines) "AP" and "kcOS" -- both of which show up in my research (in logs and in kexts' machine code).
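As a rough illustration of what the LimitLoadToHardware keys quoted above express, here is a Python sketch using the standard plistlib module. The fragment is wrapped in a minimal plist so it parses; the real com.apple.kcgend.plist has more keys. The may_load() helper is hypothetical and only approximates how launchd is documented to compare each listed key against the matching hw.* sysctl value.

```python
import plistlib

# The two keys quoted above, wrapped in a minimal plist so plistlib can parse
# them. This is not the complete launchd job definition.
FRAGMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>RunAtLoad</key>
    <true/>
    <key>LimitLoadToHardware</key>
    <dict>
        <key>osenvironment</key>
        <array>
            <string>kcgen</string>
        </array>
    </dict>
</dict>
</plist>
"""

def allowed_values(plist_bytes, prop="osenvironment"):
    """Return the values listed for `prop` under LimitLoadToHardware."""
    job = plistlib.loads(plist_bytes)
    return job.get("LimitLoadToHardware", {}).get(prop, [])

def may_load(plist_bytes, current_value, prop="osenvironment"):
    """Approximation: load if there is no restriction or the value is listed."""
    allowed = allowed_values(plist_bytes, prop)
    return not allowed or current_value in allowed

print(allowed_values(FRAGMENT))
print(may_load(FRAGMENT, "kcgen"))
print(may_load(FRAGMENT, ""))
```

The current_value argument stands in for whatever sysctl hw.osenvironment reports at the time the job is considered for loading.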
I want to figure out how to read the "device tree", and how to manipulate this property in it.
Anyone know of a utility that can do this? It will probably need explicit Apple Silicon support.
A quick look on the web tells me I may need to write my own -- possibly as a "driver extension" (https://developer.apple.com/documentation/kernel/implementing_drivers_system_extensions_and_kexts).
Edit: I already tried sudo nvram boot-args="osenvironment=kcgen". That didn't work. It had no effect on kcgend, and didn't change the value of hw.osenvironment (as seen by the output of sysctl hw.osenvironment). And no, sudo sysctl can't be used to change the value of hw.osenvironment.
Edit: sudo nvram osenvironment="kcgen" doesn't work, either :-(
Edit: I also tried sudo nvram "chosen/osenvironment"="kcgen". That didn't work, either.
Edit: The ioreg utility can read the device tree -- though maybe not all of it. But I haven't yet found anything that can write to it. Actually, you need to use ioreg -p IODeviceTree to display the device tree, including its chosen branch.
Edit: The source code for Apple's ioreg utility is available -- https://github.com/apple-oss-distributions/IOKitTools/tree/IOKitTools-122. With luck, I can use that to figure out how to write the device tree, without having to mess with writing a driver extension. But it will still take me a while -- possibly a couple of weeks. In which case I probably won't be able to continue my debugging here until sometime next year.
WRT reading the local boot policy: I think bputil can already do this. For example, here's the output of sudo bputil -d on one of my VMs. Notice the value of "3rd Party Kexts Status":
Current local policy:
OS environment:
OS Type : macOS (overriden)
OS Pairing Status : Paired
Local Policy Nonce Hash (lpnh): 5FC3F6F1403AAEB8EB7694D3F091C82588F2AF5621CC431A4681623BE97BB7FF30AF7E8C5E98CDD1E445F35EB3775139
Remote Policy Nonce Hash (rpnh): 9C1D68C5C7F10D6E57888625E86AD4DF2D33AE4EE91DBE2112B976D90AA34A474A22102D83C5C26F162D8D1E034B5F11
Recovery OS Policy Nonce Hash (ronh): 54F5741CEBFC75CB7B01C02968C712A12328B907F470C85DC88D92305542E557264B12BCD72D9BBA148C3298FE10DB3D
Local policy:
Pairing Integrity : Valid
Signature Type : Other
Unique Chip ID (ECID): 0x4D2E2033F0060E0B
Board ID (BORD): 0x20
Chip ID (CHIP): 0xFE00
Certificate Epoch (CEPO): 0x1
Security Domain (SDOM): 0x1
Production Status (CPRO): 1
Security Mode (CSEC): 1
OS Version (love): 21.7.217.0.0,0
Volume Group UUID (vuid): BE2ACA7D-8B5E-4E89-88F2-EE05D5354D88
KEK Group UUID (kuid): 00000000-0000-0000-0000-000000000000
Local Policy Nonce Hash (lpnh): 5FC3F6F1403AAEB8EB7694D3F091C82588F2AF5621CC431A4681623BE97BB7FF30AF7E8C5E98CDD1E445F35EB3775139
Remote Policy Nonce Hash (rpnh): 9C1D68C5C7F10D6E57888625E86AD4DF2D33AE4EE91DBE2112B976D90AA34A474A22102D83C5C26F162D8D1E034B5F11
Next Stage Image4 Hash (nsih): 22954A85F4FDE9B179529AC61EA78C31F00B0E22DF9BE92235BBD80F548B7752FBA4C98D9B31DE9081EAC8EC5DEDCF2A
User Authorized Kext List Hash (auxp): 0E3CE57A947E28CAADAAF29B9788D4740558B145F696AC594FB17F02CF284EF74B314AFA93A01DECBF7900373C2AC626
Auxiliary Kernel Cache Image4 Hash (auxi): absent
Kext Receipt Hash (auxr): absent
CustomKC or fuOS Image4 Hash (coih): absent
Security Mode: Permissive (smb0 && smb1): 1
3rd Party Kexts Status: Enabled (smb2): 1
User-allowed MDM Control: Enabled (smb3): 1
DEP-allowed MDM Control: Disabled (smb4): absent
SIP Status: Customized (sip0): 7f
Signed System Volume Status: Disabled (sip1): 1
Kernel CTRR Status: Disabled (sip2): 1
Boot Args Filtering Status: Disabled (sip3): 1
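To compare this kind of output between a VM and bare metal, it can help to reduce it to a dictionary keyed by the four-character tags. The Python sketch below does that for lines shaped like the ones above. The regular expression is fitted only to this sample, it skips compound lines such as "(smb0 && smb1)", and the output format of bputil may differ between macOS versions.

```python
import re

# Matches "<label> (<4-char tag>): <value>" lines from the output above.
FIELD = re.compile(
    r"^[ \t]*(?P<label>.+?)[ \t]*\((?P<tag>[A-Za-z0-9]{4})\):[ \t]*(?P<value>.*?)[ \t]*$",
    re.MULTILINE,
)

def policy_fields(text):
    """Return {tag: value} for every single-tag line in the policy dump."""
    return {m["tag"]: m["value"] for m in FIELD.finditer(text)}

# A few lines copied from the output above.
sample = """\
Auxiliary Kernel Cache Image4 Hash (auxi): absent
Kext Receipt Hash (auxr): absent
Security Mode: Permissive (smb0 && smb1): 1
3rd Party Kexts Status: Enabled (smb2): 1
SIP Status: Customized (sip0): 7f
"""

fields = policy_fields(sample)
print(fields)
print(fields["smb2"] == "1" and fields["auxi"] == "absent")
```

The last line checks for the combination seen in this VM: third-party kexts are enabled in the policy (smb2 is 1), yet no auxiliary kernel cache hash (auxi) is recorded.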
Apple actually has pretty decent documentation on the local policy file:
Here's a progress report ... or better yet a lack-of-progress report.
I pretty quickly discovered that the osenvironment stuff is a red herring: I found "kernelmanagerd: kcgen activation settings: [kcgen enabled]" messages on both the host and my VMs. This tells me osenvironment is already being set correctly (at the appropriate time during the boot process) in Virtualization framework VMs.
Then I tried to force AppleARMWatchdogTimer.kext to load in my VMs. As I mentioned above, you can't just load it using kmutil load. In fact the boot kext cache seems to be immutable. You can only add kexts, even Apple-signed ones, to the auxiliary kext cache. But of course that's currently not possible in Virtualization framework VMs.
In principle you should be able to create a custom kext cache (using kmutil create) and load it (in recovery mode) using kmutil configure-boot. The following two links show you how to do this:
https://kernelshaman.blogspot.com/2021/02/building-xnu-for-macos-112-intel-apple.html
https://github.com/AsahiLinux/macvdmtool/blob/main/README.md
But I couldn't get it to work. I tried adding AppleARMWatchdogTimer.kext to my custom kext cache, creating one with no additional kexts, and even loading the immutable kext cache (from Apple) as a custom kext cache. All failed in more or less the same way: I got a bunch of errors from AppleVPKeyStore.kext, then "kernelmanagerd: Kernel requested shutdown. Goodbye!"
I suspect Apple's kernel.release.vmapple (the kernel used in Virtualization framework VMs) just doesn't like what I'm trying to do. I could try building a custom kernel, following kernelshaman's instructions. But that would be very time consuming, and might just lead me down another rabbit hole.
I'm giving up on this, at least for the time being.
Since there was "Ask Apple" this week, I did. Quinn confirmed it's not going to work yet, and I'd consider that definitive.
I could try building a custom kernel, following kernelshaman's instructions.
I did this, and managed to get it to work. But yes, it took me down another rabbit hole. I did discover one useful bit of information, though: The failures have nothing to do with the kernel itself (custom or otherwise). In fact the kernel never runs at all when you try to use kmutil configure-boot to load a custom kext cache. I found this out by putting an infinite loop in the custom kernel's code, very early in the boot process. It never gets hit.
This may be some kind of permissions issue (though I loosened them as much as possible). But I think it's more likely that code somewhere in kernelmanagerd simply stops it from working.
Could you test if the loop gets hit on physical hardware?
It does get hit on physical hardware.
To play it safer, I changed the loop to a panic() call. Then my machine kept rebooting until I pressed the power button long enough to turn it off. I didn't see the normal panic screen -- probably the kernel hadn't yet loaded support for it. To restore my machine to its previous state, I booted into recovery mode and ran bputil -f. This deactivates the custom kernel and turns on full security.
So kernelmanagerd (or something called via kernelmanagerd) only prevents booting from a custom kernel in a Virtualization framework virtual machine. I'll dig into this further and try to find out exactly how.
I didn't do a full test of my custom kernel (and its custom kext cache) on physical hardware. I don't want to risk harming my machine. The logical place to do this kind of test is in a virtual machine. Clearly Apple has gone to some trouble to prevent tests of custom kernels and third party kernel extensions from working in a Virtualization framework VM. It's rather hard to understand why :-(
Thanks for testing that. I wonder if the VM's bootloader is somehow hardcoding "allowed" kernels. Do you think trying to boot a kernel with a different version that was used in another VM might work? (You could check with uname -a, I think.)
Also, is kernelmanagerd used for creating bootable kernels, or does it do stuff at boot time? If it only creates the kernels, then as long as the kernel is "valid", I think it would be a bootloader problem.
I doubt that kernelmanagerd hardcodes allowed kernels, though kmutil create (which works through kernelmanagerd) does insist (on macOS 13) that you have a KDK installed that matches the currently running kernel. If it does hardcode them, it shouldn't be hard to find the code that implements this restriction.
kernelmanagerd doesn't create kernels. It works with whatever kernel is "installed" -- the one that the boot kext collection is linked to. And as best I can tell it also doesn't create boot kext collections. Apple makes those in some other way, then puts them in the "right" location (which you can see by running kmutil inspect). Once installed in this way, a boot kext collection is immutable, at least by the "normal" activity of macOS. What can be changed is the auxiliary kext collection. But, as we've discovered, this doesn't work in a Virtualization framework VM. And (on physical hardware) the boot kext collection can also be (temporarily) replaced by a custom kext collection.
From kernelmanagerd's man page and from what I've been able to discover, kernelmanagerd manages the kernel-specific part of rebooting -- from making requested (staged) changes to the auxiliary kext collection to triggering a kernel's reloading. As we've seen, it can abort this process and log the (misleading) error message "kernelmanagerd: Kernel requested shutdown". Then it gives you the choice of trying again or booting into the recovery partition.
Over the next few days (and possibly weeks), I'm going to dig into kernelmanagerd and its friends, to find out as much as I can about why they behave differently on physical hardware and in a Virtualization framework VM.
I'm still working on this, but very slowly. I've now ruled out kernelmanagerd from involvement in the problem of custom kernels not working in Virtualization framework macOS guest VMs -- just as I previously ruled out the custom kernel itself. I re-ran my test of loading my custom kernel, this time timing it carefully so I'd know exactly where to look in the logs displayed (on the macOS guest VM) by log show. I found nothing at all -- not a single line.
On a hunch that interesting logs might be found on the host (instead of the guest), I looked there too, for entries matching --predicate='process == "com.apple.Virtualization.VirtualMachine"'. There were some log entries, but (as best I can tell) all related to the Virtualization process's interaction with the host machine. There wasn't anything at all about its interaction with the guest.
My next direction will be using HookCase to debug the Virtualization framework on Intel Macs. It can't virtualize macOS, but it can virtualize Linux and Windows. The work is likely to be very involved, and will probably take a long time.
I've already spent a lot of time using HookCase to debug kernelmanagerd on Intel Macs. I learned quite a lot. But this, too, turned out to be a red herring. Among other things, I found that the "kernelmanagerd: Kernel requested shutdown. Goodbye!" log message is displayed every time you shut down or restart macOS. This really is a request from the kernel (kernelmanagerd periodically uses kext_request() to fetch all current "kext requests" from the kernel). But, of course, the kernel makes that "request" in response to a signal from userland (I haven't yet worked out exactly what kind).
Exactly the same nonsense still happens on the first macOS 14 beta (build 23A5257q). In a macOS 14 guest (running on a macOS 14 host), after using kmutil load -p /path/to/extension, you keep getting the same "System Extension Error" every time you reboot and log in. Allowing the extension to load in the Security & Privacy system preferences panel makes no difference.
And once again, the same kernel extension loads just fine on the host.
I tested using UTM 4.2.5 (the current version).
I've made some progress reverse engineering the com.apple.Virtualization.VirtualMachine process on an Intel box running macOS 13. The VM runs Linux. But (so far at least) none of what I've learned is relevant here.
This is a problem with the SEP (or, more exactly, with its software replacement made specifically for the .vmapple image, because there is no real SEP). The kernel manager and the boot policy app try to query some sort of permission to build a manifest for the auxiliary kernel cache (AuxKC) using a special SEP command. That query fails with an error, which makes the AuxKC build fail, so any kernel extensions you have approved via "Privacy & Security" are not incorporated into the auxiliary kernel cache when the kcgend daemon runs during reboot. I was able to circumvent this by patching the check. After that the manifest was created and kcgend generated the AuxKC properly, but iBoot then rejected the manifest's signature, so a second patch was needed. Sadly, the patching method requires a lot of effort, starting at AVPBooter and ending with the kernel and some system apps. I'm not sure whether it can be done without patching, and I'd like to know if there is a way.
Very interesting, @flatz, that you were able to patch a bunch of system files. How did you do it? I assume you had to lower the security settings, but which values did you pick?
@steven-michaud Unfortunately you can't just lower security settings to do that. As I said earlier, I started from the Root of Trust, patching the ROM image (the AVPBooter binary from Virtualization.framework) to skip signature checks of iBSS/iBEC, ramdisk, kernel, etc.
So it looks like the only file you patched is AVPBooter.vmapple2.bin in the Virtualization framework.
What do you mean by "patch"?
Did you change the file on disk? If so, how did you manage that? Did you boot into Recovery mode and change it there? Did you also need to re-codesign the Virtualization framework? Did you need to set Permissive Security in Recovery mode's Startup Security Utility?
Or did you change the file's image after it was loaded into memory? If so, how did you do that?
I've learned on my own how to change macOS system files: How to Defang macOS System Protections.
I made a trivial change to AVPBooter.vmapple2.bin (I messed up the spelling of one word in one of its error messages). Then I worked through the "Defang" document up to the end of its "Unlock the Boot Volume" section. I didn't need to make any of the suggested further modifications under "Running Modified Code". UTM worked just fine. I tested with the first macOS 14 beta as both host and guest.
I'll be playing with AVPBooter.vmapple2.bin to see what I can find out.
Thanks, @flatz, for pointing out that it's an interesting target for investigation.
I'm glad to see that you got the point. But keep in mind that patching AVPBooter alone is not enough, because it's just a single piece of the boot process. You need to patch iBoot too (which includes iBSS/iBEC and their duplicates in LLB/iBoot) to be able to make patches to the kernel, etc.
Your end goal is patching the ACM check in AppleVPBootPolicy.kext (which itself is embedded in the kernelcache). This check (see the function called __validate_acm_context) prevents generation of the manifest for the AuxKC.
I'm going to take a different approach. My theory (see above) is that the _validate_acm_context() failures are accidental -- that they're due to problems with timing. So I'm going to concentrate on a second problem -- that custom kernel caches don't work in Virtualization framework VMs. With luck it'll be easier to resolve than your brute-force approach to the AuxKC problem. And if I do resolve it, I should be able to learn more about the AuxKC problem, and possibly find a better workaround for it.
Ultimately only Apple can really fix the AuxKC problem. But so far they don't seem the least bit interested in doing so.
I've figured out how to use third-party kexts on Virtualization framework macOS clients! Please check it out.
Above I talked about custom kext caches not working in macOS guest VMs. My workaround for third-party kexts doesn't directly address this problem. But if Stage 2 iBoot (iBoot.img4) no longer checks the kernel cache's "digest", you can substitute any kernel cache you want, including a custom kernel cache built using kmutil create. Some will make your VM unbootable, or have other undesirable effects. But you can make whatever changes you please to the kernel cache.
I used kmutil create to create a custom kext cache modeled exactly on the original (with exactly the same kernel extensions). That worked fine, not surprisingly. Then I tried adding AppleARMWatchdogTimer.kext, to test out my theory that the _validate_acm_context() failures are caused by timing problems. It made my VM unbootable :-( This doesn't disprove my theory. But it does show that Apple had a reason not to include AppleARMWatchdogTimer.kext in its VMs' kernel cache.
Edit: I found a way to add AppleARMWatchdogTimer.kext to a guest VM's kernel cache. That doesn't (by itself) allow you to load third party kernel extensions. But, though I can't prove it, I remain convinced by my theory that the _validate_acm_context() failures are caused by timing problems. And I suspect Apple won't be able to fix these problems, and support loading third party kexts, without emulating wdt hardware in their Virtualization framework.
Apple still hasn't fixed this bug, but I've discovered a sign they might be working on it.
Is anyone here at Apple's WWDC? If so, could you ask about this issue? It'd be interesting to hear some kind of official response. The WWDC started yesterday (2024-06-10) and runs through Friday (2024-06-14).
I haven't yet tested with macOS 15 (Sequoia), but should be able to in the next few days.
I ran a Mac mini with Sequoia as the host and the latest UTM with a Sonoma VM, and that does not work. I just need to update the VM to Sequoia as well.
Sadly, both host and VM are Sequoia, and the same error pops up when rebooting after Approval.
My Running Third Party Kernel Extensions on Virtualization Framework macOS Guest VMs document works as-is on a macOS 15 guest running on a macOS 15 host! With the alterations it describes, the guest can load third party kernel extensions.
Describe the issue
3rd party kexts, like macFUSE or the kext I am working on personally, won't load. After reboot I get a pop-up message that says "System Extension Error", "An error occurred with your system extensions during startup and they need to be rebuilt before they can be used. Please go to Security & Privacy System Preferences to re-enable them."
If I go to Preferences and do as it asks, after reboot the same message appears.
SIP is disabled, developer mode is on, etc.
The same kexts work fine on bare metal.
Configuration
Logs
Output of log show --debug | grep -i macfuse: