steven-michaud / SandboxMirror

Tool for reverse-engineering Apple's sandbox
MIT License

System Crashes on El Capitan and Latest Sierra Beta #1

Closed hafta closed 7 years ago

hafta commented 7 years ago

Hey Steven, I gave this a try on 10.11 and the latest 10.12 beta and built the daemon and kext with Xcode, but ran into OS panics when trying to do some tracing.

The OS X crash log doesn't tell me much apart from xnu-3248.60.11/osfmk/i386/trap_native.c:168 and that SandboxMirror was in the kernel stack trace.

I've never built an OS X kernel extension before so I may have made a mistake with that. (Appreciate the thorough instructions.)

One question: if I don't copy org.smichaud.sandboxmirrord.plist to /Library/LaunchDaemons, do I have to start sandboxmirrord manually?

* Panic Report * (from 10.11)

    panic(cpu 2 caller 0xffffff8015bcf1ba): "Double fault at 0xffffff8015ac26fc, registers:\n"
    "CR0: 0x0000000080010033, CR2: 0xffffff8200cc7ff8, CR3: 0x00000002387c00be, CR4: 0x00000000001627e0\n"
    "RAX: 0x0000000000000000, RBX: 0xffffff7f99679f90, RCX: 0x0000000000000000, RDX: 0xffffff8200cc8040\n"
    "RSP: 0xffffff8200cc8000, RBP: 0xffffff8200cc8020, RSI: 0x000000000000014c, RDI: 0xffffff8200cc80f0\n"
    "R8: 0x0000000000007d93, R9: 0xffffff8200ccae30, R10: 0xffffff8200ccaa30, R11: 0x055ac2ee24790028\n"
    "R12: 0x000000000000002c, R13: 0x0000000000000000, R14: 0xffffff8200cc80f0, R15: 0xffffff80378eddb0\n"
    "RFL: 0x0000000000010286, RIP: 0xffffff8015ac26fc, CS: 0x0000000000000008, SS: 0x0000000000000010\n"
    "Error code: 0x0000000000000000\n"@/Library/Caches/com.apple.xbs/Sources/xnu/xnu-3248.60.11/osfmk/i386/trap_native.c:168
    Backtrace (CPU 2), Frame : Return Address
    0xffffff81f61ace90 : 0xffffff8015adab52
    0xffffff81f61acf10 : 0xffffff8015bcf1ba
    0xffffff81f61ad070 : 0xffffff8015bece1f
    0xffffff8200cc8020 : 0xffffff8015adf839
    0xffffff8200cc8070 : 0xffffff7f9967035a
    0xffffff8200cc9160 : 0xffffff7f99670572
    0xffffff8200cc91d0 : 0xffffff7f9967a0f9
    0xffffff8200ccb280 : 0xffffff80161187f3
    0xffffff8200ccb2c0 : 0xffffff8015cd9956
    0xffffff8200ccb640 : 0xffffff8015cdb242
    0xffffff8200ccbd10 : 0xffffff8015cdd51b
    0xffffff8200ccbf10 : 0xffffff8015cd6887
    0xffffff8200ccbf60 : 0xffffff8016028701
    0xffffff8200ccbfb0 : 0xffffff8015becd66
    Kernel Extensions in backtrace:
        org.smichaud.SandboxMirror(1.0)[8A2E6427-A6C3-3346-BF4F-B2F9D7183B3F]@0xffffff7f9966d000->0xffffff7f9968bfff

BSD process name corresponding to current thread: firefox

steven-michaud commented 7 years ago

You got kernel panics on both OS X 10.11 and macOS 10.12? Was it on the same hardware (the same computer)? If so, tell me as much as you can about that hardware.

I agree that kernel panic reports aren't much help. But SandboxMirror does seem at fault here.

Building should be as simple as doing "xcodebuild" from the commandline for both SandboxMirror.kext and sandboxmirrord. Did you get any errors while building? What version of Xcode did you use?

As best I can tell, there isn't any way to load sandboxmirrord by hand. You must copy org.smichaud.sandboxmirrord.plist to /Library/LaunchDaemons. But running SandboxMirror.kext without sandboxmirrord shouldn't cause kernel panics. You just won't get any logs.

steven-michaud commented 7 years ago

Another question: Do you get a kernel panic every time you try to log something, or only with some kinds of attempts to log stuff? If only with some attempts, please give examples of both -- invocations that reliably trigger kernel panics and invocations that reliably don't.

hafta commented 7 years ago

You got kernel panics on both OS X 10.11 and macOS 10.12? Was it on the same hardware (the same computer)? If so, tell me as much as you can about that hardware.

Two different laptops. 1 MacBook Pro and 1 MacBook Air.

Building should be as simple as doing "xcodebuild" from the commandline for both SandboxMirror.kext and sandboxmirrord. Did you get any errors while building? What version of Xcode did you use?

Xcode Version 8.0 (8A218a) on El Capitan. I've since upgraded the Sierra Beta machine to Sierra and am not sure which version of Xcode the beta used. I'll retry on release Sierra. I think I tried xcodebuild, but I had to change my Apple ID in the Xcode settings in order to get a successful build/sign. Once I was in Xcode, I just used the IDE to build each project.

I tried doing some traces of Calculator.app (following the instructions) and didn't see any output, and then I tried again and hit the crash. I had tried with/without copying the .plist to /Library/LaunchDaemons. I haven't got any trace output to work yet. (There's probably something silly I'm doing wrong.) Thanks!

hafta commented 7 years ago

please give examples of both -- invocations that reliably trigger kernel panics and invocations that reliably don't.

OK, I'll retry and get back to you with the exact command I used.

steven-michaud commented 7 years ago

I just built SandboxMirror.kext and sandboxmirrord on a MacBook Pro (Retina, 15-inch, Mid 2015) running OS X 10.11.6, using the latest Xcode release (8.0 8A218a). I had no problems.

I only used xcodebuild from the command line, then copied the binaries into place as follows:

    sudo cp -R SandboxMirror.kext /usr/local/sbin
    sudo cp sandboxmirrord /usr/local/sbin

Then I copied org.smichaud.sandboxmirrord.plist to /Library/LaunchDaemons and restarted the computer. (You need to do either that or "sudo launchctl load org.smichaud.sandboxmirrord.plist".)

Then, after restarting, I loaded SandboxMirror.kext manually:

sudo kextutil /usr/local/sbin/SandboxMirror.kext

Then I ran the following from the command line:

SM_TRACE=mach-lookup* SM_DOSTACK=1 /Applications/Calculator.app/Contents/MacOS/Calculator

Several sandboxmirrord entries showed up in the Console, as expected.

steven-michaud commented 7 years ago

It might be relevant that (so far) all my tests have been on versions of OS X with only en-US installed in the "Language & Region" pref panel.

steven-michaud commented 7 years ago

Another factor that might be relevant: All my tests have been on clean installs of OS X or macOS, not upgrade installs. (That's just my personal habit -- I think it makes life easier.)

steven-michaud commented 7 years ago

Something to watch out for, if you're testing on macOS (10.12):

To see anything in the Console, it must be running before you try to make SandboxMirror log something. Type "sandbox" in the Console's "Search" box to filter out lots of extraneous entries.

steven-michaud commented 7 years ago

Yet another thing to look for:

org.smichaud.sandboxmirrord.plist has the following section:

    <key>MachServices</key>
    <dict>
      <key>org.smichaud.sandboxmirrord</key>
      <dict>
        <key>HostSpecialPort</key>
          <integer>16</integer>
      </dict>
    </dict>

Look for any other plist file, in either /Library/LaunchDaemons or /System/Library/LaunchDaemons, with a HostSpecialPort key also set to '16'. If you find one, that's likely to be the source of your trouble.
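
One way to run that check is sketched below. It assumes macOS's plutil is available to convert binary plists to XML. The helper name uses_host_special_port is made up for this illustration, and the match is a plain text search rather than a real plist parse.

```shell
#!/bin/sh
# Sketch only: report LaunchDaemons plists that set HostSpecialPort to a given value.
# uses_host_special_port is a hypothetical helper, not part of SandboxMirror.

# Reads XML plist text on stdin; succeeds if HostSpecialPort is set to port $1.
uses_host_special_port() {
    # Strip whitespace so the key and its integer value sit next to each other.
    tr -d ' \t\n' | grep -q "<key>HostSpecialPort</key><integer>$1</integer>"
}

for f in /Library/LaunchDaemons/*.plist /System/Library/LaunchDaemons/*.plist; do
    [ -f "$f" ] || continue
    # plutil -convert xml1 -o - prints an XML copy without modifying the file.
    if plutil -convert xml1 -o - "$f" 2>/dev/null | uses_host_special_port 16; then
        echo "HostSpecialPort 16 is claimed by: $f"
    fi
done
```

If org.smichaud.sandboxmirrord.plist is already installed, it will show up in the output itself. Any other file listed is the potential conflict.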

steven-michaud commented 7 years ago

I think I may have stumbled on your problem while trying to use sandboxmirrord from a different kernel extension (a new one that I'm currently working on). Everything works fine using sandboxmirrord from SandboxMirror.kext. But I get kernel panics every time I make my new kernel extension try to communicate with sandboxmirrord.

It will take me a while to get to the bottom of this. But I suspect that using SandboxMirror.kext and sandboxmirrord together somehow makes the OS tie them together, and confuses it when you try to use some other kernel extension with sandboxmirrord.

I also suspect that you have multiple copies of SandboxMirror.kext in system directories, possibly including /Library/Extensions. You don't want it there. But just removing it (and rebooting) probably won't be enough to avoid the kernel panics.

I need to figure out how the OS generates this tie, and how to remove it.

steven-michaud commented 7 years ago

Oops, the tie-in theory turns out to be wrong: Even if SandboxMirror.kext has never been installed on a computer, I still get kernel panics trying to make my new kernel extension talk to sandboxmirrord.

But I've found a workaround: Add the following section to your org.smichaud.sandboxmirrord.plist:

    <key>RunAtLoad</key>
    <true/>

It seems the kernel panics don't happen if sandboxmirrord is already running.

I don't know why. And this is almost certainly an Apple bug. But I'm not sure I'll ever understand it fully.

Please let me know your results with this workaround. Before you try it I suggest starting again from scratch, after removing all existing traces of SandboxMirror. Use xcodebuild from the command line, then use the following commands to copy the binaries into place:

    sudo cp -R SandboxMirror.kext /usr/local/sbin
    sudo cp sandboxmirrord /usr/local/sbin

steven-michaud commented 7 years ago

I've just released a new version (1.0.1) that contains my workaround. Please try it out and let us know your results.

hafta commented 7 years ago

With the latest version I'm hitting kernel panics as soon as I load the extension with the command below. As before, there's no useful information in the kernel crash report.

sudo kextutil /usr/local/sbin/SandboxMirror.kext

The steps I took were 1) remove all traces, 2) download and rebuild with Xcode, 3) copy extension and daemon to /usr/local/sbin and copy the .plist file to /Library/LaunchDaemons, 4) reboot, 5) run the kextutil command above.

When I load up the project in Xcode, it asks me to update the project for the latest version. I'm wondering if this is changing something in the project settings to cause the problem. I'll retry without making those changes.

I've only tested this on Sierra (the official release). I checked for other services in /System/Library/LaunchDaemons using HostSpecialPort=16, but don't see any. I'm poking around with launchctl to see if there is a more exhaustive way to check because there are services with close port numbers such as 18.

[I'm on irc.mozilla.org as haik if you'd like to chat in real time about it.]

steven-michaud commented 7 years ago

Try not updating the project. With luck that'll do the trick. But better still, try redownloading SandboxMirror and building both projects from the command line (using xcodebuild), without ever having loaded either of them into the Xcode GUI.

I don't see problems with anything else you've done.

Interestingly, I've loaded both projects (SandboxMirror.kext and sandboxmirrord) into Xcode 8.0 (8A218a) on Sierra numerous times, and I've never been prompted to upgrade either of them. I've also had no problems building them in the Xcode GUI, archiving them, exporting them, and then copying them into place (in /usr/local/sbin) using 'cp' at the command line. But I still want you to build using 'xcodebuild' on the command line, because that's simpler (and less likely to cause trouble). I eventually want to find what triggers the problems you've been having, but first I want you to make successful builds :-)

I've found another way to check if the "CHUD port" (HostSpecialPort=16) is already in use. First delete any SandboxMirror plist files you may have copied to /Library/LaunchDaemons and reboot. Then run the following command at the command line:

sudo launchctl hostinfo

See if it lists a "chud port". It shouldn't.

steven-michaud commented 7 years ago

Note that to run xcodebuild from the command line, you'll need to have installed the "Command Line Tools (macOS 10.12) for Xcode 8.0":

http://adcdownload.apple.com/Developer_Tools/Command_Line_Tools_macOS_10.12_for_Xcode_8/Command_Line_Tools_macOS_10.12_for_Xcode_8.dmg

steven-michaud commented 7 years ago

How have you been downloading the SandboxMirror distro? For the purposes of building and testing, I've just been clicking on the "Clone or download" button and choosing "Download ZIP".

If that isn't what you've been doing, please try it and see if it makes any difference.

hafta commented 7 years ago

I have the CLI tools installed and xcodebuild does work for me, but only after I update the project settings to use my Apple developer ID to sign the builds. Otherwise xcodebuild fails to build both projects due to a certificate signing error.

For the latest version I used the "Download ZIP" method, but earlier I cloned the github repo.

steven-michaud commented 7 years ago

Sigh, I'd forgotten you don't have a "Developer ID".

Try editing the project.pbxproj files directly and changing every instance of

CODE_SIGN_IDENTITY = "Developer ID";

to

CODE_SIGN_IDENTITY = "";

That will cause the builds to be unsigned (which works fine in my tests).
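
The same edit can be scripted. The following is a minimal sketch, assuming the setting appears exactly as quoted above; unsign_pbxproj is a hypothetical helper name. It writes to a temporary file and moves it into place, which avoids relying on sed -i, whose syntax differs between BSD and GNU sed.

```shell
#!/bin/sh
# Sketch only: blank out the "Developer ID" signing identity in one project.pbxproj.
# unsign_pbxproj is a hypothetical helper name; $1 is the path to the file.
unsign_pbxproj() {
    tmp="$1.tmp.$$"
    sed 's/CODE_SIGN_IDENTITY = "Developer ID";/CODE_SIGN_IDENTITY = "";/' "$1" > "$tmp" &&
        mv "$tmp" "$1"
}

# Example use from the top of an unpacked source tree:
#   find . -name project.pbxproj | while IFS= read -r f; do unsign_pbxproj "$f"; done
```

Lines that don't match the quoted setting are left untouched, so running it twice is harmless.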

steven-michaud commented 7 years ago

I also wonder if you'll need to change or remove the following line:

ORGANIZATIONNAME = "Steven Michaud";

But try my other suggestion first.

hafta commented 7 years ago

I do have a developer ID, but xcodebuild and the Xcode GUI still fail to build until I go into Xcode and change the signing settings. But I've never used my developer ID to sign something on the systems I'm testing with, so perhaps I don't have the defaults set up. Anyway, I'll try hacking the project file and see what happens.

steven-michaud commented 7 years ago

A nasty thought just occurred to me, that you might (somehow) be building the "wrong" kind of binaries. Run the "file" command on both of them:

    file sandboxmirrord
    file SandboxMirror.kext/Contents/MacOS/SandboxMirror

The results should be as follows:

    sandboxmirrord: Mach-O 64-bit executable x86_64
    SandboxMirror.kext/Contents/MacOS/SandboxMirror: Mach-O 64-bit kext bundle x86_64

Let me know if they aren't.

hafta commented 7 years ago

Still getting the kernel panic when loading the kext. I re-downloaded the files and edited the project file manually setting CODE_SIGN_IDENTITY = ""; and that let me build them cleanly with xcodebuild.

hafta commented 7 years ago

Some progress! Using remote kernel debugging, on 10.11.6, I was able to get into lldb when the panic occurs. I need to take it a step further to get a fully debuggable build of SandboxMirror, but this tells me the problem is likely to be the first arg to the strcmp() call at line 1705 in hook__mac_syscall in SandboxMirror.cpp:

  1700 int hook__mac_syscall(proc_t p, struct __mac_syscall_args *uap, int *retv)
  1701 {
  1702   int retval = ENOENT;
  1703   if (g_mac_syscall_orig) {
  1704     retval = g_mac_syscall_orig(p, uap, retv);
  1705     if (!strcmp(uap->policy, "Sandbox")) {
  1706       if ((uap->call == 0) || (uap->call == 1)) {
  1707         hook_apply_sandbox(retval);
  1708       }

lldb tells us we crashed in the kernel strcmp() due to EXC_BAD_ACCESS (code=5, address=0x85b69c06). Assuming those values are valid (code=5 looks like a catchall that doesn't say much about the trap), it appears that uap->policy is a pointer (address=0x85b69c06) to a string in the user's address space, and the string needs to be copied into the kernel with a kernel routine like copyinstr() before it can be dereferenced. See copy(9).

What I got from lldb:

(lldb) Loading 1 kext modules . done.
Process 1 stopped
* thread #4: tid = 0x0eee, 0xffffff8004f5b760 kernel.development`strcmp(s1=<unavailable>, s2=<unavailable>) + 16 at subrs.c:174, name = '0xffffff801a2a0748', queue = '0x0', stop reason = EXC_BAD_ACCESS (code=5, address=0x85b69c06)
    frame #0 kernel.development`strcmp() at subrs.c:174

(lldb) bt
* thread #4: tid = 0x0eee, 0xffffff8004f5b760 kernel.development`strcmp(s1=<unavailable>, s2=<unavailable>) + 16 at subrs.c:174, name = '0xffffff801a2a0748', queue = '0x0', stop reason = EXC_BAD_ACCESS (code=5, address=0x85b69c06)
  * frame #0 kernel.development`strcmp() at subrs.c:174
    frame #1 SandboxMirror`hook__mac_syscall() at SandboxMirror.cpp:1705
    frame #2 kernel.development`unix_syscall64() at systemcalls.c:384
    frame #3 kernel.development`hndl_unix_scall64() 
(lldb) frame variable
(const char *) s1 = <variable not available>
(const char *) s2 = <variable not available>
(unsigned int) a = <no location, value may have been optimized out>
(unsigned int) b = <no location, value may have been optimized out>

(lldb) up
SandboxMirror was compiled with optimization - stepping may behave oddly; variables may not be available.
frame #1 SandboxMirror`hook__mac_syscall() at SandboxMirror.cpp:1705
   1702   int retval = ENOENT;
   1703   if (g_mac_syscall_orig) {
   1704     retval = g_mac_syscall_orig(p, uap, retv);
-> 1705     if (!strcmp(uap->policy, "Sandbox")) {
   1706       if ((uap->call == 0) || (uap->call == 1)) {
   1707         hook_apply_sandbox(retval);
   1708       }
(lldb) frame variable
(proc_t) p = <variable not available>
(__mac_syscall_args *) uap = 0xffffff801a392000
(int *) retv = <variable not available>
(int) retval = 45
(lldb) print uap
(__mac_syscall_args *) $4 = 0xffffff801a392000
(lldb) print *uap
(__mac_syscall_args) $5 = (policy = <no value available>, call = 4, args = 123145304521528)

Getting the kernel base address to see if the uap->policy address was a valid kernel address or not. Note: I skimmed the xnu source for what looked like the kernel base address and this may not be correct.

(lldb) print kernel_map->hdr.links.start
(vm_map_offset_t) $13 = 18446743521806254080

If 18446743521806254080 is the base of the kernel, that's 0xFFFFFF7F80000000, which is far above 0x85b69c06, so uap->policy doesn't look like a kernel address.
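
As a quick sanity check of that comparison, the decimal value can be converted with printf in a shell. This is only a sketch: to_hex64 and addr_at_or_above are hypothetical helper names, and the comparison is done on fixed-width hex strings because shell arithmetic is typically signed 64-bit and would treat a value this large as negative.

```shell
#!/bin/sh
# Sketch only: compare the faulting address with the presumed kernel base.
# to_hex64 and addr_at_or_above are hypothetical helper names.

# Print an unsigned 64-bit value (decimal or 0x-prefixed) as 16 hex digits.
to_hex64() {
    printf '%016x\n' "$1"
}

# Succeed if address $1 is at or above address $2.
# Fixed-width lowercase hex sorts in numeric order under the C locale.
addr_at_or_above() {
    a=$(to_hex64 "$1")
    b=$(to_hex64 "$2")
    [ "$(printf '%s\n%s\n' "$a" "$b" | LC_ALL=C sort | tail -n 1)" = "$a" ]
}

base=18446743521806254080   # kernel_map->hdr.links.start from lldb
fault=0x85b69c06            # address reported with EXC_BAD_ACCESS

echo "base  = 0x$(to_hex64 "$base")"
echo "fault = 0x$(to_hex64 "$fault")"
if addr_at_or_above "$fault" "$base"; then
    echo "fault address is at or above the presumed kernel base"
else
    echo "fault address is below the presumed kernel base"
fi
```

The base prints as 0xffffff7f80000000, which is in the same range as the kext load address shown in the original panic report.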

hafta commented 7 years ago

With the added call to copyinstr, no more crashes and I'm seeing sandboxmirrord events in the console. So far I've just tested SM_TRACE=mach-lookup* SM_DOSTACK=1 /Applications/Calculator.app/Contents/MacOS/Calculator. Sent a pull request with the change.

steven-michaud commented 7 years ago

Thanks, Hafta! I'm glad you stuck to it until you figured it out!

I'm going to change your patch a little bit, and do some testing. Once I'm done I'll land it on the master branch. Before I do another "release", though, I think I'll also get rid of the signing. First I need to check that won't cause trouble on older versions of OS X (back to Mavericks). But if the signing is truly unnecessary, it's just a possible source of unwanted grief.

I haven't been using Apple's kernel debugging infrastructure. So far I've been coding and testing in small increments, which makes it relatively easy to figure out kernel panics by trial and error. But you've shown how useful it can be, so I'm going to start playing with it. With luck I'll be able to get it working between two VMWare virtual machines.

Needless to say (because of the possibility of kernel panics), kernel extension testing is best done in virtual machines. I write the code elsewhere, then copy it in, build it and run it.

steven-michaud commented 7 years ago

I hope and assume this is now really fixed :-)

steven-michaud commented 7 years ago

I had to land another patch to fix a mistake I made copying Hafta's patch:

https://github.com/steven-michaud/SandboxMirror/commit/1a26cca768f575e742e93f51764b83685d2e573b

steven-michaud commented 7 years ago

I've done my SandboxMirror 1.0.2 release, which includes Hafta's patch and removes the code signing settings and any mention of them in the documentation.

steven-michaud commented 7 years ago

Apple's standard procedure for kernel debugging doesn't work for targets that are VMWare virtual machines. But I was able to get it working (in both El Capitan and Sierra) using a slightly different procedure, outlined in the following excellent article:

http://ddeville.me/2015/08/using-the-vmware-fusion-gdb-stub-for-kernel-debugging-with-lldb

I'm only able to get "local" connections to work (from the VMWare Fusion host to one of its clients), using gdb-remote localhost:8864 (in lldb). So you can only have one potential target VM running at a time. (Using gdb-remote [hostname-or-ip-address]:8864 always gives me "connection refused".) But that's a small inconvenience.

So far I've only done minimal testing with this configuration. But I'll start doing more, in the hope that it will uncover bugs that (for some reason) don't trigger kernel panics on my systems.

hafta commented 7 years ago

To close the loop on this, it seems that some systems always crash when reading the user VA directly from kernel context and some never do. For systems that don't crash, reading of the uap->policy string directly from the user VA works reliably. The difference (crash vs. no crash) is attributed to an Intel CPU feature that isn't present on all Intel CPUs in Macs.

Apple's docs state that on 64-bit kernels, the kernel is mapped into the higher address ranges of user applications so that a context switch into the kernel can be more efficient. The user address space is in a lower VA range. As a result, a new cr3 doesn't have to be loaded when context switching into the kernel and some systems can reliably read user VA's directly from the kernel without any special setup. However, this shouldn't be relied on given that it's hardware and implementation dependent. The boot-arg -no_shared_cr3 can be used to get the kernel to use separate cr3's for user and the kernel and, when it's set, the kernel gets zero pages mapped to the (lower) user VA region which helps catch accidental user references from the kernel. With -no_shared_cr3, all tested systems always crash when reading a user VA from the kernel directly (i.e., without copyin* calls). The copyin* calls perform the appropriate setup, temporarily switching to the user cr3 to do the reads.

Without -no_shared_cr3, the reason some systems crash and some don't is CPU hardware differences. Some Intel CPUs include a hardware feature, Supervisor Mode Access Prevention (SMAP), which causes access exceptions when the kernel reads a user VA without explicitly setting it up first. The OS X xnu kernel supports SMAP. If the system has SMAP support, xnu enables it, resulting in panics if the kernel accidentally accesses a user VA. You can see if SMAP is present on your CPU with sysctl:

$ sysctl -a machdep.cpu.leaf7_features
machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM SMAP RDSEED ADX IPT FPU_CSDS

You can check whether SMAP is actually enabled by dumping the kernel global pmap_smap_enabled with dtrace:

$ sudo dtrace -q -n BEGIN'{printf("pmap_smap_enabled: %d",`pmap_smap_enabled);exit(0);}'
pmap_smap_enabled: 1

Tested on El Capitan and Sierra.
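
The sysctl check above can be wrapped in a small script. This is a sketch, with has_cpu_feature as a hypothetical helper; it matches whole words so that SMEP is not mistaken for SMAP. It reports only whether the CPU advertises the feature, not whether the kernel has enabled it.

```shell
#!/bin/sh
# Sketch only: test whether a feature name appears in a sysctl feature list.
# has_cpu_feature is a hypothetical helper; $1 = feature, $2 = space-separated list.
has_cpu_feature() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# sysctl -n prints only the value; the variable is empty where the key doesn't exist.
features=$(sysctl -n machdep.cpu.leaf7_features 2>/dev/null)

if has_cpu_feature SMAP "$features"; then
    echo "SMAP is advertised by this CPU"
else
    echo "SMAP is not advertised (or machdep.cpu.leaf7_features is unavailable)"
fi
```

On a machine without SMAP, booting with -no_shared_cr3 remains the way described above to provoke the same class of panic.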

steven-michaud commented 7 years ago

Thanks, Hafta, for figuring all this out!

I don't have an SMAP machine. But I also saw this bug using the -no_shared_cr3 boot-arg, and doing that seems to be a good way to test for inadvertent access from the kernel to user addresses (access that doesn't use copyin() or copyinstr()). I'm glad to report that I haven't found any more such bugs.

markcarpentier commented 7 years ago

Hafta,

Can you please tell me how I can resolve this problem because I too have had multiple crashes where my screen just freezes and my Mac shuts down.

The crash reports also give me this error:

Error code: 0x0000000000000000\n"@/Library/Caches/com.apple.xbs/Sources/xnu/xnu-3248.60.11/osfmk/i386/trap_native.c:168

I'm trying to figure out how to resolve this but I am a complete noob in programming.

steven-michaud commented 7 years ago

It sounds crass, but you shouldn't be using SandboxMirror if you're not a programmer. It's a developer's tool and will only be useful for someone who's a very advanced programmer -- advanced enough to understand assembly code and how to reverse engineer system functions.

Your not being a programmer will also make it much more difficult to figure out the problems you're having.

markcarpentier commented 7 years ago

Ok I understand. Any suggestions where I can find a solution for my problem? Please don’t tell me the mac store because they say there is nothing wrong with my computer :/

steven-michaud commented 7 years ago

The solution for you is "don't use SandboxMirror".

If you haven't in fact been using SandboxMirror, then I don't know what to say, beyond that you're in the wrong place and nobody here can help you.