intel / haxm

Intel® Hardware Accelerated Execution Manager (Intel® HAXM)
BSD 3-Clause "New" or "Revised" License
3.21k stars 871 forks source link

Loading the haxm kext breaks mounting SMB volumes in Mac OS Sierra and High Sierra #19

Open brettp opened 6 years ago

brettp commented 6 years ago

When the haxm kext is loaded, the /dev/nsmb0 device is not properly configured. There is at least one other report of this problem.

I usually install haxm through Android Studio for Android development, and it consistently breaks mounting SMB shares from Ubuntu and Windows machines.

I would be happy to provide whatever logs might help debug this problem.

raphaelning commented 6 years ago

Thanks for the report. We tried on several different Macs, but couldn't reproduce the bug. With the HAXM kext loaded and even Android Emulator running, we can see our SMB share listed in the output of smbutil view and mount it in Finder just fine. In addition, cat /dev/nsmb0 returns "Resource busy" rather than "Device not configured".

To debug the issue, we'll need more details on either the "smbutil: server connection failed: No such file or directory" or the "Device not configured" error. Honestly I don't know where to look for relevant logs. You could check the HAXM kext log right after reproducing the issue:

log show --style syslog --predicate 'eventMessage CONTAINS "hax"' --last 5m

which will print all the HAXM errors and warnings from the last 5 minutes. But there's a good chance that it'll show nothing, since the attempt to access the SMB share shouldn't result in the invocation of any HAXM API.

BTW, are you the author of this answer:

https://apple.stackexchange.com/a/261882

If so, where did you see this error

1025 failures to open smb device, syserr = No such file or directory

and how did you know it was due to a conflicting kext?

brettp commented 6 years ago

Thank you for the reply. I can reproduce this every time by loading the haxm kext and then rebooting.

As you suspected, the haxm logs don't indicate any problem with smb.

I am the author of the answer you linked to. The "1025 failures" messages appear in the syslog, but only when haxm is loaded (i.e., when smb fails):

$ log show --style syslog --predicate 'eventMessage CONTAINS "smb"' --last 5m
Filtering the log data using "eventMessage CONTAINS "smb""
Skipping info and debug messages, pass --info and/or --debug to include.
Timestamp                       (process)[PID]
2018-01-02 17:12:57.297926-0500  localhost kernel[0]: (smbfs) nsmb_dev_load: smb: cdevsw_add
2018-01-02 17:12:59.595160-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:12:59.855123-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:00.419896-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:00.542355-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:08.735364-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:08.851995-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:15.279223-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:15.292477-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:18.926559-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:13:18.937754-0500  localhost NetAuthSysAgent[743]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:14:17.306824-0500  localhost netbiosd[362]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory
2018-01-02 17:14:17.351567-0500  localhost netbiosd[362]: (SMBClient) 1025 failures to open smb device, syserr = No such file or directory

I discovered it was a conflicting kext out of desperation. I removed every non-Apple kext one at a time, rebooted, then tried to access the shares again. Finally unloading haxm let me access the shares.

I've attached 2 files with the output of dtruss -f -a -s smbutil view //brett@raspi, the first with the haxm kext loaded and the second without. I don't know how much use they will be, since it only shows smbutil unable to access the device.

with_haxm.log without_haxm.log

Let me know if I can provide any other logs to help track this down.

JayFoxRox commented 6 years ago

Might be a stupid comment as I can't find any documentation about devfs_make_node. However, given all example code I've read, I don't understand why there is an additional 0 parameter at the end. (Compare to printf style usage here)

Depending on the calling convention and how the kernel handles this, this might cause problems?

tautologistics commented 6 years ago

I just dealt with the same SMB/HAXM issue and right before that, an issue with VMWare Fusion ("unabled to connect to peer process") and osxfuse. I think the issue here is that on some machines people are hitting the OSX limit on device slots. Some VPN software, VMs (VirtualBox, VMWare Fusion, etc.), and FS (OSXFuse, Keybase FS, etc.), all take up one or more slots and when they're all taken, whatever is left out breaks.

Pretty sure this can be reproduced by installing HAXM, OSXFuse, Keybase, VMWare Fusion, and some other VPN software.

For reference:

raphaelning commented 6 years ago

Thanks, that sounds like the real cause. According to this post:

On El Capitan there were 8 free device slots. On Sierra there seem to be less.

So that's a tight limit for all third-party kexts. Is it possible for HAXM to not create any device node at all? I'm afraid we would have to come up with a very innovative idea to get rid of the global device node, /dev/HAX, which is created as soon as intelhaxm.kext is loaded, before any VM is launched. But then, HAXM also creates a few nodes per VM: that's 1 VM node (e.g. /dev/hax_vm/vm01) plus 1 or more VCPU nodes (e.g. /dev/hax_vm01/vcpu01), and I'm not sure if the subdirectory nodes (hax_vm, etc.) also consume the "device slot" quota. These VM and VCPU nodes (essentially file descriptors) don't really have to be created in /dev which is shared by and visible to the entire system; instead, they actually belong to the QEMU process (or whatever process that asked HAXM to create the VM) and should only consume the resources (i.e. file descriptors) of that process.

Maybe we should plan to get rid of the VM and VCPU device nodes first. Even that would require a significant change to both HAXM and QEMU (especially when we consider backward compatibility) as well as knowledge about relevant macOS kernel APIs (e.g. to create a special file that is private to the current process and return its fd), as @Taogle2018 used to point out:

https://issuetracker.google.com/issues/62395878#comment20

Any suggestions are welcome.

Taogle2018 commented 6 years ago

I would agree to remove VM and VCPU nodes first. But I really do not have knowledge on mac. Does it have something similar compared with linux anonymous inode?

raphaelning commented 6 years ago

Hmm, I just booted up 5 Linux guests with 8 vCPUs each on macOS High Sierra, and saw all of the 45 /dev/hax*/* device nodes created. So maybe I misunderstood the limit on "device slots"? Is that not the maximum number of files that can be created in /dev/?

feffi commented 6 years ago

same problem here:

log show --style syslog --predicate 'eventMessage CONTAINS "hax"' --last 5m
Filtering the log data using "eventMessage CONTAINS "hax""
Skipping info and debug messages, pass --info and/or --debug to include.
Timestamp                       (process)[PID]
2018-03-22 11:54:37.658619+0100  localhost kernel[0]: (intelhaxm) haxm_error: -------- HAXM release 6.2.1 --------
2018-03-22 11:54:37.658622+0100  localhost kernel[0]: (intelhaxm) haxm_error: This log collects running status of HAXM driver.
2018-03-22 11:54:37.752739+0100  localhost kernel[0]: (intelhaxm) haxm_error: Unknown ioctl 0x20007461, pid=250 ('haxm_configure')
feffi commented 6 years ago

and with debug:

log show --style syslog --predicate 'eventMessage CONTAINS "hax"' --last 50m --debug
Filtering the log data using "eventMessage CONTAINS "hax""
Skipping info messages, pass --info to include.
Timestamp                       (process)[PID]
2018-03-22 11:54:37.658619+0100  localhost kernel[0]: (intelhaxm) haxm_error: -------- HAXM release 6.2.1 --------
2018-03-22 11:54:37.658622+0100  localhost kernel[0]: (intelhaxm) haxm_error: This log collects running status of HAXM driver.
2018-03-22 11:54:37.752739+0100  localhost kernel[0]: (intelhaxm) haxm_error: Unknown ioctl 0x20007461, pid=250 ('haxm_configure')
2018-03-22 11:58:38.036193+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] entering /Library/Extensions/intelhaxm.kext
2018-03-22 11:58:38.036250+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] entering /Library/Extensions/intelhaxm.kext/Contents
2018-03-22 11:58:38.036278+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] leaving /Library/Extensions/intelhaxm.kext/Contents
2018-03-22 11:58:38.036301+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] leaving /Library/Extensions/intelhaxm.kext
2018-03-22 11:58:38.036435+0100  localhost kextd[52]: (Security) [com.apple.securityd:unixio] open(/Library/Extensions/intelhaxm.kext/Contents/MacOS/intelhaxm,0x0,0x1b6) = 4
2018-03-22 11:58:38.036475+0100  localhost kextd[52]: (Security) [com.apple.securityd:unixio] open(/Library/Extensions/intelhaxm.kext/Contents/MacOS/intelhaxm,0x0,0x1b6) = 5
2018-03-22 11:58:38.036595+0100  localhost kextd[52]: (Security) [com.apple.securityd:machorep] 9647 signing bytes in 3 blob(s) from /Library/Extensions/intelhaxm.kext/Contents/MacOS/intelhaxm(x86_64)
2018-03-22 11:58:38.036645+0100  localhost kextd[52]: (Security) [com.apple.securityd:cfloadfile] failed to fetch /Library/Extensions/intelhaxm.kext/Contents/_CodeSignature/CodeRequirements-1 error=-10
2018-03-22 11:58:38.043912+0100  localhost kextd[52]: (Security) [com.apple.securityd:unixio] open(/Library/Extensions/intelhaxm.kext/Contents/Info.plist,0x0,0x1b6) = 4
2018-03-22 11:58:38.044040+0100  localhost kextd[52]: (Security) [com.apple.securityd:cfloadfile] failed to fetch /Library/Extensions/intelhaxm.kext/Contents/_CodeSignature/CodeTopDirectory error=-10
2018-03-22 11:58:38.044190+0100  localhost kextd[52]: (Security) [com.apple.securityd:cfloadfile] failed to fetch /Library/Extensions/intelhaxm.kext/Contents/_CodeSignature/CodeEntitlements error=-10
2018-03-22 11:58:38.044602+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] entering /Library/StagedExtensions/Library/Extensions/intelhaxm.kext
2018-03-22 11:58:38.044724+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] entering /Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents
2018-03-22 11:58:38.044765+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] leaving /Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents
2018-03-22 11:58:38.044790+0100  localhost kextd[52]: (Security) [com.apple.securityd:dirval] leaving /Library/StagedExtensions/Library/Extensions/intelhaxm.kext
2018-03-22 11:58:38.044964+0100  localhost kextd[52]: (Security) [com.apple.securityd:unixio] open(/Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents/MacOS/intelhaxm,0x0,0x1b6) = 4
2018-03-22 11:58:38.045009+0100  localhost kextd[52]: (Security) [com.apple.securityd:unixio] open(/Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents/MacOS/intelhaxm,0x0,0x1b6) = 5
2018-03-22 11:58:38.045130+0100  localhost kextd[52]: (Security) [com.apple.securityd:machorep] 9647 signing bytes in 3 blob(s) from /Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents/MacOS/intelhaxm(x86_64)
2018-03-22 11:58:38.045182+0100  localhost kextd[52]: (Security) [com.apple.securityd:cfloadfile] failed to fetch /Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents/_CodeSignature/CodeRequirements-1 error=-10
2018-03-22 11:58:38.054115+0100  localhost kextd[52]: (Security) [com.apple.securityd:unixio] open(/Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents/Info.plist,0x0,0x1b6) = 4
2018-03-22 11:58:38.054243+0100  localhost kextd[52]: (Security) [com.apple.securityd:cfloadfile] failed to fetch /Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents/_CodeSignature/CodeTopDirectory error=-10
2018-03-22 11:58:38.054392+0100  localhost kextd[52]: (Security) [com.apple.securityd:cfloadfile] failed to fetch /Library/StagedExtensions/Library/Extensions/intelhaxm.kext/Contents/_CodeSignature/CodeEntitlements error=-10