profzei / Matebook-X-Pro-2018

💻 Latest macOS on Huawei Matebook X Pro 2018
Apache License 2.0

[PANIC] Random kernel panic causing restart #142

Closed raveltan closed 3 years ago

raveltan commented 3 years ago

System Information:

The system restarts randomly because of this error. The problem also exists on Catalina.

Stacktrace:

**panic(cpu 2 caller 0xffffff801e586d06): nvme: "Fatal error occurred. CSTS=0x1 US[1]=0x0 US[0]=0x16c VID=0x144d DID=0xa808
. FW Revision=EXA7301Q\n"@/AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/IONVMeFamily/IONVMeFamily-557.60.1/Common/IONVMeController.cpp:5472
Backtrace (CPU 2), Frame : Return Address
0xffffffa0ab663960 : 0xffffff801bebab4d 
0xffffffa0ab6639b0 : 0xffffff801bffd7e3 
0xffffffa0ab6639f0 : 0xffffff801bfede1a 
0xffffffa0ab663a40 : 0xffffff801be5fa2f 
0xffffffa0ab663a60 : 0xffffff801beba3ed 
0xffffffa0ab663b80 : 0xffffff801beba6d8 
0xffffffa0ab663bf0 : 0xffffff801c6bef9a 
0xffffffa0ab663c60 : 0xffffff801e586d06 
0xffffffa0ab663c80 : 0xffffff801e56b427 
0xffffffa0ab663de0 : 0xffffff801c61d385 
0xffffffa0ab663e50 : 0xffffff801c61d286 
0xffffffa0ab663e80 : 0xffffff801beff725 
0xffffffa0ab663ef0 : 0xffffff801bf00634 
0xffffffa0ab663fa0 : 0xffffff801be5f13e 
      Kernel Extensions in backtrace:
         com.apple.iokit.IONVMeFamily(2.1)[D5DFC80E-EF7A-3660-BE57-473E67626B44]@0xffffff801e564000->0xffffff801e58dfff
            dependency: com.apple.driver.AppleEFINVRAM(2.1)[D6C13E44-3657-3F40-99E4-355DAA82202E]@0xffffff801d25a000->0xffffff801d263fff
            dependency: com.apple.driver.AppleMobileFileIntegrity(1.0.5)[2A454117-CDAA-301F-B609-BA396742C91A]@0xffffff801d401000->0xffffff801d415fff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[BF2C5E86-1E8F-3FD4-9874-7738178FA73B]@0xffffff801e81f000->0xffffff801e846fff
            dependency: com.apple.iokit.IOReportFamily(47)[D3C4FAA4-8F06-3C5C-AB36-4BE632CCE051]@0xffffff801e855000->0xffffff801e857fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[B5300908-BF34-3D47-8776-FB154A6DEE4C]@0xffffff801e93f000->0xffffff801e950fff

Process name corresponding to current thread: kernel_task
Boot args: -igfxnorpsc=1

Mac OS version:
20D64

Kernel version:
Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64
Kernel UUID: C86236B2-4976-3542-80CA-74A6B8B4BA03
KernelCache slide: 0x000000001bc00000
KernelCache base:  0xffffff801be00000
Kernel slide:      0x000000001bc10000
Kernel text base:  0xffffff801be10000
__HIB  text base: 0xffffff801bd00000
System model name: MacBookPro14,1 (Mac-B4831CEBD52A0C4C)
System shutdown begun: NO
Panic diags file available: NO (0xe00002bc)
Hibernation exit count: 0

System uptime in nanoseconds: 110468782406
Last Sleep:           absolute           base_tsc          base_nano
  Uptime  : 0x00000019b873dc3b
  Sleep   : 0x0000000000000000 0x0000000000000000 0x0000000000000000
  Wake    : 0x0000000000000000 0x0000000e65b74422 0x0000000000000000

**
profzei commented 3 years ago

@raveltan Is macOS installed on an internal or an external NVMe SSD? Who is your SSD vendor? Have you tried not loading NVMeFix.kext? Have you tried a clean re-install?

raveltan commented 3 years ago

@profzei

My macOS is located on an external SSD (Samsung T5), and my internal drive is a PM981. (I know that this SSD may cause problems when macOS accesses its files, but in this case the filesystem is ext4 and I never access the files on it. For some reason I'm not able to unmount it.)

I've also tried removing NVMeFix.kext from the EFI partition, but that ends up causing a boot loop.

As for re-installing, I've tried a clean reinstall and the result is the same.

Is there any possibility that the ssd is faulty?

profzei commented 3 years ago

Is there any possibility that the ssd is faulty?

Maybe, but have you checked its status, for example with gparted or similar tools from a live-USB Linux distro?
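For a check that goes beyond a GUI "healthy" badge, a minimal sketch along these lines could be run from the live Linux session. It assumes smartmontools 7.0 or later is installed (for the `--json` flag) and that the NVMe health log appears under the `nvme_smart_health_information_log` key with the field names used below; the device path `/dev/nvme0` and both helper names are hypothetical and only for illustration.

```python
import json
import subprocess


def read_smart_json(device: str) -> dict:
    """Run `smartctl --json -a <device>` and return the parsed report."""
    # smartctl's exit status is a bit mask, so a non-zero code does not
    # necessarily mean the JSON is missing; check=True is deliberately not used.
    proc = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


def summarize_nvme_health(report: dict) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    # Assumed key and field names, based on smartctl's JSON output for NVMe drives.
    log = report.get("nvme_smart_health_information_log", {})
    warnings = []
    if log.get("critical_warning", 0) != 0:
        warnings.append(f"critical_warning={log['critical_warning']:#x}")
    if log.get("media_errors", 0) > 0:
        warnings.append(f"media_errors={log['media_errors']}")
    spare = log.get("available_spare")
    threshold = log.get("available_spare_threshold")
    if spare is not None and threshold is not None and spare < threshold:
        warnings.append(f"available_spare={spare}% is below threshold {threshold}%")
    return warnings


# Example use (needs root and a real device, so it is left commented out):
# print(summarize_nvme_health(read_smart_json("/dev/nvme0")))
```

An empty list only means the drive does not report a problem about itself; it does not rule out a driver or power-management issue on the macOS side.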

raveltan commented 3 years ago

After checking with GNOME Disks, it reports that the disk is completely healthy.

Is there anything else that I can try to fix the problem?

profzei commented 3 years ago

Are you using the Thunderbolt port?

raveltan commented 3 years ago

Do you mean plugging in a Thunderbolt-compatible device?

If so, then I'm not using it. I'm using the bottom USB-C port for the external SSD.

profzei commented 3 years ago

No, what I mean is whether you are using the Thunderbolt port, which has a USB-C connector...

It seems you are... so you need to enable the Thunderbolt-related settings: see the related info on the homepage, otherwise the transfer protocol will likely be capped...

raveltan commented 3 years ago

Ah, I see. I'll try it out, thanks a lot for the help.

jonescamilla commented 3 years ago

I am also using a Thunderbolt device and getting the same panic. Following your guide for enabling the Thunderbolt controller, I was able to complete the first two tasks:

- disable SSDT-DTB3.aml
- enable all SSDT-TB-DSB*.aml

But I am uncertain on where I could find the binary patches referenced below:

- enable TB3: _GPE.NTFY,1,S to XTFY binary patch
- enable TB3: RP9._INI,0,N to XINI binary patch

I performed a quick keyword search on both references that you provided and briefly traversed the document and found little direction to a binary patch.

I'd appreciate the help, thank you!

profzei commented 3 years ago

@jonescamilla It's very simple! In the config.plist:

<dict>
                <key>Comment</key>
                <string>TB3: _GPE.NTFY,1,S to XTFY</string>
                <key>Count</key>
                <integer>0</integer>
                <key>Enabled</key>
                <false/>
                <key>Find</key>
                <data>TlRGWQk=</data>
                <key>Limit</key>
                <integer>0</integer>
                <key>Mask</key>
                <data></data>
                <key>OemTableId</key>
                <data></data>
                <key>Replace</key>
                <data>WFRGWQk=</data>
                <key>ReplaceMask</key>
                <data></data>
                <key>Skip</key>
                <integer>0</integer>
                <key>TableLength</key>
                <integer>0</integer>
                <key>TableSignature</key>
                <data></data>
            </dict>
            <dict>
                <key>Comment</key>
                <string>TB3: RP9._INI,0,N to XINI,0,N for disabling ICM</string>
                <key>Count</key>
                <integer>0</integer>
                <key>Enabled</key>
                <false/>
                <key>Find</key>
                <data>X0lOSQBwTFRSOQ==</data>
                <key>Limit</key>
                <integer>0</integer>
                <key>Mask</key>
                <data></data>
                <key>OemTableId</key>
                <data></data>
                <key>Replace</key>
                <data>WElOSQBwTFRSOQ==</data>
                <key>ReplaceMask</key>
                <data></data>
                <key>Skip</key>
                <integer>0</integer>
                <key>TableLength</key>
                <integer>0</integer>
                <key>TableSignature</key>
                <data></data>
            </dict>

You only need to change the value of the Enabled key from false to true.
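If you would rather not edit the XML by hand, the same change can be sketched with Python's standard `plistlib`. This is only an illustration, assuming the two entries live under `ACPI -> Patch` in config.plist (the usual OpenCore layout); the helper names `enable_acpi_patches` and `describe_patch`, the trimmed `SAMPLE` document and the `EFI/OC/config.plist` path are hypothetical.

```python
import plistlib

# Trimmed-down stand-in for config.plist containing only the first patch above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0"><dict><key>ACPI</key><dict><key>Patch</key><array>
<dict>
<key>Comment</key><string>TB3: _GPE.NTFY,1,S to XTFY</string>
<key>Enabled</key><false/>
<key>Find</key><data>TlRGWQk=</data>
<key>Replace</key><data>WFRGWQk=</data>
</dict>
</array></dict></dict></plist>"""


def enable_acpi_patches(config: dict, comment_prefix: str) -> list[str]:
    """Set Enabled=True on every disabled ACPI -> Patch entry whose Comment
    starts with comment_prefix; return the comments that were switched on."""
    changed = []
    for patch in config.get("ACPI", {}).get("Patch", []):
        is_match = patch.get("Comment", "").startswith(comment_prefix)
        if is_match and not patch.get("Enabled", False):
            patch["Enabled"] = True
            changed.append(patch["Comment"])
    return changed


def describe_patch(patch: dict) -> str:
    """Show the raw bytes behind the base64 <data> values (plistlib decodes
    <data> elements to bytes when loading)."""
    find, replace = patch["Find"], patch["Replace"]
    return f"{patch['Comment']}: {find.hex(' ')} -> {replace.hex(' ')}"


config = plistlib.loads(SAMPLE)
print(describe_patch(config["ACPI"]["Patch"][0]))
print(enable_acpi_patches(config, "TB3:"))

# Against a real file (back it up first), something like:
# with open("EFI/OC/config.plist", "rb") as f:
#     config = plistlib.load(f)
# enable_acpi_patches(config, "TB3:")
# with open("EFI/OC/config.plist", "wb") as f:
#     plistlib.dump(config, f)
```

Decoding the `Find` and `Replace` values shows that the first patch only swaps the leading `N` of `NTFY` for an `X`, which is all the "rename" amounts to.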

jonescamilla commented 3 years ago

Damn! I didn't even look to find the name in the config.plist! I was looking for it in the OC dir and subdirs.

Thank you!!

raveltan commented 3 years ago

Hello @jonescamilla, after applying the changes, did the panic disappear?

@profzei Actually, I haven't tried the Thunderbolt changes yet, but if I use the upper USB-C port for the macOS disk, it seems to crash as well. Does this have anything to do with undervolting?

jonescamilla commented 3 years ago

Hello @jonescamilla, after applying the changes, did the panic disappear?

This repository is absolutely amazing for what it's able to accomplish and in all honesty I don't think I have the capability to properly give it credit when I report an issue or comment on it.

The panics did disappear for about a week but have resumed. THOUGH I believe these are due to sleep issues, because they occur on every boot from a powered-off state, BUT I can't say much more about it because I just ignore the crash dialogs when I'm trying to get work done.

The crashes while actively using the computer have greatly lessened but are still present. They seem to occur when I accidentally write an infinite loop in some of my code or when a web browser is using too many resources.

I'll comment again when I get the next crash/panic.

(As a comment to your comment about undervolting, I have received panics for similar jobs/tasks w/ and w/o undervolting.)

raveltan commented 3 years ago

@jonescamilla Ah, I see. Thanks for the info.

jonescamilla commented 3 years ago

I saved the third panic since my last comment:

panic(cpu 0 caller 0xffffff800f186d06): nvme: "Fatal error occurred. CSTS=0xffffffff US[1]=0x0 US[0]=0x141 VID=0x144d DID=0xa808
. FW Revision=2B2QEXM7\n"@/AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/IONVMeFamily/IONVMeFamily-557.60.1/Common/IONVMeController.cpp:5472
Backtrace (CPU 0), Frame : Return Address
0xffffffa09bd2b960 : 0xffffff800cabab4d 
0xffffffa09bd2b9b0 : 0xffffff800cbfd7e3 
0xffffffa09bd2b9f0 : 0xffffff800cbede1a 
0xffffffa09bd2ba40 : 0xffffff800ca5fa2f 
0xffffffa09bd2ba60 : 0xffffff800caba3ed 
0xffffffa09bd2bb80 : 0xffffff800caba6d8 
0xffffffa09bd2bbf0 : 0xffffff800d2bef9a 
0xffffffa09bd2bc60 : 0xffffff800f186d06 
0xffffffa09bd2bc80 : 0xffffff800f16b427 
0xffffffa09bd2bde0 : 0xffffff800d21d385 
0xffffffa09bd2be50 : 0xffffff800d21d286 
0xffffffa09bd2be80 : 0xffffff800caff725 
0xffffffa09bd2bef0 : 0xffffff800cb00634 
0xffffffa09bd2bfa0 : 0xffffff800ca5f13e 
      Kernel Extensions in backtrace:
         com.apple.iokit.IONVMeFamily(2.1)[D5DFC80E-EF7A-3660-BE57-473E67626B44]@0xffffff800f164000->0xffffff800f18dfff
            dependency: com.apple.driver.AppleEFINVRAM(2.1)[D6C13E44-3657-3F40-99E4-355DAA82202E]@0xffffff800de5a000->0xffffff800de63fff
            dependency: com.apple.driver.AppleMobileFileIntegrity(1.0.5)[2A454117-CDAA-301F-B609-BA396742C91A]@0xffffff800e001000->0xffffff800e015fff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[BF2C5E86-1E8F-3FD4-9874-7738178FA73B]@0xffffff800f41f000->0xffffff800f446fff
            dependency: com.apple.iokit.IOReportFamily(47)[D3C4FAA4-8F06-3C5C-AB36-4BE632CCE051]@0xffffff800f455000->0xffffff800f457fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[B5300908-BF34-3D47-8776-FB154A6DEE4C]@0xffffff800f53f000->0xffffff800f550fff

Process name corresponding to current thread: kernel_task
Boot args: -igfxnorpsc=1

Mac OS version:
20D74

Kernel version:
Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64
Kernel UUID: C86236B2-4976-3542-80CA-74A6B8B4BA03
KernelCache slide: 0x000000000c800000
KernelCache base:  0xffffff800ca00000
Kernel slide:      0x000000000c810000
Kernel text base:  0xffffff800ca10000
__HIB  text base: 0xffffff800c900000
System model name: MacBookPro14,1 (Mac-B4831CEBD52A0C4C)
System shutdown begun: NO
Panic diags file available: YES (0x0)
Hibernation exit count: 0

System uptime in nanoseconds: 390493437112
Last Sleep:           absolute           base_tsc          base_nano
  Uptime  : 0x0000005aeb3908c5
  Sleep   : 0x0000000000000000 0x0000000000000000 0x0000000000000000
  Wake    : 0x0000000000000000 0x00000007122f1fe7 0x0000000000000000
profzei commented 3 years ago

@jonescamilla Thank you! In both cases the problem seems to be attributable to the native IONVMeFamily.kext and how it handles an external disk (both of you have installed macOS on an external disk, if I remember correctly...). What could be wrong with this setup?
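To compare the two reports, here is a rough sketch that pulls the key fields out of a panic log. The excerpts are copied from the logs above; the function name and the returned keys are made up for illustration. It assumes kext addresses are slid by the "KernelCache slide" value, and it treats an all-ones CSTS as a hint that the controller stopped answering register reads, which is an interpretation rather than something the log states. VID 0x144d is Samsung's PCI vendor ID.

```python
import re

PANIC_A = """panic(cpu 2 caller 0xffffff801e586d06): nvme: "Fatal error occurred. CSTS=0x1 US[1]=0x0 US[0]=0x16c VID=0x144d DID=0xa808
com.apple.iokit.IONVMeFamily(2.1)[D5DFC80E-EF7A-3660-BE57-473E67626B44]@0xffffff801e564000->0xffffff801e58dfff
KernelCache slide: 0x000000001bc00000
"""

PANIC_B = """panic(cpu 0 caller 0xffffff800f186d06): nvme: "Fatal error occurred. CSTS=0xffffffff US[1]=0x0 US[0]=0x141 VID=0x144d DID=0xa808
com.apple.iokit.IONVMeFamily(2.1)[D5DFC80E-EF7A-3660-BE57-473E67626B44]@0xffffff800f164000->0xffffff800f18dfff
KernelCache slide: 0x000000000c800000
"""

HEADER_RE = re.compile(
    r"panic\(cpu \d+ caller (?P<caller>0x[0-9a-f]+)\).*?"
    r"CSTS=(?P<csts>0x[0-9a-f]+).*?"
    r"VID=(?P<vid>0x[0-9a-f]+) DID=(?P<did>0x[0-9a-f]+)",
    re.DOTALL,
)
KEXT_RE = re.compile(
    r"com\.apple\.iokit\.IONVMeFamily\(.*?\)\[.*?\]"
    r"@(?P<start>0x[0-9a-f]+)->(?P<end>0x[0-9a-f]+)"
)
SLIDE_RE = re.compile(r"KernelCache slide:\s+(?P<slide>0x[0-9a-f]+)")


def summarize_panic(log: str) -> dict:
    """Extract the NVMe-related fields from a panic log excerpt."""
    header = HEADER_RE.search(log)
    kext = KEXT_RE.search(log)
    slide = SLIDE_RE.search(log)
    caller = int(header["caller"], 16)
    csts = int(header["csts"], 16)
    return {
        "vid": int(header["vid"], 16),
        "did": int(header["did"], 16),
        "csts": csts,
        # NVMe CSTS register: bit 0 is RDY, bit 1 is CFS (Controller Fatal Status).
        "csts_ready": bool(csts & 0x1),
        "csts_fatal": bool(csts & 0x2),
        # Reading 0xffffffff usually means the register read itself failed.
        "csts_all_ones": csts == 0xFFFFFFFF,
        # Is the panicking caller inside the IONVMeFamily address range?
        "caller_in_nvme_kext": int(kext["start"], 16) <= caller <= int(kext["end"], 16),
        # Remove the KASLR slide so addresses from different boots can be compared.
        "unslid_caller": caller - int(slide["slide"], 16),
    }


for name, log in (("raveltan", PANIC_A), ("jonescamilla", PANIC_B)):
    info = summarize_panic(log)
    print(name, hex(info["vid"]), hex(info["did"]), hex(info["unslid_caller"]))
```

With the slide removed, both callers land on the same address inside IONVMeFamily, which is consistent with the two reports hitting the same code path even though the raw addresses differ.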

So, @jonescamilla, what is your internal SSD? Is it the same as @raveltan's, i.e. a Samsung PM981?

@raveltan Samsung PM981 and 970 EVO Plus SSDs seem to be troublesome for macOS, maybe even when macOS is not installed on them. I'm not sure about that last point, but we could focus our attention on it since the other options have been exhausted...

I tried to dig some more into the Samsung PM981 and related topics, and you (@raveltan) could try the following:

<dict>
        <key>Arch</key>
        <string>x86_64</string>
        <key>BundlePath</key>
        <string>HackrNVMeFamily.kext</string>
        <key>Comment</key>
        <string>Hacking for NVMe</string>
        <key>Enabled</key>
        <true/>
        <key>ExecutablePath</key>
        <string>Contents/MacOS/HackrNVMeFamily</string>
        <key>MaxKernel</key>
        <string></string>
        <key>MinKernel</key>
        <string></string>
        <key>PlistPath</key>
        <string>Contents/Info.plist</string>
</dict>

SSDT-NVME.aml.zip HackrNVMeFamily.kext.zip

jonescamilla commented 3 years ago

I run macOS as my only OS on an internal Samsung SSD 970 EVO Plus drive. I also had no idea that the 970 Plus has been troublesome for macOS. I purchased the 970 EVO because many people recommended it in forums for Hackintoshes.

raveltan commented 3 years ago

Ah, I see. Thanks a lot for the help, guys! I'll update you as soon as I try the changes. I've used the disk for something else, so it may take some time for me to reinstall.

profzei commented 3 years ago

@jonescamilla One of the sources for my remark is Dortania's guide: maybe you have the latest firmware for your SSD, but as a general rule Samsung SSDs should be avoided just to be safe! Please check your firmware status and anything else related to real-world Samsung compatibility... Maybe we need to set some parameters in NVMeFix.kext, such as latency, for your model, but I don't remember well right now...

jonescamilla commented 3 years ago

I have updated the ssd firmware. I was unable to install macOS until I updated the firmware.

RoderickQiu commented 3 years ago

Hello, I should say that this problem persists even with a Toshiba internal SSD. I'm using a first-generation MBXP and have migrated to EFI v2.1.0 (except that I'm using itlwm in place of AirportItlwm). I get the same panic related to IONVMeFamily.kext, and it persists even after I followed this issue and added HackrNVMeFamily.kext.

panic(cpu 2 caller 0xffffff8006f432e1): nvme: "Fatal error occurred. ID=0xffffffff ARG1=0xffffffff ARG2=0xffffffff ARG3=0xffffffff EDD0=0xffffffff EDD1=0xffffffff EDD2=0xffffffff EDD3=0xffffffff EDD4=0xffffffff EDD5=0xffffffff EDD6=0xffffffff EDD7=0xffffffff. NAND Vendor=0x0, DRAM Vendor=0x0, SSD Capacity=0GB, FW Revision=AAXA4103\n"@/BuildRoot/Library/Caches/com.apple.xbs/Sources/IONVMeFamily/IONVMeFamily-234.60.2/IONVMeController.cpp:4949
Backtrace (CPU 2), Frame : Return Address
0xffffffa06a44b990 : 0xffffff80020bc66d
0xffffffa06a44b9e0 : 0xffffff80021ff073
0xffffffa06a44ba20 : 0xffffff80021ef6aa
0xffffffa06a44ba70 : 0xffffff8002061a2f
0xffffffa06a44ba90 : 0xffffff80020bbf0d
0xffffffa06a44bbb0 : 0xffffff80020bc1f8
0xffffffa06a44bc20 : 0xffffff80028bee1a
0xffffffa06a44bc90 : 0xffffff8006f432e1
0xffffffa06a44bde0 : 0xffffff800281d5c5
0xffffffa06a44be50 : 0xffffff800281d4c6
0xffffffa06a44be80 : 0xffffff8002101345
0xffffffa06a44bef0 : 0xffffff8002102254
0xffffffa06a44bfa0 : 0xffffff800206113e
      Kernel Extensions in backtrace:
         com.apple.hack.HackrNVMeFamily(92.1)[12D4270E-AFFC-34DC-9714-44D1FE33333F]@0xffffff8006f33000->0xffffff8006f72fff
            dependency: com.apple.driver.AppleEFINVRAM(2.1)[78808055-9D80-3318-8BEE-4C545178A586]@0xffffff8003455000->0xffffff800345efff
            dependency: com.apple.driver.AppleMobileFileIntegrity(1.0.5)[37AC6FB3-4CB3-3E1A-981C-48A212712E57]@0xffffff80035fc000->0xffffff8003610fff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[A18ACD60-A811-3624-B50D-4F929836EE79]@0xffffff8004a17000->0xffffff8004a3efff
            dependency: com.apple.iokit.IOReportFamily(47)[0EC55CCD-966C-33F4-9B8A-1E9CB2778AE7]@0xffffff8004a4d000->0xffffff8004a4ffff
            depen

profzei commented 3 years ago

@RoderickQiu @raveltan I was digging further into NVMeFix and the PM981 when I read the latest update reporting this issue even with a Toshiba NVMe drive... Now I'm out of ideas...

jonescamilla commented 3 years ago

I'm currently unable to use my computer for longer than 10-15 minutes without it crashing. I'm only running iTerm and Safari while connected to my monitor. The Matebook will still crash if left at the login screen. Resetting NVRAM had no effect. I will be "resetting" the EFI next ("resetting" the EFI had actually temporarily solved the crashing in the past). The crashing does not occur when the monitor (Thunderbolt) is not connected.

Last Error message:

panic(cpu 0 caller 0xffffff8005386d06): nvme: "Fatal error occurred. CSTS=0xffffffff US[1]=0x0 US[0]=0x147 VID=0x144d DID=0xa808
. FW Revision=2B2QEXM7\n"@/AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/IONVMeFamily/IONVMeFamily-557.60.1/Common/IONVMeController.cpp:5472
Backtrace (CPU 0), Frame : Return Address
0xffffffa091efb960 : 0xffffff8002cbab4d 
0xffffffa091efb9b0 : 0xffffff8002dfd7e3 
0xffffffa091efb9f0 : 0xffffff8002dede1a 
0xffffffa091efba40 : 0xffffff8002c5fa2f 
0xffffffa091efba60 : 0xffffff8002cba3ed 
0xffffffa091efbb80 : 0xffffff8002cba6d8 
0xffffffa091efbbf0 : 0xffffff80034bef9a 
0xffffffa091efbc60 : 0xffffff8005386d06 
0xffffffa091efbc80 : 0xffffff800536b427 
0xffffffa091efbde0 : 0xffffff800341d385 
0xffffffa091efbe50 : 0xffffff800341d286 
0xffffffa091efbe80 : 0xffffff8002cff725 
0xffffffa091efbef0 : 0xffffff8002d00634 
0xffffffa091efbfa0 : 0xffffff8002c5f13e 
      Kernel Extensions in backtrace:
         com.apple.iokit.IONVMeFamily(2.1)[D5DFC80E-EF7A-3660-BE57-473E67626B44]@0xffffff8005364000->0xffffff800538dfff
            dependency: com.apple.driver.AppleEFINVRAM(2.1)[D6C13E44-3657-3F40-99E4-355DAA82202E]@0xffffff800405a000->0xffffff8004063fff
            dependency: com.apple.driver.AppleMobileFileIntegrity(1.0.5)[2A454117-CDAA-301F-B609-BA396742C91A]@0xffffff8004201000->0xffffff8004215fff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[BF2C5E86-1E8F-3FD4-9874-7738178FA73B]@0xffffff800561f000->0xffffff8005646fff
            dependency: com.apple.iokit.IOReportFamily(47)[D3C4FAA4-8F06-3C5C-AB36-4BE632CCE051]@0xffffff8005655000->0xffffff8005657fff
            dependency: com.apple.iokit.IOStorageFamily(2.1)[B5300908-BF34-3D47-8776-FB154A6DEE4C]@0xffffff800573f000->0xffffff8005750fff

Process name corresponding to current thread: kernel_task
Boot args: -igfxnorpsc=1

Mac OS version:
20D74

Kernel version:
Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64
Kernel UUID: C86236B2-4976-3542-80CA-74A6B8B4BA03
KernelCache slide: 0x0000000002a00000
KernelCache base:  0xffffff8002c00000
Kernel slide:      0x0000000002a10000
Kernel text base:  0xffffff8002c10000
__HIB  text base: 0xffffff8002b00000
System model name: MacBookPro14,1 (Mac-B4831CEBD52A0C4C)
System shutdown begun: NO
Panic diags file available: YES (0x0)
Hibernation exit count: 0

System uptime in nanoseconds: 215409037388
Last Sleep:           absolute           base_tsc          base_nano
  Uptime  : 0x0000003227611d05
  Sleep   : 0x0000000000000000 0x0000000000000000 0x0000000000000000
  Wake    : 0x0000000000000000 0x0000000bac7ab2f6 0x0000000000000000

1 day after:

No longer crashing every boot ¯\_(ツ)_/¯ ... (still running 1.9.1)

profzei commented 3 years ago

@jonescamilla As I said, I'm really out of ideas and options. I can't figure out the correlation between a kext (IONVMeFamily) that should manage NVMe I/O and an external monitor attached to a Thunderbolt port (which apparently works fine for 10-15 minutes since, as you reported, the issue appears only after such a time interval)... Sigh!

samwzlim commented 3 years ago

@profzei

My macOS is located on an external SSD (Samsung T5)

and my internal drive is a PM981. (I know that this SSD may cause problems when macOS accesses its files, but in this case the filesystem is ext4 and I never access the files on it. For some reason I'm not able to unmount it.)

I've also tried removing NVMeFix.kext from the EFI partition, but that ends up causing a boot loop.

As for re-installing, I've tried a clean reinstall and the result is the same.

Is there any possibility that the ssd is faulty?

Hey, may I know what you did to solve the boot loop after removing NVMeFix.kext? I was having kernel panics as well, but after trying to solve them by removing NVMeFix.kext, I am now stuck in a boot loop. The Apple logo appears, but the loading bar never shows up and the machine never finishes booting.