OpenCore: Kernel panic in `_performEfiCallAsm` on Intel DX79SR #791

Closed by otherwhaler 4 years ago

otherwhaler commented 4 years ago

I have an Intel DX79SR-based system that I'm attempting to convert to OpenCore. The system does boot successfully with Clover, VirtualSMC, and Lilu. However, the DX79SR is old and has a very rudimentary UEFI implementation. It does have NVRAM, but it does not seem to be properly available to macOS. I need to use EmuVariableUefi.efi to boot successfully with Clover.

With OpenCore + VirtualSMC + Lilu, I always get a kernel panic in _performEfiCallAsm shortly after boot, regardless of whether I have LegacyEnable in OpenCore's config.plist turned on or not. I had also tried using EmuVariableUefi.efi with OpenCore, but I believe it had the same result.

I also get an "NVRAM is full, cannot log!" message as OpenCore is starting, which may or may not be related.

It seems like OpenCore (or one of the kexts) may be trying to read from or write to this board's unreliable NVRAM even though LegacyEnable is set, but I could be wrong. I'm posting here in case it is an OpenCore bug. If not, I'll dig around further to try to figure out what Clover and EmuVariableUefi are doing differently. Thanks!

Attached are my config.plist and the stack trace from the panic.

OC Kernel Panic _performEfiCallAsm config-redacted.txt

vit9696 commented 4 years ago

The kernel panic happens because the GetVariable service is being called, and OpenCore only blocks writing NVRAM variables (when DisableVariableWrite is on), not reading them. Also, since this is Haswell, I do not think you can use DevirtualiseMmio without whitelisting regions. Try just disabling it for the time being.

LuanSala commented 4 years ago

Hi, I came here to open an issue and found that this one was already open, so I'm describing my case here rather than opening another.

I'm also having this kernel panic on my laptop.

I'm also getting the message NVRAM is full, cannot log! as mentioned before.

My configuration:

I think my laptop has memory protection.

Booter quirks list:

<key>AvoidRuntimeDefrag</key>
<true/>
<key>DevirtualiseMmio</key>
<true/>
<key>DisableSingleUser</key>
<false/>
<key>DisableVariableWrite</key>
<false/>
<key>DiscardHibernateMap</key>
<false/>
<key>EnableSafeModeSlide</key>
<false/>
<key>EnableWriteUnprotector</key>
<true/>
<key>ForceExitBootServices</key>
<false/>
<key>ProtectCsmRegion</key>
<false/>
<key>ProtectSecureBoot</key>
<false/>
<key>ProtectUefiServices</key>
<true/>
<key>ProvideCustomSlide</key>
<false/>
<key>SetupVirtualMap</key>
<false/>
<key>ShrinkMemoryMap</key>
<false/>
<key>SignalAppleOS</key>
<false/>

If I enable the SetupVirtualMap quirk, boot.efi hangs at the following message: OCSMC: SmcReadValue Key ...

vit9696 commented 4 years ago

Could you please recheck with OpenRuntime from master? Also, always attach EFI and debug log.

LuanSala commented 4 years ago

Yeah, of course. But I don't know how to generate that file. Could you please point me in the right direction? (I don't have a Mac with me, just Catalina 10.15.3 on VMware.)

EFI and debug log attached below. EFI-057Debug.zip opencore-2020-03-29-152835.txt

LuanSala commented 4 years ago

OK, assuming I did things right (I followed section 3.3 of OpenCore's Configuration.pdf), the same kernel panic unfortunately occurred with OpenRuntime from master.

OpenRuntime.efi.zip

otherwhaler commented 4 years ago

  • Why is the board NVRAM considered unreliable? Does NVRAM work in Windows and Linux?

Windows 10 and the Arch live image do boot successfully in UEFI mode. I wasn't able to find a good way to test NVRAM in Windows. In Arch my testing was somewhat inconclusive. I was able to see many NVRAM variables using efivar. Interestingly, this includes the OpenCore version variable, so that implies it's able to at least write something to NVRAM. However, when I tried to write a variable using efivar I got an Input/Output Error. I could be doing something wrong because the documentation for efivar is not very clear, so I'll give it another shot.

  • Did you try to get rid with EmuVariableRuntime before?

Yes. I'll try it again so that I can say what the exact behavior is.

  • You have DisableVariableWrite, does not this kind of explain the NVRAM is "full" error?

Removed that setting; I still get the NVRAM is full, cannot log! message.

Also, since this is Haswell, I do not think you can use DevirtualiseMmio without whitelisting regions. Try just disabling it for the time being.

Disabled that for the time being; same result. (Also, this is actually a Sandy Bridge-E i7-3930K, so even older and probably less well-supported.)

vit9696 commented 4 years ago

You both have bit 0x10 set in Misc -> Target, which enables UEFI variable logging. Remove this (e.g. use 0x3), clean NVRAM afterwards, and retry. It is very likely that your NVRAM implementation does not do garbage collection. In this case your only choice might be an SPI flasher.
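
As a small illustration of the Target change, here is a sketch of the bit arithmetic in Python. Only the meaning of bit 0x10 (UEFI variable logging) and the suggested value 0x3 come from this thread; the starting value 0x13 is a hypothetical example and is not taken from either attached config.

```python
# Bit 0x10 of Target enables UEFI variable logging (as stated in the comment above).
UEFI_VARIABLE_LOGGING = 0x10


def logs_to_nvram(target: int) -> bool:
    """Return True if the Target value has the UEFI variable logging bit set."""
    return bool(target & UEFI_VARIABLE_LOGGING)


def without_nvram_logging(target: int) -> int:
    """Clear only the UEFI variable logging bit, leaving other bits untouched."""
    return target & ~UEFI_VARIABLE_LOGGING


example_target = 0x13  # hypothetical starting value, for illustration only
print(logs_to_nvram(example_target))               # bit 0x10 is set
print(hex(without_nvram_logging(example_target)))  # the value with that bit cleared
```

Clearing the bit this way, rather than typing a fresh number, keeps whatever other logging targets were already selected.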

LuanSala commented 4 years ago

What a bad situation. :cry:

So, I altered the Target flags but nothing changed.

About the SPI flasher, I didn't understand very well. Do I just need to flash the same BIOS version that is already installed? Why would this process help fix the problem?

otherwhaler commented 4 years ago

  • Did you try to get rid with EmuVariableRuntime before?

Yes. I'll try it again so that I can say what the exact behavior is.

Quick results of this test:

Should EmuVariableUefi be able to affect OpenCore at all, or does OpenCore interact with NVRAM in another way that makes it irrelevant?

vit9696 commented 4 years ago

So, firstly regarding EmuVariableUefi. EmuVariableUefi bundled with Clover is not a standalone driver you can just take and use with OpenCore or anything else. It installs a protocol, which Clover bootloader calls to initialise EmuVariableUefi when macOS is booted. Therefore it is expected and correct that EmuVariableUefi just does not work with OpenCore.

Secondly, regarding NVRAM is full, cannot log!. This issue should disappear as soon as Target is changed to 3 and NVRAM is reset via OpenCore. Can you confirm this?

Thirdly, it will help to see an up-to-date problem report:

I would rather look at fresh results before advising anything.

Fourthly, regarding the SPI flasher. Basically, if the NVRAM driver in your firmware does not reclaim empty space (when you delete an NVRAM variable, the driver needs to mark the area as unused and defragment the storage), then your only option is to do that manually by taking a dump of your ROM, hex-editing it, and flashing it back. This issue happened mainly with old firmware, MacPro5,1 in particular, but also on a large number of laptops.
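
To illustrate why a driver that never reclaims space ends up reporting a full store, here is a toy Python model. It is only a sketch: the class, the 1000-byte capacity, the 100-byte entry size, and the variable name "boot-log" are all hypothetical and do not describe any real firmware's variable store layout.

```python
class ToyVariableStore:
    """Toy append-only store; stale copies keep occupying space unless reclaimed."""

    def __init__(self, capacity: int, reclaim: bool = False):
        self.capacity = capacity
        self.reclaim = reclaim
        self.used = 0    # bytes consumed, including stale (dead) copies
        self.live = {}   # variable name -> size of its current copy

    def write(self, name: str, size: int) -> bool:
        old = self.live.get(name, 0)
        freed = old if self.reclaim else 0   # only a reclaiming driver gets space back
        if self.used - freed + size > self.capacity:
            return False                     # the store reports "full"
        self.used += size - freed
        self.live[name] = size
        return True


def writes_until_full(store: ToyVariableStore, limit: int = 1000) -> int:
    """Rewrite one 100-byte variable until the store refuses, or until limit."""
    count = 0
    while count < limit and store.write("boot-log", 100):
        count += 1
    return count


print(writes_until_full(ToyVariableStore(1000)))                # prints 10
print(writes_until_full(ToyVariableStore(1000, reclaim=True)))  # prints 1000
```

In the non-reclaiming case the store is "full" while holding only 100 bytes of live data, which is the situation where rewriting the flash contents by hand becomes the remaining option.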

LuanSala commented 4 years ago

OpenCore 0.5.7 _ 2020-03-30 EFI.zip opencore-2020-03-31-150521.txt


An interesting point (or not... :grimacing: )

My laptop came with Windows 10 installed on it, but I use Linux. So I tried to install the Linux distribution I like (Fedora), but I couldn't boot the installer. I tried booting Fedora 31 and 30. After this I tried Ubuntu 19.10, and I was able to install it.

And now I tried to boot Fedora 32 beta, and I could open the installer normally. Maybe there were changes in the kernel code (kernel version booted: 5.6.0-0.rc5.git0.2.fc32.x86_64) that made it possible to open the Fedora installer. Just speculating.

vit9696 commented 4 years ago

It looks really strange. I see slide=0 in your config, but the screenshot has no such boot argument. Other than that I am starting to suspect https://github.com/acidanthera/bugtracker/issues/491 to be relevant. If you replace OpenRuntime.efi with FwRuntimeServices from 0.5.6, will it change anything?

LuanSala commented 4 years ago

Well, in my case the same kernel panic occurred after changing OpenRuntime.efi to FwRuntimeServices.efi. opencore-2020-03-31-194923.txt

About the slide argument, I saw that but I thought this behavior was normal. lol

vit9696 commented 4 years ago

Right, slide=0 is hidden from the OS. Please remove that boot argument, however; it may make the system unbootable.

LuanSala commented 4 years ago

I put the slide=0 before trying to resolve the panic. Now I just removed that boot argument. opencore-2020-04-01-113129.txt

otherwhaler commented 4 years ago

Here are my results after clearing NVRAM, turning off NVRAM logging and turning on file logging. There are several logs because the first few failed with ERROR allocating 0xa00 pages… errors (unfortunately also normal for this motherboard, hopefully I can find a CustomSlide setup to minimize it eventually, but I don't think it's related to this kernel panic). It took a few tries to get a boot that actually got into the OS. But when it did, it still ended up at the same kernel panic.

EFI+logs.zip

IMG_9810-web

Hardware:

vit9696 commented 4 years ago

Could you please try this version: https://github.com/acidanthera/bugtracker/issues/491#issuecomment-607826914? Will need logs.

LuanSala commented 4 years ago

The same kernel panic occurred for me.

opencore-2020-04-02-150319.txt

Do I always need to attach a picture of the kernel panic here?

vit9696 commented 4 years ago

Thanks. Looks like we have several different issues piling here.

Do I always need to attach a picture of the kernel panic here?

Yes, sorry. While it may look similar, the nuances (e.g. the error code value) tell quite a lot about what exactly happened, so we need this information. I suspect that in your memory map our driver is not present as RT code, and if this is true, it is really, really crazy.

Could you please do the following:

  1. Run Shell bypassing OpenCore, e.g. you can do this by saving it to a USB flash as EFI\BOOT\BOOTx64.efi and launching via BIOS boot menu.
  2. Save memmap to a text file, like this: memmap > fs0:\memmap_pre.txt.
  3. Load OpenRuntime, like this: load fs0:\EFI\OC\Drivers\OpenRuntime.efi, and take a pic of the load address it prints or just write that address down.
  4. Save new memmap to a text file, like this: memmap > fs0:\memmap_post.txt.

Post the memory maps and image load address here.
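
The check these steps make possible can be sketched in Python: find which memory map region contains the driver's load address and look at its type. The line format parsed here and the type names (RT_Code, BS_Code, Available) are assumptions about what the Shell's memmap command prints, and the sample map and addresses are hypothetical, not taken from any log in this thread.

```python
import re

# Assumed line shape: "<Type> <start hex>-<end hex> <pages hex> <attributes hex>"
LINE = re.compile(
    r"^\s*(\S+)\s+([0-9A-Fa-f]+)-([0-9A-Fa-f]+)\s+([0-9A-Fa-f]+)\s+([0-9A-Fa-f]+)\s*$"
)

# Hypothetical sample, for illustration only.
SAMPLE = """\
Type       Start            End              # Pages          Attributes
BS_Code    0000000010000000-0000000010000FFF 0000000000000001 000000000000000F
RT_Code    0000000010001000-0000000010010FFF 0000000000000010 800000000000000F
Available  0000000010011000-000000001FFFFFFF 000000000000FFEF 000000000000000F
"""


def parse_memmap(text):
    """Return (type, start, end) tuples for every line matching the assumed shape."""
    regions = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            regions.append((m.group(1), int(m.group(2), 16), int(m.group(3), 16)))
    return regions


def region_type_at(regions, address):
    """Return the type of the region containing address, or None if unmapped."""
    for kind, start, end in regions:
        if start <= address <= end:
            return kind
    return None


regions = parse_memmap(SAMPLE)
print(region_type_at(regions, 0x10001000))  # address inside the RT_Code entry
```

Running this against the "post" map with the address printed by load would show whether the driver ended up in a runtime code region or somewhere else.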

LuanSala commented 4 years ago

Thanks for your support

I loaded OpenRuntime from https://github.com/acidanthera/bugtracker/issues/491#issuecomment-607826914

memmap_pre.txt memmap_post.txt

Image 'fs0:\EFI\OC\Drivers\OpenRuntime.efi' loaded at 86491000 - Success

vit9696 commented 4 years ago

Indeed, I can confirm the problem. Geez… it is even worse than in the other thread.

otherwhaler commented 4 years ago

Here are my results using build 7 from https://github.com/acidanthera/bugtracker/issues/491#issuecomment-608147674:

Panic log (stitched together from a couple video frames, sorry about the ghosting—let me know if you need better images or video): panic-stitched

Boot log: opencore-2020-04-03-015935.txt

memmap testing: IMG_6514-edit

memmap-pre-v7.txt memmap-post-v7.txt

vit9696 commented 4 years ago

Please check whether this version boots for you:

https://github.com/acidanthera/bugtracker/issues/491#issuecomment-608480592

Note the latest changes in config.plist. You need to enable SyncRuntimePermissions and RebuildAppleMemoryMap. If this boots for you, try booting with EnableWriteUnprotector disabled.

LuanSala commented 4 years ago

:tada: That is awesome work, @vit9696.

Either with EnableWriteUnprotector enabled or disabled macOS installer boots.

SyncRuntimePermissions + RebuildAppleMemoryMap + EnableWriteUnprotector = true

SyncRuntimePermissions + RebuildAppleMemoryMap = true, EnableWriteUnprotector = false

otherwhaler commented 4 years ago

That fails immediately before loading the OS for me, unfortunately.

opencore-2020-04-03-160418.txt config-mmap-r2.plist.txt

Looking through the change log, I don't think I'm missing any other changes for 0.5.7, although I might be. All of the other drivers are still at their 0.5.6 versions as well, in case that affects anything.

vit9696 commented 4 years ago

@LuanSala please ensure that you can boot fine with r5: https://github.com/acidanthera/bugtracker/issues/491#issuecomment-608618632. For your firmware you want this:

AvoidRuntimeDefrag - true
DevirtualiseMmio - true
DisableSingleUser - w/e
DisableVariableWrite - false
DiscardHibernateMap - false
EnableSafeModeSlide - true
EnableWriteUnprotector - false
ForceExitBootServices - false
ProtectCsmRegion - false
ProtectSecureBoot - false
ProtectUefiServices - true
ProvideCustomSlide - true
RebuildAppleMemoryMap - true
SetupVirtualMap - false
SignalAppleOS - false
SyncRuntimePermissions - true

vit9696 commented 4 years ago

@otherwhaler your firmware does not have MAT support, so you seem to have a different problem. Please try to boot with r5: https://github.com/acidanthera/bugtracker/issues/491#issuecomment-608618632. For your firmware you want this:

AvoidRuntimeDefrag - true
DevirtualiseMmio - false
DisableSingleUser - w/e
DisableVariableWrite - false
DiscardHibernateMap - false
EnableSafeModeSlide - true
EnableWriteUnprotector - true
ForceExitBootServices - true
ProtectCsmRegion - true
ProtectSecureBoot - false
ProtectUefiServices - false
ProvideCustomSlide - true
RebuildAppleMemoryMap - true
SetupVirtualMap - true
SignalAppleOS - false
SyncRuntimePermissions - true
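
For anyone translating the list above into config.plist form, here is a sketch using Python's standard plistlib to render it as a dictionary of booleans, in the same key/value style as the Booter quirks list posted earlier in this thread. One assumption is labeled in the code: DisableSingleUser is given as "w/e" above, so the value used here is an arbitrary choice.

```python
import plistlib

# Values copied from the list above for this firmware.
quirks = {
    "AvoidRuntimeDefrag": True,
    "DevirtualiseMmio": False,
    "DisableSingleUser": False,  # "w/e" in the list; False is an arbitrary choice
    "DisableVariableWrite": False,
    "DiscardHibernateMap": False,
    "EnableSafeModeSlide": True,
    "EnableWriteUnprotector": True,
    "ForceExitBootServices": True,
    "ProtectCsmRegion": True,
    "ProtectSecureBoot": False,
    "ProtectUefiServices": False,
    "ProvideCustomSlide": True,
    "RebuildAppleMemoryMap": True,
    "SetupVirtualMap": True,
    "SignalAppleOS": False,
    "SyncRuntimePermissions": True,
}

# Serialise to XML plist text; each entry becomes <key>Name</key> plus <true/> or <false/>.
xml_text = plistlib.dumps(quirks, sort_keys=True).decode("utf-8")
print(xml_text)
```

The printed dictionary body is what would go under the quirks section of the config; the surrounding plist header is only there because plistlib emits a complete document.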

If this still fails, please attach the boot log and the photo of the kernel panic. In addition to that please run the following test:

  1. Load UEFI Shell from the firmware bypassing OC (by e.g. storing it on USB flash under the name of EFI\BOOT\BOOTx64.efi and loading via BIOS boot menu)
  2. Type the following commands in the Shell:
    load fs0:\OpenRuntime.efi > fs0:\log1.txt
    fs0:\MmapDump.efi > fs0:\log2.txt

Note, fs0: will be your USB flash filesystem with the files from this archive: toolset-mmap-r5.zip. Upload the resulting logs to your message.

otherwhaler commented 4 years ago

That version also fails before getting to the OS. It just stops here: IMG_3257-edit

Boot attempt log: opencore-2020-04-03-202321.txt

Shell mmap logs: log1.txt log2.txt

vit9696 commented 4 years ago

I see the issue with r5. Fixed in master and attached here too:

OpenCore-mmap-r6.zip

otherwhaler commented 4 years ago

Unfortunately r6 still fails in about the same way as r5. The message onscreen is the same; the last couple lines of the boot log are slightly different:

opencore-2020-04-03-211701.txt

mmap logs again (the first one was run against the r6 OpenRuntime.efi in EFI/OC/Drivers): log3.txt log4.txt

vit9696 commented 4 years ago

A bit strange, but I think I see it. Please retry with this one (also committed to master):

[master 1f8a079] OcAppleBootCompatLib: Use the original GetMemoryMap for VM pool
 3 files changed, 12 insertions(+), 8 deletions(-)

OpenCore-mmap-r7.zip

LuanSala commented 4 years ago

@LuanSala please ensure that you can boot fine with r5: #491 (comment). For your firmware you want this: ...

@vit9696 Yeah, I can confirm, macOS boots fine with r5. :+1:

opencore-log.txt

otherwhaler commented 4 years ago

Okay, it's back to at least trying to boot the OS again. Still panics though.

Panic log: panic-mmap-r7

Boot log: opencore-2020-04-04-021403.txt

mmap logs: log5.txt log6.txt

vit9696 commented 4 years ago

Ok, I see the issue now, and actually it is very different from @LuanSala's, so I got really confused. You have MMIO regions marked as reserved for some reason:

OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000F00F8000-0x00000000F00F8FFF -> 0x0000000000000000 (4 KB)
OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000FED1C000-0x00000000FED1FFFF -> 0x0000000000000000 (16 KB)
OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000FF983000-0x00000000FF992FFF -> 0x0000000000000000 (64 KB)
OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000FFFFF000-0x00000000FFFFFFFF -> 0x0000000000000000 (4 KB)

As a result boot.efi does not assign them a virtual address and the kernel dies when trying to access them via physical address. A proper fix will require config.plist update, but here is a hardcoded version of the fix to test:

OpenCore-mmap-r10.zip
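
For reference, here is a Python sketch of how entries like the ones quoted above could be picked out of a boot log. The four log lines are copied from this comment; the regular expression, and the reading of the value after "->" as the assigned virtual address, are assumptions about the log format rather than documented behaviour.

```python
import re

LOG = """\
OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000F00F8000-0x00000000F00F8FFF -> 0x0000000000000000 (4 KB)
OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000FED1C000-0x00000000FED1FFFF -> 0x0000000000000000 (16 KB)
OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000FF983000-0x00000000FF992FFF -> 0x0000000000000000 (64 KB)
OCMM: Reserved  [RUN|   |  |  |  |  |  |  |  |   |  |  |  |  ] 0x00000000FFFFF000-0x00000000FFFFFFFF -> 0x0000000000000000 (4 KB)
"""

# Assumed shape: "OCMM: <type> [<attributes>] 0x<start>-0x<end> -> 0x<virtual>"
ENTRY = re.compile(
    r"^OCMM:\s+(\S+)\s+\[([^\]]*)\]\s+0x([0-9A-Fa-f]+)-0x([0-9A-Fa-f]+)\s+->\s+0x([0-9A-Fa-f]+)"
)


def runtime_regions_without_virtual_address(log):
    """Return (type, start, end, size_kb) for RUN entries whose virtual address is 0."""
    found = []
    for line in log.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue
        kind, attrs = m.group(1), m.group(2)
        start, end, virt = (int(m.group(i), 16) for i in (3, 4, 5))
        if "RUN" in attrs and virt == 0:
            found.append((kind, start, end, (end - start + 1) // 1024))
    return found


for kind, start, end, size_kb in runtime_regions_without_virtual_address(LOG):
    print(f"{kind} {start:#x}-{end:#x} ({size_kb} KB)")
```

The computed sizes agree with the ones printed in the log (4, 16, 64 and 4 KB), which is a quick sanity check that the ranges were parsed as inclusive.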

otherwhaler commented 4 years ago

Progress! It no longer kernel panics! However, it does get stuck shortly after that point, with the last message usually regarding APFS load, which may or may not be related. Out of curiosity I tested this with both APFSDriverLoader.efi and with just apfs.efi, but no luck so far.

IMG_3864-edit

opencore-2020-04-04-151732.txt

log7.txt log8.txt

vit9696 commented 4 years ago

Well, the rest of the issues are part of your misconfiguration. For example, enable IgnoreInvalidFlexRatio. You may also want other kernel patches and such for your system.

vit9696 commented 4 years ago

Master has this quirk under the name of ProtectMemoryRegions. Enjoy =)

otherwhaler commented 4 years ago

Okay, will see what I can do about the other stuff. Thanks for your help!

Drone4zone commented 1 year ago

Hello. I have the same Intel DX79SR board, which I cannot seem to get booted with OpenCore. It works solidly with Clover running macOS Mojave. My goal is to upgrade it to Monterey. Something seems to be stopping my progress, and I've tried for weeks now without getting into the installer.

mikebeaton commented 1 year ago

I'm not aware of all the details which might be relevant here, but just to note that OpenCore now has its own NVRAM emulation driver, OpenVariableRuntimeDxe. Search for this in the OpenCore documentation for info on how to set it up. HTH.

Drone4zone commented 1 year ago

I have adjusted my drivers and now get a kernel panic; I cannot get past this to boot the Big Sur installer.

OpenCore-dx79sr-BigSir-install.zip