osy / HaC-Mini

Intel NUC Hades Canyon Hackintosh support
MIT License

Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 #1

Closed osy closed 4 years ago

osy commented 5 years ago

After updating to 10.14.5 Beta 2, kernel panics on boot if GPU acceleration is enabled.

Crash is in AMDRadeonX4000 but replacing AMDRadeonX4000HWLibs with the one from Beta 1 resolves the issue.

Running bindiff on the two versions of HWLibs identifies the following function changes (data-only changes and constant operand changes are not identified by bindiff).

A number of functions have bzero added to clear the stack data before use:

- _SW_SMUM_Dpm_SetMinDeepSleepDceFClk
- _SW_SMUM_Dpm_SetWorkloadPolicy
- _SW_SMUM_Dpm_DisableUclkFastSwitching
- _SW_SMUM_Dpm_EnableUclkFastSwitching
- _SW_SMUM_Dpm_GetCurrentDpm
- _SW_SMUM_Dpm_ForceDpmLevel
- _SW_SMUM_Dpm_SetClockLimit
- _SW_SMUM_Dpm_GetClockLimit

Several hardware-specific changes that may or may not be used by the Polaris22 path:

- _gc_9_1_init_gfx_power_gating
- _gc_10_1_get_gb_addr_config_default
- _greenland_update_hw_virtualization_settings
- _Cail_Bonaire_UpdateMultimediaClockGating
- _PhwCIslands_BugCheckRegisterDump (added register write)
- _PhwCIslands_Initialize
- _PhwPolaris10_Initialize

Some "suspicious" changes:

- _CailIdentifyCrossDisplayAndXGP: check (ulong)(*(int *)(lParm1 + 0x198) - 0x41U < 0x40) for "EnableXDSupport"
- _PEM_CWDDEPM_AdjustPowerOptimizationSettings: added blob of code
- _CAILFullResetSupport: new branch for (*(int *)(lParm1 + 0x19c) - 0x41U < 0x40)
- _Cail_MCILQuerySystemInfo: old: 0,1; new: 0,2
- AtiPowerPlayServices::ppInitialize(PPDisplayConfiguration *): call to _PP_Initialize => now calls _PP_InitializeEX
- _CailReadinRegistryFlags: check (uint)(0x3f < *(int *)(lParm1 + 0x198) - 0x41U) for "DisableFBCSupport"
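
The `x - 0x41U < 0x40` comparisons in the decompiled output look like the usual compiler idiom for a range check done with a single unsigned compare. Below is a minimal Python sketch of that equivalence, assuming 32-bit unsigned wraparound. The helper names are illustrative and not taken from HWLibs, and what the field at offset 0x198 actually holds is not known from this thread.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned arithmetic

def in_range_unsigned_trick(x: int) -> bool:
    """Mirror of `x - 0x41U < 0x40` with 32-bit unsigned wraparound."""
    return ((x - 0x41) & MASK32) < 0x40

def in_range_plain(x: int) -> bool:
    """The equivalent explicit range check: 0x41 <= x <= 0x80."""
    return 0x41 <= x <= 0x80

# The two forms agree over a window around the range and for large values.
for value in list(range(0, 0x200)) + [0x80000000, 0xFFFFFFFF]:
    assert in_range_unsigned_trick(value) == in_range_plain(value)
```

The `0x3f < x - 0x41U` form used for "DisableFBCSupport" is simply the negation of the same check, so it is true when the value falls outside 0x41..0x80.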

Other changes that seem benign:

- _PEM_CWDDEPM_PMLogControl: removed _PECI_LockPowerPlayOnly/_PECI_UnlockPowerPlayOnly
- _PHM_CollectDbgInfo: added indirect call at end
- _CailSaveCailInitInfo: added a register copy
- _PEM_CWDDEPM_GetODDefaultPerformanceLevels: added assertion
- __ZN20AtiPowerPlayServicesC2EP18PowerPlayCallbacks: assertion changes, maybe more
- __ZN21AtiPowerPlayInterface25createPowerPlayServiceForEP18PowerPlayCallbacks: assertion changes
- __ZN25AtiApplePowerTuneServices23createPowerTuneServicesEP11PP_InstanceP18PowerPlayCallbacks: added Navi support
- __ZN20AtiAppleMcilServices9obtainIriEPvP22_MCIL_IRI_OBTAIN_INPUTP23_MCIL_IRI_OBTAIN_OUTPUT: removed "ATY,CAIL_IRI"

Individually patching out each change identified here to revert to the old behaviour does not appear to fix the issue.

As a workaround, we load the Beta 1 HWLibs, which works normally but may break in a future macOS update.

osy commented 4 years ago

This issue is showing up in 10.15.1 and the workaround no longer works.

desert0616 commented 4 years ago

I am facing a very similar issue on my MacBook Pro 2015 (rx450). Although the GPU in the NUC8 is named Vega, it should be based on the older architecture, so I think it is a driver bug for Polaris.

jasanders commented 4 years ago

Will this change in 10.15.4 Beta 1 impact this bug? https://www.reddit.com/r/hackintosh/comments/ezr3e5/macos_10154_beta_1_gives_back_drm_to_polaris/

osy commented 4 years ago

Interesting, we’ll have to see.

osy commented 4 years ago

Figured out the issue. Polaris22_UploadSMUFirmwareImageDefault calls PECI_IsEarlySAMUInitEnabled to check whether the SMU firmware can be loaded directly. PECI_IsEarlySAMUInitEnabled looks at bit 0x160 of CAIL_DDI_CAPS_POLARIS22_A0, which should be 0 but is 1, so the firmware is never loaded. Patching the function to return 0 fixes it.
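
To illustrate the check described above, here is a rough Python sketch of testing one bit in a capability table, plus the bytes for a "return 0" stub. It assumes the caps table is a byte array indexed by bit number with the least significant bit first, which is a guess about the layout. `caps_bit` and the sample tables are hypothetical. The x86-64 encoding is standard: `xor eax, eax; ret` is `31 C0 C3`.

```python
def caps_bit(caps: bytes, bit_index: int) -> int:
    """Return bit `bit_index` of a caps table (byte = index // 8, LSB first)."""
    return (caps[bit_index // 8] >> (bit_index % 8)) & 1

EARLY_SAMU_BIT = 0x160  # bit index named in the comment above

# Hypothetical 64-byte tables: one all-zero, one with bit 0x160 set.
caps_clear = bytes(64)
caps_set = bytearray(64)
caps_set[EARLY_SAMU_BIT // 8] |= 1 << (EARLY_SAMU_BIT % 8)

# x86-64 stub that makes a function return 0 immediately:
#   31 C0   xor eax, eax
#   C3      ret
RETURN_ZERO_STUB = bytes([0x31, 0xC0, 0xC3])
```

Where exactly such a stub would be written (the offset of PECI_IsEarlySAMUInitEnabled in a given HWLibs build) is not stated here and would need to be found per version.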

It only worked before by chance: AtiAppleCailServices::isAsicCapEnabled was updated to include Polaris22 settings. Previously there was no Polaris22 entry, so the capability defaulted to 0.
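
The "worked by chance" behaviour can be pictured as a per-ASIC lookup that falls back to 0 when an ASIC has no entry. This is a toy Python sketch only: the table contents and the `is_cap_enabled` helper are made up for illustration and do not reflect the real structure of AtiAppleCailServices::isAsicCapEnabled.

```python
def is_cap_enabled(table: dict, asic: str, bit_index: int) -> int:
    """Look up a capability bit; ASICs missing from the table read as 0."""
    caps = table.get(asic, set())
    return 1 if bit_index in caps else 0

# Hypothetical contents. Before the update there is no Polaris22 entry,
# so every capability for it reads as 0.
old_table = {"Polaris10": {0x160}}

# After the update Polaris22 has its own entry with the bit set.
new_table = {"Polaris10": {0x160}, "Polaris22": {0x160}}
```

With the old table the Polaris22 lookup returns 0 and the firmware load proceeds. With the new table it returns 1, which matches the failure seen after the update.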