osy / Polaris22Fixup

Metal driver patches for Vega M
MIT License

Add kPECI_IsEarlySAMUInitEnabled Patch to support Monterey #16

goodbest closed 3 years ago

goodbest commented 3 years ago

Tested under 10.15.7/11.5.1/12.0b4/12.0b5. Lilu 1.5.1 or later is required. Thanks @dgsga, @vit9696, and others for the help. This should solve https://github.com/osy/Polaris22Fixup/issues/15

There are two things to fix in 12.0 beta 4.

Here is a compiled kext to test. Polaris22Fixup.kext.zip

osy commented 3 years ago

@goodbest I'm wondering if the binary patch is too fragile, as the compiler often changes register allocation and instruction order (we saw that with the Metal patch, which often broke after a major update). Since the number of bytes matched here is large, the pattern may also change in minor and/or beta updates. Any idea if this code is the same in other versions (10.15, 11.0, 11.5, etc.)? If not, it may be worth getting hex dumps of the code from several macOS versions, finding only the bytes that are the same in each, and doing a masked search (ignoring the bytes that change between updates).
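To make the masked-search idea concrete, here is a minimal sketch in plain C. It is not Lilu's API and not the code used in this PR; the function name `masked_find` and its parameters are hypothetical. Each mask byte selects which bits of the corresponding pattern byte must match, so bytes that differ between macOS versions can be given a mask of 0x00.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first position in `data`
 * where every byte equals `pattern` on the bits selected by `mask`,
 * or -1 if no such position exists. A mask byte of 0x00 is a wildcard. */
static ptrdiff_t masked_find(const uint8_t *data, size_t data_len,
                             const uint8_t *pattern, const uint8_t *mask,
                             size_t pattern_len)
{
    if (pattern_len == 0 || pattern_len > data_len)
        return -1;

    for (size_t i = 0; i + pattern_len <= data_len; i++) {
        size_t j = 0;
        /* XOR exposes the differing bits; AND keeps only the bits we care about. */
        while (j < pattern_len && ((data[i + j] ^ pattern[j]) & mask[j]) == 0)
            j++;
        if (j == pattern_len)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

The mask itself could be derived by comparing the same function across several dumps and zeroing every position where the bytes disagree. A caller would probably also want to confirm that the pattern matches exactly once before patching anything.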

goodbest commented 3 years ago

The piece of code to be patched is unchanged across 10.15.7/11.5/12.0b5 (not sure if it's the same in 11.0~11.2). Although the binary code could still change in a future major update, I don't think we need to be too worried about it.

Besides patching the function to return 0, as described in your readme.md, we could also try patching bit 160 of CAIL_DDI_CAPS_POLARIS22_A0 to 0.

However, I found two kinds of values in the kext. Can you point out exactly which bit is the 160th bit to patch? (I'm unsure because of big/little endianness.)

osy commented 3 years ago

@goodbest I did some digging. _PECI_IsAsicCapEnabled calls AtiAppleCailServices::isAsicCapEnabled.

AtiAppleCailServices::isAsicCapEnabled calls AtiPowerPlayServices::getCallbacks which returns the PowerPlayCallbacks created by BaffinPowerPlayManager::populatePowerPlayCallbackStructure (found in AMD9500Controller.kext) which sets field 0x28 to 0.

Since field_0x28 is 0, AtiAppleCailServices::isAsicCapEnabled will call findDeviceAsicCaps, which finds the table entry matching the current card (CAIL_DDI_CAPS_POLARIS22_A0, with no leading underscore) and caches it in field_0x28.

The logic for finding the right bit is

    local_c = (uint)((*(uint *)(*(long *)&callbacks->field_0x28 + (param_2 >> 5) * 4) &
                     1 << ((byte)param_2 & 0x1f)) != 0);

which simplifies to

    callbacks->field_0x28[param_2 / 32] & (1 << (param_2 % 32))

where field_0x28 is an int*.
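The lookup above can be sketched as a small helper that turns a capability index into a word index, a byte offset, and a bit mask. This is only an illustration: the names `cap_bit_location`, `locate_cap_bit`, and `is_cap_enabled` are hypothetical, and the byte-level fields assume a little-endian layout (as on x86-64). Note that an index of 0x160 maps to byte 0x2c, while decimal 160 maps to byte 0x14, so it matters which base the "160" is read in.

```c
#include <stdint.h>

/* Hypothetical description of where one capability bit lives in the table. */
typedef struct {
    uint32_t word_index;  /* index into the table viewed as uint32_t[]     */
    uint32_t word_mask;   /* mask of the bit inside that 32-bit word       */
    uint32_t byte_offset; /* byte holding the bit (little-endian only)     */
    uint8_t  byte_mask;   /* mask of the bit inside that byte              */
} cap_bit_location;

static cap_bit_location locate_cap_bit(uint32_t cap_index)
{
    cap_bit_location loc;
    loc.word_index  = cap_index >> 5;                    /* cap_index / 32 */
    loc.word_mask   = UINT32_C(1) << (cap_index & 0x1f); /* cap_index % 32 */
    /* On little-endian, bit k of a word sits in byte k / 8 of that word,
     * so the overall byte offset collapses to cap_index / 8. */
    loc.byte_offset = cap_index >> 3;
    loc.byte_mask   = (uint8_t)(1u << (cap_index & 7));
    return loc;
}

/* Same test as the decompiled expression, on a table of 32-bit words. */
static int is_cap_enabled(const uint32_t *caps, uint32_t cap_index)
{
    cap_bit_location loc = locate_cap_bit(cap_index);
    return (caps[loc.word_index] & loc.word_mask) != 0;
}
```

For example, `locate_cap_bit(0x160)` gives word 11, byte offset 0x2c, and byte mask 0x01, so clearing that bit turns a byte of 0x03 into 0x02.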

This means we can patch byte 0x14 to be 0.

tl;dr:

Find the symbol CAIL_DDI_CAPS_POLARIS22_A0, go to byte offset 0x2c, and set that single byte to 2.

goodbest commented 3 years ago

Following @osy's comment, in order to directly patch the 160th bit of CAIL_DDI_CAPS_POLARIS22_A0 to 0:

I've confirmed that it's working under 12.0b5. (I believe it's also working under 10.15/11.x)

As you don't have symbols for the CAPS value... in Monterey, a binary patch is still needed.

static const uint8_t kCAIL_DDI_CAPS_POLARIS22_A0Original[] = {
    0x05, 0x00, 0x80, 0x00, 0xFE, 0x11, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x11, 0x00, 0x02, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x68, 0x00, 0x00, 0x40, 0x29, 0x02, 0x40, 0x00, 0x00, 0x01, 0x01, 0x8A, 0x62, 0x10, 0x86, 0xA2, 0x41,
    0x00, 0x00, 0x00, 0x22, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00,
};

static const uint8_t kCAIL_DDI_CAPS_POLARIS22_A0Patched[] = {
    0x05, 0x00, 0x80, 0x00, 0xFE, 0x11, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x11, 0x00, 0x02, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x68, 0x00, 0x00, 0x40, 0x29, 0x02, 0x40, 0x00, 0x00, 0x01, 0x01, 0x8A, 0x62, 0x10, 0x86, 0xA2, 0x41,
    // The only change is the fifth byte of this row (offset 0x2C): 0x03 -> 0x02.
    0x00, 0x00, 0x00, 0x22, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00,
};
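For illustration, applying a find/replace pair like the two arrays above to a loaded image could look roughly like the sketch below. It is written in plain C with a hypothetical helper name, `find_and_replace`, and is not the Lilu patcher API that the kext actually goes through.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: overwrite every non-overlapping occurrence of the
 * n-byte sequence `find` in `buf` with `replace`, and return how many
 * occurrences were patched. */
static size_t find_and_replace(uint8_t *buf, size_t buf_len,
                               const uint8_t *find, const uint8_t *replace,
                               size_t n)
{
    size_t count = 0;

    if (n == 0 || n > buf_len)
        return 0;

    for (size_t i = 0; i + n <= buf_len; ) {
        if (memcmp(buf + i, find, n) == 0) {
            memcpy(buf + i, replace, n);
            count++;
            i += n;   /* skip past the bytes just patched */
        } else {
            i++;
        }
    }
    return count;
}
```

With the arrays from this comment, the call would be `find_and_replace(image, image_len, kCAIL_DDI_CAPS_POLARIS22_A0Original, kCAIL_DDI_CAPS_POLARIS22_A0Patched, sizeof(kCAIL_DDI_CAPS_POLARIS22_A0Original))`, where `image` and `image_len` are placeholders for the kext's data in memory. Treating any result other than 1 as a failure would be a reasonable safeguard against patching the wrong table.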

@osy Which patching approach do you prefer?

The binary patch for the function, or the one for the CAPS table? And should we keep using the old symbol and function-cast method on Big Sur and below?

osy commented 3 years ago

Sorry, I made a mistake in my last post. The byte should be patched to 0x2, not 0x0, because it was originally 0x3 (the two lowest bits are set) and we only need to clear the lowest one.

I think the CAPS patch is better because Apple is less likely to change that data in the future. We can use it for all versions to keep things cleaner.

ghost commented 3 years ago

I have applied this latest CAPS patch to my fork of Polaris22Fixup and can confirm it works flawlessly on Big Sur 11.5.1 and Monterey B4. @osy & @goodbest, thanks for the great detective work...