AMD OS X Bugtracker

AMD CPUs have incorrect package topology in XNU #112

Open vit9696 opened 3 years ago

vit9696 commented 3 years ago

Describe the bug

macOS builds a CPU topology, i.e. the CPU structure, comprised of lcores, cores, dies, and packages.

Let's take the AMD Ryzen 9 5900X 12-Core Processor. This CPU has 12 cores with support for simultaneous multithreading (SMT, AMD's counterpart to Hyper-Threading), so it is supposed to have a list with 24 virtual cores and a list with 12 physical cores, each physical core linking to a list of 2 virtual cores. Since AMD follows a chiplet design, and each die is believed to have 6 cores, we also expect 2 dies, each pointing to a set of 6 physical and 12 virtual cores. Since this CPU is produced as a single-socket part and can only be installed into single-socket motherboards, there can only be 1 package.
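
To make the expected numbers explicit, here is a minimal sketch of the arithmetic. `ExpectedTopology` and `expectedTopology` are hypothetical names made up for illustration, not XNU structures, and the 2-die split is the assumption stated above.

```cpp
#include <cstdint>

// Hypothetical summary of the topology this issue expects; not an XNU type.
struct ExpectedTopology {
    uint32_t packages;
    uint32_t dies;
    uint32_t cores;
    uint32_t lcores;
    uint32_t coresPerDie;
    uint32_t lcoresPerDie;
};

// Derives the expected counts from the physical core count, threads per core,
// number of dies, and number of sockets.
constexpr ExpectedTopology expectedTopology(uint32_t cores, uint32_t threadsPerCore,
                                            uint32_t dies, uint32_t packages) {
    return ExpectedTopology{
        packages,
        dies,
        cores,
        cores * threadsPerCore,
        cores / dies,
        cores * threadsPerCore / dies,
    };
}

// Ryzen 9 5900X as described above: 12 cores, 2 threads each, 2 dies, 1 socket.
constexpr ExpectedTopology kRyzen5900X = expectedTopology(12, 2, 2, 1);
static_assert(kRyzen5900X.lcores == 24, "24 virtual cores expected");
static_assert(kRyzen5900X.coresPerDie == 6, "6 physical cores per die expected");
static_assert(kRyzen5900X.lcoresPerDie == 12, "12 virtual cores per die expected");
```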

For some reason this is not what I see, and the current patches provide an inadequate CPU topology in XNU as shown in https://github.com/acidanthera/bugtracker/issues/1625#issuecomment-832533136.

To Reproduce

Let's take this simple code and execute it in the kernel:

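        // Obtain the XNU power management callbacks, which expose the topology root.
        pmKextRegister(PM_DISPATCH_VERSION, NULL, &pmCallbacks);
        // Note: cc is never reset, so it is a running total of cores across packages.
        uint32_t cc = 0, pp = 0;
        auto pkg = pmCallbacks.GetPkgRoot();
        // Walk the package list, and for each package walk its core list.
        while (pkg != nullptr) {
            auto core = pkg->cores;
            while (core != nullptr) {
                cc++;
                core = core->next_in_pkg;
            }
            DBGLOG("rev", "calculated %u cores in pkg %u", cc, pp);
            pp++;
            pkg = pkg->next;
        }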

Expected behavior

What we expect it to print is:

calculated 12 cores in pkg 0

Actual behavior

Yet what it actually prints is:

calculated 2 cores in pkg 0
calculated 12 cores in pkg 1

I.e. we have 2 packages: one with 2 cores and another with 10 (the counter is a running total, so the 12 on the second line is 2 + 10). In XNU terms that means we installed 2 physical AMD Ryzen 9 5900X 12-Core Processor CPUs on one board, but one has only 2 cores working and the other has 10. This is plain wrong and makes no sense.
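
Because the snippet under To Reproduce never resets `cc`, its second log line is cumulative. A per-package variant could look like the sketch below. `MockCore` and `MockPkg` are hypothetical stand-ins that model only the three links the snippet uses (`cores`, `next_in_pkg`, `next`), so the function can be tried outside the kernel; they are not the real XNU types.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the XNU topology nodes; only the links used by
// the snippet in this issue are modelled.
struct MockCore {
    MockCore *next_in_pkg;
};

struct MockPkg {
    MockCore *cores;
    MockPkg *next;
};

// Returns the number of cores in each package, in list order.
// The counter restarts for every package instead of accumulating.
std::vector<uint32_t> countCoresPerPackage(const MockPkg *root) {
    std::vector<uint32_t> counts;
    for (const MockPkg *pkg = root; pkg != nullptr; pkg = pkg->next) {
        uint32_t cc = 0;
        for (const MockCore *core = pkg->cores; core != nullptr; core = core->next_in_pkg) {
            cc++;
        }
        counts.push_back(cc);
    }
    return counts;
}
```

Fed a list shaped like the one observed here (a 2-core package followed by a 10-core package), this returns `{2, 10}`, which makes the split visible directly.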

Additional context

This is particularly bad, because genuine configurations with multiple physical CPUs become indistinguishable from this one, and in RestrictEvents I actually had to hardcode AMD CPUID checks that assume only single-CPU configurations exist, counting multiple packages as parts of a single CPU.
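
The kind of workaround described here can be sketched roughly as follows. This is not the actual RestrictEvents code: `CpuSummary` and `summarizeCpus` are hypothetical names, and the only real-world constant is the `AuthenticAMD` CPUID vendor string.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type: what would be reported to the user.
struct CpuSummary {
    uint32_t packages;
    uint32_t cores;
};

// Sums the cores of all packages and, for AMD CPUs only, treats every package
// as part of one physical CPU. The trade-off from the issue applies: a genuine
// multi-socket AMD system would be collapsed to a single CPU as well.
CpuSummary summarizeCpus(const std::string &vendor,
                         const std::vector<uint32_t> &coresPerPackage) {
    uint32_t total = 0;
    for (uint32_t c : coresPerPackage) {
        total += c;
    }
    uint32_t packages = static_cast<uint32_t>(coresPerPackage.size());
    if (vendor == "AuthenticAMD" && packages > 1) {
        packages = 1; // assumption: AMD systems are single-socket
    }
    return CpuSummary{packages, total};
}
```

With the 2 + 10 split from this report, `summarizeCpus("AuthenticAMD", {2, 10})` yields 1 package with 12 cores, while a non-AMD vendor string leaves the package count untouched.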

The issue was discovered with @simonintense, who can provide more information about his system if necessary. Both the OpenCore log and the configuration are available in https://github.com/acidanthera/bugtracker/issues/1625#issuecomment-831602457.

trulyspinach commented 3 years ago

I noticed this a while ago when writing the power management kext. If I remember correctly, XNU uses APIC IDs to generate the topology map. However, according to AMD's documents and some of my own experiments, AMD CPUs have non-contiguous APIC IDs, unlike Intel's, which results in XNU generating an incorrect map.
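
As a rough illustration of why gaps in APIC IDs can matter, the sketch below groups logical CPUs by the upper bits of their APIC ID. Everything in it is an assumption for illustration: `groupByApicId` is a hypothetical helper, the `shift` value is not taken from XNU, and the IDs in the example are made up rather than read from a Ryzen CPU. It does not reproduce the exact 2 + 10 split reported above.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Groups logical CPUs by the upper bits of their APIC ID and returns the
// number of logical CPUs per group. `shift` is a hypothetical bit offset
// at which the group ID is assumed to start.
std::map<uint32_t, uint32_t> groupByApicId(const std::vector<uint32_t> &apicIds,
                                           uint32_t shift) {
    std::map<uint32_t, uint32_t> lcoresPerGroup;
    for (uint32_t id : apicIds) {
        lcoresPerGroup[id >> shift]++;
    }
    return lcoresPerGroup;
}
```

With contiguous IDs 0 to 7 and a shift of 3, all eight logical CPUs land in one group. With the made-up non-contiguous set 0 to 3 and 8 to 11, the same shift produces two groups of four, even though nothing about the physical CPU changed.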