open-power / hostboot

System initialization firmware for Power systems
Apache License 2.0
75 stars 97 forks source link

Wrong L3 [d|i]-cache-sets on POWER9 #154

Open tuliom opened 5 years ago

tuliom commented 5 years ago

POWER9 has an L3 that is 20-way set associative. This changed from POWER8 that had L1, L2 and L3 all with 8-way associativity.

However, after extracting DTS from HDAT, this is what we get:

        l3-cache@30000008 {
...
            d-cache-sets = <0x8>;
            i-cache-sets = <0x8>;
            d-cache-size = <0xa00000>;
            i-cache-size = <0xa00000>;
        };

Notice both L1 and L2 (8-way associative) have correct numbers:

        PowerPC,POWER9@8 {
...
            d-cache-size = <0x8000>;
            i-cache-size = <0x8000>;
            i-cache-sets = <0x8>;
            d-cache-sets = <0x8>;
...
        };
...
        l2-cache@20000008 {
...
            d-cache-sets = <0x8>;
            i-cache-sets = <0x8>;
            d-cache-size = <0x80000>;
            i-cache-size = <0x80000>;
...
        };
tuliom commented 5 years ago

I was discussing this topic with @segher and we didn't agree on what [d|i]-cache-sets mean. However, regardless of the this discussion, the L3 value on POWER9 is wrong.

Option 1 - N-way associativity [d|i]-cache-sets means associativity N from N-way associative. In that case, P9 L3 [d|i]-cache-sets should have 0x14 (20-way associative). All the other values for P8 and P9 (including L1 and L2) are correct.

Option 2 - Number of sets [d|i]-cache-sets should store the number of sets, i.e. sets = size / (N * line_size) If this case is correct, P9 L3 [d|i]-cache-sets is expected to have 0x1000. All the P8 and P9 values (including L1, L2 and L3) also have to be re-calculated.

Notice the Devicetree specification v0.2 specifies the following for d-cache-sets:

Specifies the number of associativity sets in the data cache.

Source: https://github.com/devicetree-org/devicetree-specification/releases/tag/v0.2

dcrowell77 commented 5 years ago

We've got the same value in our PHYP box as well so it is odd nobody has noticed anything wrong for this long now. The change is very simple to make, literally 1 line of target_types.xml, just want to make sure we don't break anybody downstream.

dcrowell77 commented 5 years ago

The P9 user manual says:

The L1 D-cache is the first level of cache available to load and store operations. It is 32 KB and organized as 8-way set-associative with 128-byte cache lines.

Where are you getting the 20-way value from?

tuliom commented 5 years ago

@dcrowell77, L1I, L1D and L2 are indeed 8-way associative on P9.

But according to the P9 user manual, section 8.3 (L3 Cache - List of features and resources), L3 is 20-way associative.

Anyway, notice the discussion around the definition of [d|i]-cache-sets. There is a chance it's being misused.

dcrowell77 commented 5 years ago

The skiboot folks control the devtree part of the equation so I can't speak much to that. Hostboot fills in the HDAT field of "Number of associativity sets in the data cache." which doesn't specifically say L1/L2/L3. Adding @stewart-ibm @ozbenh

ozbenh commented 5 years ago

On Mon, 2018-12-03 at 19:47 +0000, Dan wrote:

The skiboot folks control the devtree part of the equation so I can't speak much to that. Hostboot fills in the HDAT field of "Number of associativity sets in the data cache." which doesn't specifically say L1/L2/L3. Adding @stewart-ibm @ozbenh

Offset 0x44 in Cache Size Structure in the CUDA has the L2 associativity, but not the L3, so we assume they are identical.

Cheers, Ben.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

tuliom commented 3 years ago

Hi, any updates on this issue?

dcrowell77 commented 3 years ago

Tagging @klauskiwi , based on where the discussion is headed it might make sense to move this to the skiboot list.

klauskiwi commented 3 years ago

@hegdevasant looks something for you...

hegdevasant commented 3 years ago

@hegdevasant looks something for you...

Sorry. Somehow I missed this one.. As BenH mentioned earlier , we just assume same associativity for L2 and L3.. as we are not getting L3 associativity information from HDAT.

From skiboot code :

213 
214         /* Assume cache associavitity sets is same for L2, L3 and L3.5 */
215         dt_add_property_cells(node, "d-cache-sets",
216                               be32_to_cpu(cache->l2_cache_assoc_sets));
217         dt_add_property_cells(node, "i-cache-sets",
218                               be32_to_cpu(cache->l2_cache_assoc_sets));

I think fix should come from HDAT side..

-Vasant

dcrowell77 commented 3 years ago

@hegdevasant Are you proposing that HDAT needs a new field or that the current values are incorrect?

hegdevasant commented 3 years ago

@dcrowell77

@hegdevasant Are you proposing that HDAT needs a new field or that the current values are incorrect?

I think we need new field in HDAT for L3 associativity. Currently we are not getting that. Hence we just use L2 associativity for L3. So on P9 we are displaying L3 associativity as 8

-Vasant