clamchowder / Microbenchmarks

Trying to figure various CPU things out
Apache License 2.0

ReturnStackTest may need extra alignment on Golden Cove #14

Closed jiegec closed 1 month ago

jiegec commented 3 months ago

In "Popping the Hood on Golden Cove", the return prediction results for Golden Cove look odd: there is no clear jump in latency as the call depth increases. This is possibly because too many call/ret pairs end up in the same cache line. The following change adds an alignment directive so that each generated function starts in its own cache line:

diff --git a/AsmGen/tests/ReturnStackTest.cs b/AsmGen/tests/ReturnStackTest.cs
index d940393..9d768cd 100644
--- a/AsmGen/tests/ReturnStackTest.cs
+++ b/AsmGen/tests/ReturnStackTest.cs
@@ -34,6 +34,7 @@ namespace AsmGen
                 {
                     string funcName = GetFunctionName(callDepth, callIdx);
                     sb.AppendLine($".global {funcName}");
+                    sb.AppendLine($".align 64");
                     sb.AppendLine($"{funcName}:");
                     if (callIdx < callDepth - 1)
                     {
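
To illustrate why the alignment could matter, here is a rough Python sketch (not part of AsmGen) that counts how many function entry points land in one 64-byte cache line with and without the `.align 64` directive. The 6-byte function size is an assumption for illustration only (a 5-byte `call rel32` plus a 1-byte `ret` on x86-64), and the helper names are hypothetical; the real generated functions may be laid out differently.

```python
CACHE_LINE = 64  # bytes; cache line size assumed for this sketch
FUNC_SIZE = 6    # hypothetical: 5-byte call rel32 + 1-byte ret


def entry_addresses(count, align=1, func_size=FUNC_SIZE):
    """Lay out `count` functions back to back, rounding each start up to `align`."""
    addrs = []
    addr = 0
    for _ in range(count):
        addr = (addr + align - 1) // align * align  # round up to the alignment
        addrs.append(addr)
        addr += func_size
    return addrs


def max_functions_per_line(addrs, line=CACHE_LINE):
    """Largest number of function entry points that share one cache line."""
    counts = {}
    for a in addrs:
        counts[a // line] = counts.get(a // line, 0) + 1
    return max(counts.values())


packed = entry_addresses(24)             # no alignment directive
aligned = entry_addresses(24, align=64)  # with .align 64 before each label

print("packed :", max_functions_per_line(packed))
print("aligned:", max_functions_per_line(aligned))
```

Under these assumptions, the unaligned layout packs many call/ret pairs into each line, while the aligned layout gives every function its own line.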

With this change, a clear jump appears once the call depth exceeds 20:

Return Stack Depth Test:
1,0.508000
2,0.168000
3,0.172000
4,0.166000
5,0.166000
6,0.198000
7,0.228000
8,0.186000
9,0.184000
10,0.182000
11,0.182000
12,0.180000
13,0.178000
14,0.190000
15,0.188000
16,0.188000
17,0.186000
18,0.184000
19,0.184000
20,0.182000
21,0.406000
22,0.440000
23,0.528000

This looks promising, since Sunny Cove has a 17-entry RAS.
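
As a rough way to locate the jump programmatically, the sketch below parses the output above and reports the last call depth before the time per call rises sharply. The 1.5x threshold and the function names are assumptions chosen for illustration, not part of the Microbenchmarks repository, and the result is only a heuristic estimate of the return stack depth.

```python
RESULTS = """\
1,0.508000
2,0.168000
3,0.172000
4,0.166000
5,0.166000
6,0.198000
7,0.228000
8,0.186000
9,0.184000
10,0.182000
11,0.182000
12,0.180000
13,0.178000
14,0.190000
15,0.188000
16,0.188000
17,0.186000
18,0.184000
19,0.184000
20,0.182000
21,0.406000
22,0.440000
23,0.528000
"""


def parse(text):
    """Turn 'depth,time' lines into a list of (int, float) tuples."""
    rows = []
    for line in text.strip().splitlines():
        depth, time = line.split(",")
        rows.append((int(depth), float(time)))
    return rows


def last_depth_before_jump(samples, factor=1.5):
    """Return the depth just before the first sample that exceeds
    `factor` times its predecessor, or None if no such jump exists."""
    for (prev_depth, prev_time), (_, time) in zip(samples, samples[1:]):
        if time > factor * prev_time:
            return prev_depth
    return None


samples = parse(RESULTS)
print(last_depth_before_jump(samples))
```

The high value at depth 1 does not trigger the check, because only upward steps relative to the previous sample are considered.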

edisonchan commented 3 months ago

Currently, the test results from the master branch seem to be very different from the results in CNC's articles.

image

jiegec commented 3 months ago

Currently, the test results from the master branch seem to be very different from the results in CNC's articles.

image

Yes. After adding the extra alignment, the results look like this:

image

clamchowder commented 3 months ago

Thanks, alignment indeed matters here, and I see 20 return stack entries on Redwood Cove. I'll get this into the master branch. Currently I'm a bit busy with other changes.

clamchowder commented 1 month ago

Increased padding between generated test functions and calls/returns