golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.33k stars 17.7k forks source link

cmd/compile: loop optimization #24240

Open MichaelTJones opened 6 years ago

MichaelTJones commented 6 years ago

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

master (1.11 dev)

Does this issue reproduce with the latest release?

this is a compilation strategy proposal

What operating system and processor architecture are you using (go env)?

amd64 / Darwin (macbook pro)

What did you do?

I have code that iterates without referencing the loop variable. Changing that from:

for i := 0; i < n; i++ { // no access to i }

...to...

for i := -n; i < 0; i++ { // no access to i }

saves me 4%-5% in my test cases.

Looking at the .S files, I see that there are two gains (as expected): one is that the loop limit needs only be addressable and referenced at the start of the loop--which frees a register over the body of the loop; and also, because the test is against zero and there are dedicated instructions for that rather than loading immediate literals or otherwise.

It seems (from afar) that there could be a rule along the lines of:

if loop variable not referenced in body and assignment has block structure (i:=) and type of loop variable is signed and incr RHS does not depend on additive offset to variable (i++ ok, i*=2 not), then transform "for v := low; v OP high; incr" into "for v := low-high; v OP 0; incr"

I don't have any comprehensive data about how common this situation may be, but it does help.

What did you expect to see?

no change because i expected the compiler to be doing this.

What did you see instead?

5% speedup

rasky commented 6 years ago

Since you have the code handy, can you provide a working example that includes a benchmark?

MichaelTJones commented 6 years ago

i can't share that particular code...but i can and will build a test to share

On Sun, Mar 4, 2018 at 12:24 PM, Giovanni Bajo notifications@github.com wrote:

Since you have the code handy, can you provide a working example that includes a benchmark?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/golang/go/issues/24240#issuecomment-370260710, or mute the thread https://github.com/notifications/unsubscribe-auth/AHgypbiFJkzX-7c-COnX95uFPgAafJqXks5tbE1jgaJpZM4SbYI5 .

-- Michael T. Jones michael.jones@gmail.com

RalphCorderoy commented 6 years ago

When n is negated, if that's the style chosen to re-write the loop, then look out for n starting off as the smallest possible value as two's complement representation is asymmetrical.

RLH commented 6 years ago

Isn't the decrement of i followed by a jz (or jnz) another way to write this. for i := n; i != 0; i-- { }

RalphCorderoy commented 6 years ago

Writing assembly for various architectures, a pre-decrement test against zero is the normal way I'm used to seeing, yes. But I wanted to warn against a flaw in the OP's suggestion.

MichaelTJones commented 6 years ago

thank you for the advice.

On Sun, Mar 4, 2018 at 2:41 PM, Ralph Corderoy notifications@github.com wrote:

Writing assembly for various architectures, a pre-decrement test against zero is the normal way I'm used to seeing, yes. But I wanted to warn against a flaw in the OP's suggestion.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/golang/go/issues/24240#issuecomment-370271099, or mute the thread https://github.com/notifications/unsubscribe-auth/AHgypXq7fJVfP__QGo8lu4smKk9MLp6Fks5tbG18gaJpZM4SbYI5 .

-- Michael T. Jones michael.jones@gmail.com

RalphCorderoy commented 6 years ago

Something else to ponder in implementing this... I hear some folk use a debugger; might they try and inspect the loop's variable and be confused that it's changed sign, or running in the opposite direction? Or perhaps DWARF can represent the transformation for the debugger to reverse? If other compilers have implemented this for other languages then they might give ideas.

dgryski commented 6 years ago

OTOH, stepping through C code that's been built withgcc -O3 is pretty painful. It's fine as long as you're reading the assembly and not trying to match it up to the C code that produced it.

MichaelTJones commented 6 years ago

Ok, here is a simplified and extracted example. I called it "loop_test.go"

https://play.golang.org/p/uRdymTPtTCi

Alas, i don't see such a big difference in this case. The code is not the same, not so much register pressure.

Summary: maybe this is not so important.

RalphCorderoy commented 6 years ago

I think it's still worth trying to measure gains across more tests, especially as some platforms have more register pressure than others. But I've also a niggling recollection that this has been suggested before and turned down because of debugger confusion, probably by @rsc. :-)