Closed twitchyliquid64 closed 5 years ago
Merging #120 into master will increase coverage by 0.04%. The diff coverage is 75.75%.
@@ Coverage Diff @@
## master #120 +/- ##
==========================================
+ Coverage 65.98% 66.02% +0.04%
==========================================
Files 41 41
Lines 4066 4130 +64
==========================================
+ Hits 2683 2727 +44
- Misses 1116 1133 +17
- Partials 267 270 +3
Impacted Files | Coverage Δ | |
---|---|---|
exec/internal/compile/backend_amd64.go | 77.9% <75.75%> (-1.57%) | :arrow_down: |
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update eda5438...64664a9.
ah, also: could you put the before/after speed improvements in one of the commit messages? (probably using the benchmarks and golang.org/x/perf/cmd/benchstat)
I've put the benchmark in the commit message, but I will note that benchmarks will vary a lot depending on the cache size and processor generation (especially here, where we are emitting raw assembly).
For instance, the interpreted benchmark is roughly the same between my laptop (8th gen, 8 MB cache) and desktop (3rd gen, 6 MB cache), but the native execution benchmarks are all ~15% faster on the laptop. This is largely attributable to a better uop cache, instruction fusion, larger instruction-fetch blocks, a larger reorder buffer, etc. These differences will get larger the further we move from reading/writing memory (as with the current approach, where intermediate values are written into the stack slice) toward storing intermediates in registers.
Pairs of instructions that can be rewritten into a single amd64 instruction (i.e. basic operations that accept an immediate value as an operand) are rewritten into that form. For example, an `i64.Const` + `i64.Add` pair produces a single instruction: `ADDQ <register>, <immediate value>`.
Opportunistically keep track of the size of the stack in a register rather than always reading from memory (the value is flushed if necessary in the postamble).
Opportunistically cache a pointer to the stack and/or local backing array in a register rather than always reading from memory. As this value is immutable, we don't need to flush it back in the postamble.