Generate x86 Assembly with Go
avo makes high-performance Go assembly easier to write, review and maintain. The avo package presents a familiar assembly-like interface that simplifies development without sacrificing performance:

avo programs are Go programs
avo assigns physical registers for you

For more about avo:
"Better x86 Assembly Generation with Go" at dotGo 2019 (slides)
filippo.io/edwards25519 assembly with avo
Discuss avo and general Go assembly topics in the #assembly channel of Gophers Slack

Note: APIs are subject to change while avo is still in an experimental phase. You can use it to build real things, but we suggest you pin a version with your package manager of choice.
Install avo with go get:

$ go get -u github.com/mmcloughlin/avo
avo assembly generators are pure Go programs. Here's a function that adds two uint64 values:
//go:build ignore

package main

import . "github.com/mmcloughlin/avo/build"

func main() {
	TEXT("Add", NOSPLIT, "func(x, y uint64) uint64")
	Doc("Add adds x and y.")
	x := Load(Param("x"), GP64())
	y := Load(Param("y"), GP64())
	ADDQ(x, y)
	Store(y, ReturnIndex(0))
	RET()
	Generate()
}
go run this code to see the assembly output. To integrate this into the rest of your Go package, we recommend a go:generate line to produce the assembly and the corresponding Go stub file.
//go:generate go run asm.go -out add.s -stubs stub.go
After running go generate, the add.s file will contain the Go assembly.
// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

#include "textflag.h"

// func Add(x uint64, y uint64) uint64
TEXT ·Add(SB), NOSPLIT, $0-24
	MOVQ x+0(FP), AX
	MOVQ y+8(FP), CX
	ADDQ AX, CX
	MOVQ CX, ret+16(FP)
	RET
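The `$0-24` on the TEXT line follows the Go assembler's `$framesize-argsize` convention: no local stack frame, and 24 bytes of arguments plus results. The offsets `x+0(FP)`, `y+8(FP)` and `ret+16(FP)` can be checked with a small sketch. The `addArgs` struct and `addLayout` helper below are hypothetical names used only to model the layout; they are not part of avo or of the generated code.

```go
package main

import (
	"fmt"
	"unsafe"
)

// addArgs is a hypothetical model of how the arguments and result of
// func Add(x, y uint64) uint64 sit relative to the FP pseudo-register.
type addArgs struct {
	x   uint64 // x+0(FP)
	y   uint64 // y+8(FP)
	ret uint64 // ret+16(FP)
}

// addLayout returns the byte offsets of y and ret, and the total size.
func addLayout() (yOff, retOff, size uintptr) {
	var a addArgs
	return unsafe.Offsetof(a.y), unsafe.Offsetof(a.ret), unsafe.Sizeof(a)
}

func main() {
	yOff, retOff, size := addLayout()
	fmt.Printf("y+%d ret+%d argsize=%d\n", yOff, retOff, size)
	// prints "y+8 ret+16 argsize=24"
}
```

The struct is only a convenient way to reproduce the arithmetic; avo derives the real offsets from the function signature passed to TEXT.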
The same call will produce the stub file stub.go, which will enable the function to be called from your Go code.
// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

package add

// Add adds x and y.
func Add(x uint64, y uint64) uint64
See the examples/add directory for the complete working example.
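Once the stub exists, the assembly-backed function can be checked like any other Go function. The following is a minimal sketch of a randomized comparison using the standard library's testing/quick package. To keep it self-contained, `Add` here is a pure-Go stand-in (an assumption for illustration); in the real package it would be the function declared in stub.go, and `checkAdd` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"testing/quick"
)

// Add is a pure-Go stand-in so this sketch compiles on its own.
// In the real package, Add is the assembly function declared in stub.go.
func Add(x, y uint64) uint64 { return x + y }

// checkAdd compares Add against the built-in + operator on random inputs.
// It returns nil when every generated input pair produces the same result.
func checkAdd() error {
	want := func(x, y uint64) uint64 { return x + y }
	return quick.CheckEqual(Add, want, nil)
}

func main() {
	if err := checkAdd(); err != nil {
		fmt.Println("mismatch:", err)
		return
	}
	fmt.Println("Add agrees with x + y on random inputs")
}
```

In a package test, the same `quick.CheckEqual` call would typically live in a `_test.go` file and report failures through `t.Fatal`.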
See examples for the full suite of examples.
Sum a slice of uint64s:
//go:build ignore

package main

import (
	. "github.com/mmcloughlin/avo/build"
	. "github.com/mmcloughlin/avo/operand" // provides Mem, Imm and LabelRef
)

func main() {
	TEXT("Sum", NOSPLIT, "func(xs []uint64) uint64")
	Doc("Sum returns the sum of the elements in xs.")
	ptr := Load(Param("xs").Base(), GP64())
	n := Load(Param("xs").Len(), GP64())

	Comment("Initialize sum register to zero.")
	s := GP64()
	XORQ(s, s)

	Label("loop")
	Comment("Loop until zero bytes remain.")
	CMPQ(n, Imm(0))
	JE(LabelRef("done"))

	Comment("Load from pointer and add to running sum.")
	ADDQ(Mem{Base: ptr}, s)

	Comment("Advance pointer, decrement byte count.")
	ADDQ(Imm(8), ptr)
	DECQ(n)
	JMP(LabelRef("loop"))

	Label("done")
	Comment("Store sum to return value.")
	Store(s, ReturnIndex(0))
	RET()
	Generate()
}
The result from this code generator is:
// Code generated by command: go run asm.go -out sum.s -stubs stub.go. DO NOT EDIT.

#include "textflag.h"

// func Sum(xs []uint64) uint64
TEXT ·Sum(SB), NOSPLIT, $0-32
	MOVQ xs_base+0(FP), AX
	MOVQ xs_len+8(FP), CX

	// Initialize sum register to zero.
	XORQ DX, DX

loop:
	// Loop until zero bytes remain.
	CMPQ CX, $0x00
	JE   done

	// Load from pointer and add to running sum.
	ADDQ (AX), DX

	// Advance pointer, decrement byte count.
	ADDQ $0x08, AX
	DECQ CX
	JMP  loop

done:
	// Store sum to return value.
	MOVQ DX, ret+24(FP)
	RET
Full example at examples/sum.
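When reading or testing the generated loop, it can help to have a pure-Go reference that follows the same steps. The sketch below is one possible reference, not code from the avo repository; `sumRef` is a hypothetical name. Note that the counter loaded from `xs_len` counts remaining elements, while the pointer advances by 8 bytes per element.

```go
package main

import "fmt"

// sumRef is a hypothetical pure-Go reference for Sum. It mirrors the
// generated loop: a running total, a position, and a remaining count.
func sumRef(xs []uint64) uint64 {
	var s uint64 // XORQ DX, DX
	i := 0       // position in xs, standing in for the pointer in AX
	n := len(xs) // remaining elements, standing in for CX
	for n != 0 { // CMPQ CX, $0x00 ; JE done
		s += xs[i] // ADDQ (AX), DX
		i++        // ADDQ $0x08, AX (one 8-byte element)
		n--        // DECQ CX
	}
	return s
}

func main() {
	fmt.Println(sumRef([]uint64{1, 2, 3})) // prints 6
	fmt.Println(sumRef(nil))               // prints 0
}
```

Unsigned overflow wraps in both the Go reference and the ADDQ instruction, so the two should agree on every input.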
For demonstrations of avo features:

complex{64,128} types.
DATA sections.

Implementations of full algorithms:

StadtX hash port from dgryski/go-stadtx.

Popular projects[^projects] using avo:
[^projects]: Projects drawn from the avo third-party test suite. Popularity estimated from GitHub star count collected on Nov 1, 2024.
golang / go :star: 123.8k
The Go programming language
klauspost / compress :star: 4.8k
Optimized Go Compression Packages
golang / crypto :star: 3k
[mirror] Go supplementary cryptography libraries
klauspost / reedsolomon :star: 1.9k
Reed-Solomon Erasure Coding in Go
bytedance / gopkg :star: 1.7k
Universal Utilities for Go
cloudflare / circl :star: 1.3k
CIRCL: Cloudflare Interoperable Reusable Cryptographic Library
segmentio / asm :star: 870
Go library providing algorithms optimized to leverage the characteristics of modern CPUs
zeebo / xxh3 :star: 406
XXH3 algorithm in Go
zeebo / blake3 :star: 398
Pure Go implementation of BLAKE3 with AVX2 and SSE4.1 acceleration
lukechampine / blake3 :star: 356
An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function
See the full list of projects using avo.
Contributions to avo are welcome:

Using avo in a real project is incredibly valuable. Consider porting an existing project to avo.

Inspired by the PeachPy and asmjit projects. Thanks to Damian Gryski for advice, and his extensive library of PeachPy Go projects.
avo is available under the BSD 3-Clause License.