mmcloughlin / avo

Generate x86 Assembly with Go

Bug in avo -- MOVL should allow XMM registers as a destination. #436

Open davecgh opened 2 months ago

davecgh commented 2 months ago

To move a 32-bit value from memory into an SSE register, as discussed in this issue, the Go assembler requires MOVL. This differs from Intel/AT&T syntax, which uses MOVD and selects the operation size from the source operand.

With MOVD, the Go assembler emits a 64-bit load even though the operand is a pointer to a 32-bit value. However, avo rejects MOVL with a bad-operands error because an XMM destination is not marked as one of its accepted forms.

Here is a minimal example to show the incorrect behavior of MOVD and that MOVL is indeed the correct opcode:

foo.go:

//go:build ignore

package main

import (
    "github.com/mmcloughlin/avo/attr"
    . "github.com/mmcloughlin/avo/build"
)

func main() {
    TEXT("foo", attr.NOSPLIT, "func(n *uint32, out *[16]byte)")
    Pragma("noescape")

    n, _ := Dereference(Param("n")).Resolve()
    r := XMM()
    MOVD(n.Addr, r)

    out, _ := Dereference(Param("out")).Index(0).Resolve()
    MOVOU(r, out.Addr)
    RET()

    Generate()
}

main.go:

//go:generate go run foo.go -out foo_amd64.s -stubs foo_amd64.go -pkg main

package main

import "fmt"

func main() {
    n := uint32(0xaaaaaaaa)
    var out [16]byte
    foo(&n, &out)
    fmt.Printf("%032x\n", out)
}

$ go generate
$ go build && ./issue
aaaaaaaa9854c3000000000000000000     # notice the junk because it moved 64 bits
$ sed -i 's/MOVD/MOVL/' foo_amd64.s  # replace MOVD with MOVL in generated asm
$ go build && ./issue
aaaaaaaa000000000000000000000000     # correct result

The compiled code also shows that the first case, using MOVD, generates an incorrect MOVQ:

$ go tool objdump -s foo issue
  foo_amd64.s:8         0x48ad40                488b442408              MOVQ 0x8(SP), AX
  foo_amd64.s:9         0x48ad45                f30f7e00                MOVQ 0(AX), X0         <--------- MOVQ
  foo_amd64.s:10        0x48ad49                488b442410              MOVQ 0x10(SP), AX
  foo_amd64.s:11        0x48ad4e                f30f7f00                MOVDQU X0, 0(AX)
  foo_amd64.s:12        0x48ad52                c3                      RET

Whereas MOVL generates the correct MOVD:

$ go tool objdump -s foo issue
  foo_amd64.s:8         0x48ad40                488b442408              MOVQ 0x8(SP), AX
  foo_amd64.s:9         0x48ad45                660f6e00                MOVD 0(AX), X0    <--------- MOVD
  foo_amd64.s:10        0x48ad49                488b442410              MOVQ 0x10(SP), AX
  foo_amd64.s:11        0x48ad4e                f30f7f00                MOVDQU X0, 0(AX)
  foo_amd64.s:12        0x48ad52                c3                      RET
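
The two listings differ only in the leading bytes of the load instruction. A small sketch for telling them apart, assuming the standard SSE2 encodings F3 0F 7E (MOVQ xmm, xmm/m64) and 66 0F 6E (MOVD xmm, r/m32); `classify` is a hypothetical helper, not part of avo or the Go toolchain:

```go
package main

import (
	"bytes"
	"fmt"
)

// classify inspects the leading bytes of an encoded instruction and
// reports which of the two XMM load forms it is. It only recognizes
// the two encodings that appear in the objdump listings.
func classify(enc []byte) string {
	switch {
	case bytes.HasPrefix(enc, []byte{0xf3, 0x0f, 0x7e}):
		return "MOVQ xmm, xmm/m64 (64-bit load)"
	case bytes.HasPrefix(enc, []byte{0x66, 0x0f, 0x6e}):
		return "MOVD xmm, r/m32 (32-bit load)"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classify([]byte{0xf3, 0x0f, 0x7e, 0x00})) // bytes from the MOVD build
	fmt.Println(classify([]byte{0x66, 0x0f, 0x6e, 0x00})) // bytes from the MOVL build
}
```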