mmcloughlin / avo

Generate x86 Assembly with Go
BSD 3-Clause "New" or "Revised" License

reg: restricting YMM register usage #146

Open klauspost opened 4 years ago

klauspost commented 4 years ago

Is there any way to restrict YMM register allocation so that, for instance, AVX2 code can only use registers YMM0 through YMM15?

Sorry if I am missing something obvious.

mmcloughlin commented 4 years ago

Ah, no, you are not missing something obvious. Sorry, there is not an easy way to do that right now.

I'll need to think about how to support that. PeachPy has an explicit representation of microarchitecture, where each one has a potentially different register set. Something like that could work.
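
As a rough illustration of that idea (this is neither avo's nor PeachPy's actual API), a per-target register set could be modelled along these lines. The `Microarch` type, its fields, and the `AVX2`/`AVX512` values are hypothetical names made up for the sketch; the only facts assumed are that AVX2 code can address 16 vector registers in 64-bit mode and AVX-512 code can address 32.

```go
package main

import "fmt"

// Microarch is a hypothetical description of a code generation target.
// It is not part of avo; it only illustrates a per-target register set.
type Microarch struct {
	Name    string
	VecRegs int // number of addressable vector registers (IDs 0..VecRegs-1)
}

// Hypothetical targets: 16 vector registers for AVX2, 32 for AVX-512.
var (
	AVX2   = Microarch{Name: "avx2", VecRegs: 16}
	AVX512 = Microarch{Name: "avx512", VecRegs: 32}
)

// Allows reports whether vector register id may be used on this target.
func (m Microarch) Allows(id int) bool {
	return id >= 0 && id < m.VecRegs
}

// AllowedVec returns the register IDs an allocator would be allowed to pick from.
func (m Microarch) AllowedVec() []int {
	ids := make([]int, 0, m.VecRegs)
	for id := 0; id < m.VecRegs; id++ {
		ids = append(ids, id)
	}
	return ids
}

func main() {
	// Register 16 is off limits for AVX2 but fine for AVX-512.
	fmt.Println(AVX2.Allows(15), AVX2.Allows(16), AVX512.Allows(16))
	fmt.Println(len(AVX2.AllowedVec()), len(AVX512.AllowedVec()))
}
```

An allocator that consulted something like `AllowedVec` would never pick YMM16 and above for an AVX2 target, which is the restriction asked for here.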

klauspost commented 1 year ago

Here is a hack that "reserves" the AVX-512 registers. Probably best used only while debugging, though the runtime impact should be minimal.

// reserveExtended "reserves" the extended registers Z16-Z31 without really using them.
// Call it at the beginning of the function, and call the returned function
// just before VZEROUPPER/RET.
// Assumes a dot-import of github.com/mmcloughlin/avo/build and an import of
// github.com/mmcloughlin/avo/reg.
func reserveExtended() func() {
    tmp := GP64()
    // XORQ sets the zero flag, so the JZ below is always taken
    // and the moves are never executed.
    XORQ(tmp, tmp)
    JZ(LabelRef("skip_extended"))
    extZMMs := []reg.VecPhysical{reg.Z16, reg.Z17, reg.Z18, reg.Z19, reg.Z20, reg.Z21, reg.Z22, reg.Z23, reg.Z24, reg.Z25, reg.Z26, reg.Z27, reg.Z28, reg.Z29, reg.Z30, reg.Z31}
    // Write every extended register (via its X form).
    for _, z := range extZMMs {
        MOVQ(tmp, z.AsX())
    }
    Label("skip_extended")
    return func() {
        tmp := GP64()
        // Same always-taken jump as above.
        XORQ(tmp, tmp)
        JZ(LabelRef("skip_extended_end"))
        // Read every extended register back.
        for _, z := range extZMMs {
            MOVQ(z.AsX(), tmp)
        }
        Label("skip_extended_end")
    }
}
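
A toy model of the intended effect, assuming (as the hack appears to rely on) that the allocator will not assign a virtual register to a physical register that is already in use across the whole function. This is not avo's allocator: `allocate` and its parameters are hypothetical, and the model ignores liveness of the virtual registers themselves.

```go
package main

import "fmt"

// allocate is a toy stand-in for a register allocator. It gives each virtual
// register the lowest-numbered physical register that is neither reserved
// nor already handed out, and fails when none is left.
func allocate(virtuals []string, numPhysical int, reserved map[int]bool) (map[string]int, error) {
	assigned := make(map[string]int, len(virtuals))
	next := 0
	for _, v := range virtuals {
		// Skip over reserved physical registers.
		for next < numPhysical && reserved[next] {
			next++
		}
		if next >= numPhysical {
			return nil, fmt.Errorf("no free register for %s", v)
		}
		assigned[v] = next
		next++
	}
	return assigned, nil
}

func main() {
	// Reserve IDs 16..31, mirroring the Z16..Z31 list in the hack.
	reserved := make(map[int]bool)
	for id := 16; id < 32; id++ {
		reserved[id] = true
	}

	// Seventeen virtual registers cannot fit into the sixteen unreserved ones.
	virtuals := make([]string, 17)
	for i := range virtuals {
		virtuals[i] = fmt.Sprintf("v%d", i)
	}

	_, err := allocate(virtuals[:16], 32, reserved)
	fmt.Println("16 virtuals:", err)
	_, err = allocate(virtuals, 32, reserved)
	fmt.Println("17 virtuals:", err)
}
```

In this model, code that needs more than sixteen vector registers fails at generation time instead of quietly spilling into the extended set, which is why the hack is mainly useful while debugging.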