Open klauspost opened 4 years ago
Ah, no, you are not missing something obvious. Sorry, there is not an easy way to do that right now.
I'll need to think about how to support that. PeachPy has an explicit representation of microarchitecture, where each one has a potentially different register set. Something like that could work.
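To make the PeachPy-style idea concrete, here is a minimal sketch of a table that maps each target microarchitecture to the vector registers it is allowed to use. It is only an illustration: `Microarch`, `vecRegCount`, and `AllowedVecRegs` are hypothetical names that exist in neither avo nor PeachPy, and registers are modeled as plain strings rather than avo's `reg` types.

```go
package main

import "fmt"

// Microarch is a hypothetical identifier for a target microarchitecture.
// Nothing in avo defines this type; it exists only for this sketch.
type Microarch string

const (
	Haswell  Microarch = "haswell"   // AVX2: 16 vector registers in 64-bit mode
	SkylakeX Microarch = "skylake-x" // AVX-512: 32 vector registers
)

// vecRegCount records how many vector registers each microarchitecture exposes.
var vecRegCount = map[Microarch]int{
	Haswell:  16,
	SkylakeX: 32,
}

// AllowedVecRegs returns the names of the vector registers that code for
// microarchitecture m may use, e.g. prefix "Y" yields Y0, Y1, ...
func AllowedVecRegs(m Microarch, prefix string) ([]string, error) {
	n, ok := vecRegCount[m]
	if !ok {
		return nil, fmt.Errorf("unknown microarchitecture %q", m)
	}
	regs := make([]string, n)
	for i := range regs {
		regs[i] = fmt.Sprintf("%s%d", prefix, i)
	}
	return regs, nil
}

func main() {
	regs, err := AllowedVecRegs(Haswell, "Y")
	if err != nil {
		panic(err)
	}
	// Print the size of the allowed set and its first and last members.
	fmt.Println(len(regs), regs[0], regs[len(regs)-1])
}
```

A register allocator could consult such a table to limit its candidate set per function, which is roughly what the question asks for.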
Here is a hack that "reserves" the extended AVX-512 registers (Z16 through Z31) so they are never allocated. Maybe use it only while debugging, though the runtime impact should be minimal.
// reserveExtended will reserve the extended registers (Z16 through Z31), but not use them.
// Call at the beginning of the function and call the returned function
// just before VZEROUPPER/RET.
func reserveExtended() func() {
	tmp := GP64()
	XORQ(tmp, tmp)
	// XORQ sets the zero flag, so this jump is always taken
	// and the moves below never execute.
	JZ(LabelRef("skip_extended"))
	extZMMs := []reg.VecPhysical{reg.Z16, reg.Z17, reg.Z18, reg.Z19, reg.Z20, reg.Z21, reg.Z22, reg.Z23, reg.Z24, reg.Z25, reg.Z26, reg.Z27, reg.Z28, reg.Z29, reg.Z30, reg.Z31}
	// Write each extended register so it becomes live from here on.
	for _, r := range extZMMs {
		MOVQ(tmp, r.AsX())
	}
	Label("skip_extended")
	return func() {
		tmp := GP64()
		XORQ(tmp, tmp)
		// Always jump, as above.
		JZ(LabelRef("skip_extended_end"))
		// Read each extended register so it stays live until this point.
		for _, r := range extZMMs {
			MOVQ(r.AsX(), tmp)
		}
		Label("skip_extended_end")
	}
}
Is there any way to restrict which YMM registers get allocated, so that, for instance, AVX2 code can only use YMM0 through YMM15?
Sorry if I am missing something obvious.