Open lyrachord opened 6 years ago
This is one of those arguments I lost. :-) That is, where to put the rounding/sae decorations. I never liked the suggested solutions. I also never really liked the {k1} syntax for the mask operands; to me the mask operands are just another operand. XED puts the round/sae decoration with the mask operand. The thinking was: the rounding/sae happens when writing the destination, so it was kind of in line with the destination operand.
% obj/examples/xed -64 -d 62 91 7e 18 2c ee
62917E182CEE
ICLASS: VCVTTSS2SI CATEGORY: CONVERT EXTENSION: AVX512EVEX IFORM: VCVTTSS2SI_GPR32i32_XMMf32_AVX512 ISA_SET: AVX512F_SCALAR
SHORT: vcvttss2si ebp{sae}, xmm30
NASM makes it another "operand" after the final reg operand, including a comma before it: vcvttss2si ebp,xmm30,{sae}
I could probably tweak the disassembly printer to optionally put the rounding/sae decorations after the final register operand. Then the output would be accepted more directly by assemblers like NASM. I'll look at that. Thanks.
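For illustration, the reordering described above can be sketched as a small text post-processing step in Python. This is not XED's printer: the function name and the regex are hypothetical, and the accepted decoration spellings ({sae} or {<mode>-sae}) are an assumption. A trailing immediate is kept last, so the decoration lands after the final register operand.

```python
import re

# Assumed decoration spellings: {sae} or {<mode>-sae}, e.g. {rn-sae}.
ROUND_SAE = re.compile(r"\{(?:sae|[a-z]{2,3}-sae)\}")
# Very rough immediate detector (decimal or 0x-prefixed hex), for the sketch only.
IMMEDIATE = re.compile(r"(?:0x[0-9a-fA-F]+|\d+)")


def move_rounding_after_last_register(line: str) -> str:
    """Detach a rounding/SAE decoration from the operand it is glued to and
    emit it as its own comma-separated item after the last register operand."""
    stripped = line.strip()
    mnemonic, _, rest = stripped.partition(" ")
    match = ROUND_SAE.search(rest)
    if match is None:
        return stripped  # nothing to move
    decoration = match.group(0)
    rest = rest[:match.start()] + rest[match.end():]
    operands = [op.strip() for op in rest.split(",")]
    if IMMEDIATE.fullmatch(operands[-1]):
        # Keep the immediate last; the decoration goes just before it.
        operands.insert(len(operands) - 1, decoration)
    else:
        operands.append(decoration)
    return f"{mnemonic} {','.join(operands)}"


print(move_rounding_after_last_register("vcvttss2si ebp{sae}, xmm30"))
```

Applied to the XED output above, this yields the NASM-style line shown earlier (vcvttss2si ebp,xmm30,{sae}); mask decorations such as {k1} are left where they are.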
Thanks.
I built a tool based on the NASM database. I found it convenient to treat mask or mask|z as a prefix, because {k1} and {k1}{z} only apply to the first operand. Broadcast and sae are similar: they only apply to the last operand (or the last but one if the last is an immediate).
Another point: we can limit the operands to T[4] if the mask (zeroing) operand is treated as a prefix. Having at most T[4] operands in the instruction template simplifies the later processing.
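A minimal sketch of that idea in Python, assuming Intel-style text input: the mask and zeroing decorations are hoisted off the first operand into their own fields, leaving a plain operand list that is checked against the proposed limit of four. The Template class, hoist_mask, and MAX_OPERANDS are hypothetical names for illustration, not part of NASM or XED.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

MASK = re.compile(r"\{(k[0-7])\}(\{z\})?")
MAX_OPERANDS = 4  # the limit proposed in the post, not a property of the ISA


@dataclass
class Template:
    mnemonic: str
    mask: Optional[str] = None   # e.g. "k1", stored as a prefix-like field
    zeroing: bool = False        # True when {z} follows the mask
    operands: List[str] = field(default_factory=list)


def hoist_mask(line: str) -> Template:
    """Split an instruction line and move {kN}/{z} off the first operand."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    template = Template(mnemonic)
    if operands:
        match = MASK.search(operands[0])
        if match:
            template.mask = match.group(1)
            template.zeroing = match.group(2) is not None
            operands[0] = operands[0][:match.start()] + operands[0][match.end():]
    if len(operands) > MAX_OPERANDS:
        raise ValueError(f"{mnemonic}: more than {MAX_OPERANDS} operands")
    template.operands = operands
    return template


print(hoist_mask("vaddpd zmm1{k1}{z}, zmm2, zmm3"))
```

With the mask stored separately, the operand list of a masked instruction has the same shape as that of its unmasked form.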
A little bit offtopic, but:
Another point: we can limit the operands to T[4] if the mask (zeroing) operand is treated as a prefix.
There are still instructions with more than 4 operands (vpermil2pd/vpermil2ps) in the AMD XOP ISA extension.
Yes. We can treat it as a prefix as well, just like MASK, so the design leaves room for extension. And I guess the low nibble of the imm byte may carry a real instruction-specific payload in future VEX instructions.
Another one: the two decorations SAESTR and ROUNDC correspond strictly to SAE() and AVX512_ROUND() in the FIX_ROUND_LEN128/512 cases. Just remove them from the operands and let the pattern take care of them. And since it is just a decoration, maybe the user can place it wherever they prefer. Again, it behaves like a prefix.
Let me try to make myself clear. I am focusing on the position of SAESTR in XED. NASM uses the notations |mask, |mask|z, |bc32/bc64|er/sae, and |er/sae to attach decorations to the main operand. XED only has a single TXT notation. So when that slot is held by BCASTSTR, the second decoration (sae or rounding) needs another position. Besides the current approach, which attaches sae or rounding to the first operand, there are two other solutions:
1, make it a prefix, as stated before.
2, merge bc and the two rounding controls into one group, matrix style: BC, BCSAE, BCROUND, SAE, ROUNDC.
For me, +1 for solution 1.
Hi. I cannot omit the TXT thing from the operands as I think you suggested; that is what drives the printing of the embedded rounding string. I'm just printing it in the wrong place. "Wrong" might be too strong a word as well, since it is just a convention (and a bad one IMHO).
Making it a prefix is a nice idea but far too late for that and other assemblers have converged on a different notation.
My plan is to modify the printer so that the embedded rounding is printed after the last operand. I think that's what everyone is doing. This was the argument that I said I lost.
Speaking of syntaxes and other assemblers..
The Go assembler's AVX512 support currently uses opcode suffixes to enable/disable EVEX features of an instruction, such as zeroing (Z), broadcasting (BCST), or sae/rc.
Rationale: avoids additional {} syntax for x86 and re-uses opcode suffixes that are used in ARM. Parsing is also simpler.
Examples:
// Embedded rounding + optional zeroing.
VADDPD.RU_SAE Z3, Z2, K1, Z1 // 62f1ed5958cb
VADDPD.RD_SAE Z3, Z2, K1, Z1 // 62f1ed3958cb
VADDPD.RZ_SAE Z3, Z2, K1, Z1 // 62f1ed7958cb
VADDPD.RN_SAE Z3, Z2, K1, Z1 // 62f1ed1958cb
VADDPD.RU_SAE.Z Z3, Z2, K1, Z1 // 62f1edd958cb
VADDPD.RD_SAE.Z Z3, Z2, K1, Z1 // 62f1edb958cb
VADDPD.RZ_SAE.Z Z3, Z2, K1, Z1 // 62f1edf958cb
VADDPD.RN_SAE.Z Z3, Z2, K1, Z1 // 62f1ed9958cb
// Embedded broadcasting + optional zeroing.
VADDPD.BCST (AX), X2, K1, X1 // 62f1ed195808
VADDPD.BCST.Z (AX), X2, K1, X1 // 62f1ed995808
VADDPD.BCST (AX), Y2, K1, Y1 // 62f1ed395808
VADDPD.BCST.Z (AX), Y2, K1, Y1 // 62f1edb95808
VADDPD.BCST (AX), Z2, K1, Z1 // 62f1ed595808
VADDPD.BCST.Z (AX), Z2, K1, Z1 // 62f1edd95808
// Suppress all exceptions (SAE).
VMAXPD.SAE Z3, Z2, K1, Z1 // 62f1ed595fcb or 62f1ed195fcb
VCMPSD.SAE $0, X0, X2, K0 // 62f1ef18c2c000
VMAXPD (AX), Z2, K1, Z1 // 62f1ed495f08
// Multisource operands (4FMAPS/4VNNIW register range operand).
VP4DPWSSD (AX), [Z0-Z3], K1, Z7 // 62f27f495238
VP4DPWSSD 7(DX), [Z0-Z3], K1, Z7 // 62f27f4952ba07000000
// K write mask. Optional operand.
VADDPD X30, X1, X0 // 6291f50858c6
VADDPD X2, X1, K1, X0 // 62f1f50958c2
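To relate the suffixes to the encodings in the comments above, here is a small Python sketch that decodes the fourth EVEX byte (z, L'L, b, aaa) together with ModRM.mod. It assumes the plain layout used by these examples (0x62, three payload bytes, one opcode byte, ModRM, no legacy prefixes); the function name and the returned dictionary are hypothetical.

```python
ROUNDING = {0b00: "RN", 0b01: "RD", 0b10: "RU", 0b11: "RZ"}
VECTOR_BITS = {0b00: 128, 0b01: 256, 0b10: 512}


def decode_evex_decorations(encoding_hex: str) -> dict:
    """Decode mask, zeroing, broadcast and rounding bits of an EVEX encoding."""
    raw = bytes.fromhex(encoding_hex)
    if raw[0] != 0x62:
        raise ValueError("not an EVEX prefix")
    p2, modrm = raw[3], raw[5]          # layout: 62 P0 P1 P2 opcode ModRM
    z = p2 >> 7                         # zeroing-masking
    ll = (p2 >> 5) & 0b11               # vector length, or rounding mode
    b = (p2 >> 4) & 1                   # broadcast / rounding / SAE bit
    aaa = p2 & 0b111                    # opmask register, 0 means no mask
    register_form = (modrm >> 6) == 0b11
    info = {
        "mask": f"K{aaa}" if aaa else None,
        "zeroing": bool(z),
        "broadcast": bool(b) and not register_form,
        "rounding": None,
        "vector_bits": None,
    }
    if b and register_form:
        # With b set on a register form, L'L carries the rounding mode
        # (SAE-only instructions ignore these two bits).
        info["rounding"] = ROUNDING[ll]
    else:
        info["vector_bits"] = VECTOR_BITS.get(ll)
    return info


print(decode_evex_decorations("62f1ed5958cb"))   # VADDPD.RU_SAE Z3, Z2, K1, Z1
print(decode_evex_decorations("62f1ed995808"))   # VADDPD.BCST.Z (AX), X2, K1, X1
```

For an SAE-only instruction such as VMAXPD.SAE the two L'L bits do not select a rounding mode, which is consistent with the two alternative encodings listed for it above; the sketch cannot tell the two cases apart without knowing the opcode.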
The full list of valid suffix combinations is:
Z
SAE
SAE.Z
RN_SAE
RZ_SAE
RD_SAE
RU_SAE
RN_SAE.Z
RZ_SAE.Z
RD_SAE.Z
RU_SAE.Z
BCST
BCST.Z
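The list above is small enough to check by table lookup. The following Python sketch splits an opcode such as VADDPD.RU_SAE.Z into its base name and flags and rejects any suffix combination not in the list. It is only an illustration of the suffix scheme, not the Go assembler's implementation; parse_opcode and its result keys are hypothetical.

```python
# The valid suffix combinations, copied from the list above.
VALID_SUFFIXES = {
    "Z", "SAE", "SAE.Z",
    "RN_SAE", "RZ_SAE", "RD_SAE", "RU_SAE",
    "RN_SAE.Z", "RZ_SAE.Z", "RD_SAE.Z", "RU_SAE.Z",
    "BCST", "BCST.Z",
}


def parse_opcode(opcode: str) -> dict:
    """Split 'NAME.SUFFIXES' and translate the suffixes into flags."""
    name, _, suffix = opcode.partition(".")
    if suffix and suffix not in VALID_SUFFIXES:
        raise ValueError(f"invalid suffix combination: {suffix!r}")
    parts = suffix.split(".") if suffix else []
    return {
        "name": name,
        "zeroing": "Z" in parts,
        "broadcast": "BCST" in parts,
        "sae": any(p.endswith("SAE") for p in parts),
        # Two-letter rounding mode (RN, RZ, RD, RU) when present.
        "rounding": next((p[:2] for p in parts if p.endswith("_SAE")), None),
    }


print(parse_opcode("VADDPD.RU_SAE.Z"))
```

Because the whole suffix string is matched against the table, order matters: VADDPD.Z.BCST is rejected while VADDPD.BCST.Z is accepted.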
Here: REG0=MASK_R():w:mskw:TXT=SAESTR
It seems the TXT suffix should be at REG3.
And there are 3 other instructions that contain a mask register with the sae flag.