Barebit / x86reference

X86 Opcode and Instruction Reference
http://ref.x86asm.net
GNU Lesser General Public License v3.0

MOV needs to be split into mem and nomem syntaxes #52

Closed Kashio closed 3 months ago

Kashio commented 1 year ago

Right now, MOV encoded with opcode 8E uses operand type w for its operand with addressing method E. Type w is defined as:

Word, regardless of operand-size attribute (for example, ENTER).

According to the Intel docs:

        8E /r MOV Sreg,r/m16** RM Valid Valid Move r/m16 to segment register.
REX.W + 8E /r MOV Sreg,r/m64** RM Valid Valid Move lower 16 bits of r/m64 to segment register.

With memory addressing, the operand always refers to a word in memory, as expected. With register addressing, however, one needs to specify the full register name even though the instruction only uses the lower 16 bits of the register. I think the appropriate solution would be to split the syntax into mem and nomem variants: the mem one keeps operand type w, and the nomem one gets operand type v, because the operand-size prefix can affect which register is used, according to my testing with objdump. Type v is defined as:

Word or doubleword, depending on operand-size attribute (for example, INC (40), PUSH (50)).

For 64-bit mode, the operand type should be vqp, which is defined as:

Word or doubleword, depending on operand-size attribute, or quadword, promoted by REX.W in 64-bit mode.
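
The proposed split can be sketched as a small lookup keyed on the ModR/M byte. This is only an illustration of the proposal, not code from x86reference: the function name `operand_type_8e` and the `long_mode` flag are hypothetical. It relies on the fact that a ModR/M mod field of 11b selects a register operand, while any other value selects a memory operand.

```python
def operand_type_8e(modrm: int, long_mode: bool = False) -> str:
    """Return the proposed operand type code for the r/m operand of
    MOV Sreg, r/m (opcode 8E). Hypothetical helper for illustration."""
    mod = (modrm >> 6) & 0b11  # top two bits of the ModR/M byte
    if mod != 0b11:
        # mem syntax: the operand is a word in memory, regardless of
        # the operand-size attribute
        return "w"
    # nomem syntax: the register name follows the operand-size attribute
    return "vqp" if long_mode else "v"


if __name__ == "__main__":
    print(operand_type_8e(0xD8))                  # 8E D8: register form
    print(operand_type_8e(0x18))                  # 8E 18: memory form
    print(operand_type_8e(0xD8, long_mode=True))  # register form, 64-bit mode
```

Here 0xD8 decodes as mod=11b, reg=011b (DS), rm=000b, so it takes the nomem branch; 0x18 has mod=00b and takes the mem branch.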

EDIT: clarity

BarebitOpenSource commented 9 months ago

Good catch. However, it seems that objdump doesn't follow the syntax defined in the Intel manual. According to the manual, the operand is either 16-bit or 64-bit, never 32-bit.

In 64-bit mode, the register is either 16-bit or 64-bit, depending on REX.W. It seems like we need a new type wqp: "Word, or quadword, promoted by REX.W in 64-bit mode".

There's a similar issue with 8C MOV Rvqp, Sw. It should be 8C MOV Rwqp, Sw.
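
To make the suggested wqp reading concrete, here is a minimal sketch of how a formatter might name the register operand of 8E in 64-bit mode, assuming the register is 16-bit unless REX.W is set. The function `format_mov_sreg_reg` and the tables are hypothetical and not part of x86reference; legacy prefixes are not parsed, and the sketch does not check whether loading a given segment register is actually permitted.

```python
SREG = ["ES", "CS", "SS", "DS", "FS", "GS"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG64 = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]


def format_mov_sreg_reg(code: bytes) -> str:
    """Format the register form of MOV Sreg, r/m (8E) in 64-bit mode,
    using the assumed wqp rule: word, or quadword with REX.W."""
    i = 0
    rex = 0
    if 0x40 <= code[i] <= 0x4F:  # optional REX prefix
        rex = code[i]
        i += 1
    if code[i] != 0x8E:
        raise ValueError("not opcode 8E")
    modrm = code[i + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b11:
        raise ValueError("memory form: the operand is always a word (type w)")
    if reg >= len(SREG):
        raise ValueError("reserved segment register encoding")
    rex_w = bool(rex & 0x08)
    if rex & 0x01:  # REX.B extends rm to R8..R15
        name = f"R{rm + 8}" + ("" if rex_w else "W")
    else:
        name = (REG64 if rex_w else REG16)[rm]
    return f"MOV {SREG[reg]}, {name}"


if __name__ == "__main__":
    print(format_mov_sreg_reg(bytes([0x8E, 0xD8])))        # no REX.W
    print(format_mov_sreg_reg(bytes([0x48, 0x8E, 0xD8])))  # REX.W set
```

Under this rule there is no 32-bit register name at all, which is the difference from the vqp type.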

BarebitOpenSource commented 3 months ago

Moved to https://github.com/mazegen/x86reference/issues/19